public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION)
@ 2004-10-11 14:34 Razya Ladelsky
  2004-10-11 16:13 ` Steven Bosscher
  0 siblings, 1 reply; 8+ messages in thread
From: Razya Ladelsky @ 2004-10-11 14:34 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: gcc-patches, hubicka, Mircea Namolaru, Ayal Zaks

Steven Bosscher <stevenb@suse.de> wrote on 10/10/2004 18:50:14:

> On Sunday 10 October 2004 17:20, Razya Ladelsky wrote:
> > Comments are welcome.
> 
> [ The first part of this is turned into a bit of a rant, as I'm feeling
>   really bad about where you are implementing these optimizations. IMHO
>   the abstraction level is wrong, we are not taking things in the right
>   direction...  Skip to /rant for comments on the actual patch ;-)  ]
> 
> I don't mean to say anything bad about your work, but why do you not
> try to help implementing *proper* IPA, starting with a CFG inliner and
> early global (ie. intraprocedural) optimizations *before* IPA, like
> apparently most compilers do? 

We are trying to help the implementation of "proper" IPA, by first 
starting with 
a simple (the simplest?) flow-insensitive analysis of constant propagation 
using 
the existing infrastructure. This should prove to be very useful, and 
easily portable 
to low-level Gimple/CFG/Tree-SSA etc. This already raised critical issues 
such as 
"how to represent the outcome of IPA?" - for CP we can insert a statement 
to indicate 
that a parameter is known to hold a constant value, but for other analysis 
(value prop,
alias analysis) this does not work.


> To quote myself from "http://gcc.gnu.org/projects/tree-profiling.html":
> 
> ==============================================================
> (...) plans exist for reorganisation of the compiler passes as follows:
> 
> 1. For each SCC in the call graph in reverse topological order
>       do simple intraprocedural optimizations in SSA form
>       (basically, run anything that is cheap and can not
>       create overlapping live ranges, but will improve the
>       results of the next step);
>       collect intraprocedural information necessary for
>       interprocedural data flow analyses;
>       store the function's intermediate representation somewhere (*)

Are you convinced that tree-ssa/flow-sensitive/global intraprocedural is 
necessary?
Most of the benefit we know of can be obtained using scalable 
flow-insensitive analysis.
Your scheme implies that we hold all functions in Tree-ssa form, right?


> 
> 2. Perform interprocedural analyses and optimizations (at least 
>    constant propagation, side-effects analysis, and function cloning
>    and specialization).

Where does the inliner kick in? IPA can open-up new 
opportunities/priorities 
for inlining (it already does so now!). 

> 
> 3. For each SCC in the call graph in reverse topological order 
>       do remaining intraprocedural optimizations;
>       hand over the optimized function to the RTL expanders
>       for finalization and code generation.
> ==============================================================
> 
> (*) Either in memory or in a data base.  And yes I *am* willing to
>     open that discussion again if it turns out that we can do great
>     things with IPA but storing in memory is too expennsive (d'oh!).

IPA (and the inliner) need access to all functions together; the important 

point to decide is the representation in which all functions will be held 
together.

> 
> This means that when we start working on this (most likely before the
> end of this year), we (at SUSE, IBM, or whereever) will have to rewrite
> for the new IPA framework *everything* that you're now sort-of hackishly
> implementing in the existing framework that clearly is unsuitable for
> proper IPA.
> The reason *why* almost everything would have to be redone is because
> your implementation does not work on lowered GIMPLE, and also because
> we'll need to clone function CFGs (ie. not function body trees) or lose
> the CFG for cloned/inlined functions, and reconstruct them later on,
> somehow (but that is not really an option IMO).

AFAWCS, porting our IP/CP from trees to lowered Gimple + CFG seems pretty 
straightforward because it's flow insensitive (and we'll definitely do 
that
as soon as it's available for us!). Cloning function CFGs should be done 
very
similarly to the way they are inlined; in-fact, maybe the inliner should 
first
clone the callee, and then tailor it into the caller.


> 
> This is wasteful.  I would *much* rather have everyone agree on clean
> IPA framework design, and work together on implementing that.
> 
> Don't get me wrong, it is really really cool that y'all want to have 
this
> implemented in GCC.  So do I.  It is also great that you're 
experimenting
> with these algorithms, demonstrating that it can be done for GCC.
> 
> But what is happening right now is making things only more difficult in
> the long run.  We already have this lame excuse for IPA with the "Inter
> Module" C front end hack that will have to be redone if GCC is ever 
going
> to support real IPA. 
> And IMHO we do not need more obfuscated pseudo-IPA/IPO implementations
> like that; not now that we have the opportunity to do it properly with
> the cgraph module and low-GIMPLE which is relatively easy to serialize
> and dump to a data base.
> 
> If only these reasons mattered, I'd really rather not have this patch
> in CVS.  Perhaps other people have a different opinion, of course ;-)
> 

> Thanks,
> 
> Gr.
> Steven
> 
> 

Thanks, 
Razya

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION)
  2004-10-11 14:34 [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION) Razya Ladelsky
@ 2004-10-11 16:13 ` Steven Bosscher
  2004-10-14 16:34   ` Razya Ladelsky
  0 siblings, 1 reply; 8+ messages in thread
From: Steven Bosscher @ 2004-10-11 16:13 UTC (permalink / raw)
  To: Razya Ladelsky; +Cc: gcc-patches, hubicka, Mircea Namolaru, Ayal Zaks

On Monday 11 October 2004 16:26, Razya Ladelsky wrote:
> the existing infrastructure. This should prove to be very useful, and
> easily portable
> to low-level Gimple/CFG/Tree-SSA etc. This already raised critical issues
> such as
> "how to represent the outcome of IPA?" - for CP we can insert a statement
> to indicate
> that a parameter is known to hold a constant value, but for other analysis
> (value prop,
> alias analysis) this does not work.

I do agree it is useful, but I doubt it will be so easy to port to
low level GIMPLE.  But we'll see, perhaps it will indeed not be that
hard.


> > To quote myself from "http://gcc.gnu.org/projects/tree-profiling.html":
> >
> > ==============================================================
> > (...) plans exist for reorganisation of the compiler passes as follows:
> >
> > 1. For each SCC in the call graph in reverse topological order
> >       do simple intraprocedural optimizations in SSA form
> >       (basically, run anything that is cheap and can not
> >       create overlapping live ranges, but will improve the
> >       results of the next step);
> >       collect intraprocedural information necessary for
> >       interprocedural data flow analyses;
> >       store the function's intermediate representation somewhere (*)
>
> Are you convinced that tree-ssa/flow-sensitive/global intraprocedural is
> necessary?
> Most of the benefit we know of can be obtained using scalable
> flow-insensitive analysis.

My reasoning is: If we're going to have to to traverse each function
to gather function-local information, then why not clean it up first?


> Most of the benefit we know of can be obtained using scalable
> flow-insensitive analysis.

It would still be flow-insensitive in the above scheme (except maybe
that you get some flow-sensitive stuff for free if we can collect
local information on the function in SSA form).


> Your scheme implies that we hold all functions in Tree-ssa form, right?

I hope so, but I don't know if that is necessary or benificial (or cost
effective, if you like).  I don't know how expensive it would be to take
each function out of SSA after step 1, and rewriting it a second time
for step 3 (ie. post-IPA).  We should definitely keep the function body
in low-GIMPLE, i.e. even if we'd take the function out of SSA form, you
wouldn't run TER during out-of-ssa.


> > 2. Perform interprocedural analyses and optimizations (at least
> >    constant propagation, side-effects analysis, and function cloning
> >    and specialization).
>
> Where does the inliner kick in?

I believe cloning and inliner should be the last IPO passes to run,
cloning before inlining, or simultaneously (dunno what works, or if
it makes a difference).  You definitely want to be able to inline
a clone, of course.


> IPA can open-up new
> opportunities/priorities
> for inlining

That is what Zadeck said too.  I was surprised by that a bit, I had
expected the only new inlining opportunities to come from cloning
after IPCP, and simplifying the clone.  That should be pretty easy
to keep track of.

Perhaps we need something completely different.  You have more
experience with this than I do - so what do you think the framework
for IPA should look like?


> (it already does so now!).

How often?  Do you have an example?
You still haven't shown any numbers :-(


> IPA (and the inliner) need access to all functions together; the important
> point to decide is the representation in which all functions will be held
> together.

IPA would only need all functions together if you indeed would not just
have a phase before IPA to collect global data.  If you have a pre-pass
that collects all global data, IPA would work with just that data, and
you would not need to have all functions together.  This does impose a
few limitations, but it also removes the requirement that you have to be
able to hold all functions together.  You can just dump them somewhere
and do IPA based only on the call graph and  information you've gathered
locally in each function before dumping it.

At least, that was the idea.

If you're going to run some kind of iterating IPA like Honza mentioned,
then you of course do need all functions readily available.  But would
that be possible in general for *very* large applications, and what
does it buy us compared to the scheme I schetched?  For example, does
IPA make new inlining opportunities available often enough to justify
keeping all functions around at all times?

To me, it appears to be a similar problem as that of classic GCSE: You
can always do another iteration to find more redundancies, but in most
cases the costs outweight the benefit.  For IPA, I would *hope* that a
scheme where you collect local data once and do IPA once would catch
the vast majority of IPO opportunities.  If that is indeed the case,
then those last few inlining opportunities and that extra bit of side
effects info that you get from some kind of iterative IPA is maybe not 
worth the trouble.
But since both Zadeck and you argue for keeping all functions around, I
guess it *is* worth the trouble from your points of view.

Zadeck mentioned that he could have several tens of thousands of Java
classes in memory and do full inter-class inlining.  That's mighty
impressive, but not something GCC would be capable of anytime soon,
unless we manage to radically change our data structures to be far more
efficient.

Oh well...  I guess I really do not have enough experience with IPA in
production compilers to make any sensible comments on all this.  Hope
you do ;-)
If you have numbers or some references to papers on this subject, that
would be much appreciated.

> AFAWCS, porting our IP/CP from trees to lowered Gimple + CFG seems pretty
> straightforward because it's flow insensitive (and we'll definitely do
> that as soon as it's available for us!).

Great.


> Cloning function CFGs should be done very
> similarly to the way they are inlined; in-fact, maybe the inliner should
> first clone the callee, and then tailor it into the caller.

Agreed.  I've always considered inlining to be just a special case of
cloning, conceptually.

Gr.
Steven



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION)
  2004-10-11 16:13 ` Steven Bosscher
@ 2004-10-14 16:34   ` Razya Ladelsky
  2004-10-14 17:00     ` Jan Hubicka
  0 siblings, 1 reply; 8+ messages in thread
From: Razya Ladelsky @ 2004-10-14 16:34 UTC (permalink / raw)
  To: Steven Bosscher; +Cc: Ayal Zaks, gcc-patches, hubicka, Mircea Namolaru

Steven Bosscher <stevenb@suse.de> wrote on 11/10/2004 18:03:51:

> On Monday 11 October 2004 16:26, Razya Ladelsky wrote:
> > the existing infrastructure. This should prove to be very useful, and
> > easily portable
> > to low-level Gimple/CFG/Tree-SSA etc. This already raised critical 
issues
> > such as
> > "how to represent the outcome of IPA?" - for CP we can insert a 
statement
> > to indicate
> > that a parameter is known to hold a constant value, but for other 
analysis
> > (value prop,
> > alias analysis) this does not work.
> 
> I do agree it is useful, but I doubt it will be so easy to port to
> low level GIMPLE.  But we'll see, perhaps it will indeed not be that
> hard.
> 
> 
> > > To quote myself from "
http://gcc.gnu.org/projects/tree-profiling.html":
> > >
> > > ==============================================================
> > > (...) plans exist for reorganisation of the compiler passes as 
follows:
> > >
> > > 1. For each SCC in the call graph in reverse topological order
> > >       do simple intraprocedural optimizations in SSA form
> > >       (basically, run anything that is cheap and can not
> > >       create overlapping live ranges, but will improve the
> > >       results of the next step);
> > >       collect intraprocedural information necessary for
> > >       interprocedural data flow analyses;
> > >       store the function's intermediate representation somewhere (*)
> >
> > Are you convinced that tree-ssa/flow-sensitive/global intraprocedural 
is
> > necessary?
> > Most of the benefit we know of can be obtained using scalable
> > flow-insensitive analysis.
> 
> My reasoning is: If we're going to have to to traverse each function
> to gather function-local information, then why not clean it up first?
> 

We are not sure how much there is to clean up at such an early stage,
and does not seem to justify transforming to tree-ssa. 

> 
> > Most of the benefit we know of can be obtained using scalable
> > flow-insensitive analysis.
> 
> It would still be flow-insensitive in the above scheme (except maybe
> that you get some flow-sensitive stuff for free if we can collect
> local information on the function in SSA form).
> 

Again, it's a cost-performance issue; Tree-ssa can improve performance(and
simplify the intraprocedural stage) but seems too costly for whole 
program.

> 
> > Your scheme implies that we hold all functions in Tree-ssa form, 
right?
> 
> I hope so, but I don't know if that is necessary or benificial (or cost
> effective, if you like).  I don't know how expensive it would be to take
> each function out of SSA after step 1, and rewriting it a second time
> for step 3 (ie. post-IPA).  We should definitely keep the function body
> in low-GIMPLE, i.e. even if we'd take the function out of SSA form, you
> wouldn't run TER during out-of-ssa.
> 
> 
> > > 2. Perform interprocedural analyses and optimizations (at least
> > >    constant propagation, side-effects analysis, and function cloning
> > >    and specialization).
> >
> > Where does the inliner kick in?
> 
> I believe cloning and inliner should be the last IPO passes to run,
> cloning before inlining, or simultaneously (dunno what works, or if
> it makes a difference).  You definitely want to be able to inline
> a clone, of course.
> 
> 

We agree, Inlining does not affect IPA, but :

> > IPA can open-up new
> > opportunities/priorities
> > for inlining
> 
> That is what Zadeck said too.  I was surprised by that a bit, I had
> expected the only new inlining opportunities to come from cloning
> after IPCP, and simplifying the clone.  That should be pretty easy
> to keep track of.
> 
> Perhaps we need something completely different.  You have more
> experience with this than I do - so what do you think the framework
> for IPA should look like?
>

Jan's framework seems reasonable to us for whole program:
- Gathering intraprocedural properties (on low gimple).
- Performing IPA (recording its decisions/results).
  Making inlining decisions.
- Trasformation function by function, starting with materializing the 
clones and inlining.
 
> 
> > (it already does so now!).
> 
> How often?  Do you have an example?
> You still haven't shown any numbers :-(
> 
> > IPA (and the inliner) need access to all functions together; the 
important
> > point to decide is the representation in which all functions will be 
held
> > together.
> 
> IPA would only need all functions together if you indeed would not just
> have a phase before IPA to collect global data.  If you have a pre-pass
> that collects all global data, IPA would work with just that data, and
> you would not need to have all functions together.  This does impose a
> few limitations, but it also removes the requirement that you have to be
> able to hold all functions together.  You can just dump them somewhere
> and do IPA based only on the call graph and  information you've gathered
> locally in each function before dumping it.
> 

Agreed.

> At least, that was the idea.
> 
> If you're going to run some kind of iterating IPA like Honza mentioned,
> then you of course do need all functions readily available.  But would
> that be possible in general for *very* large applications, and what
> does it buy us compared to the scheme I schetched?  For example, does
> IPA make new inlining opportunities available often enough to justify
> keeping all functions around at all times?
> 
> To me, it appears to be a similar problem as that of classic GCSE: You
> can always do another iteration to find more redundancies, but in most
> cases the costs outweight the benefit.  For IPA, I would *hope* that a
> scheme where you collect local data once and do IPA once would catch
> the vast majority of IPO opportunities.  If that is indeed the case,
> then those last few inlining opportunities and that extra bit of side
> effects info that you get from some kind of iterative IPA is maybe not 
> worth the trouble.
> But since both Zadeck and you argue for keeping all functions around, I
> guess it *is* worth the trouble from your points of view.
>

We are not suggesting to have iterative IPA, only to do IPA before 
inlining.
 
> Zadeck mentioned that he could have several tens of thousands of Java
> classes in memory and do full inter-class inlining.  That's mighty
> impressive, but not something GCC would be capable of anytime soon,
> unless we manage to radically change our data structures to be far more
> efficient.
> 
> Oh well...  I guess I really do not have enough experience with IPA in
> production compilers to make any sensible comments on all this.  Hope
> you do ;-)
> If you have numbers or some references to papers on this subject, that
> would be much appreciated.
>

We'll send you pointers to relevant papers shortly.
 
> > AFAWCS, porting our IP/CP from trees to lowered Gimple + CFG seems 
pretty
> > straightforward because it's flow insensitive (and we'll definitely do
> > that as soon as it's available for us!).
> 
> Great.
> 
> 
> > Cloning function CFGs should be done very
> > similarly to the way they are inlined; in-fact, maybe the inliner 
should
> > first clone the callee, and then tailor it into the caller.
> 
> Agreed.  I've always considered inlining to be just a special case of
> cloning, conceptually.
> 
> Gr.
> Steven
> 
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION)
  2004-10-14 16:34   ` Razya Ladelsky
@ 2004-10-14 17:00     ` Jan Hubicka
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Hubicka @ 2004-10-14 17:00 UTC (permalink / raw)
  To: Razya Ladelsky
  Cc: Steven Bosscher, Ayal Zaks, gcc-patches, hubicka, Mircea Namolaru

> 
> We are not sure how much there is to clean up at such an early stage,
> and does not seem to justify transforming to tree-ssa. 

I have mostly the cleanup cfg and dead code ellimination in mind.  THese
two optimizations alone do a lot to function bodies sizes especially if
you have language with higher abstraction, so they will considerably
modify the inlining decisions.  We also need to profile the functions
and/or estimate the profile that itself needs nontrivial modification of
the bodies.

Prettiest way would be to use our existing SSA optimizers to do the job,
however if this turns out to be disaster performance wise (it should not
- SSA build without aliasing is cheap as is to throw it away
afterwards), we might impleemnt specialized pass similar to what
delete_trivially_dead_insns does on RTL.
We might also want to do constant propagation tougth.
> > I believe cloning and inliner should be the last IPO passes to run,
> > cloning before inlining, or simultaneously (dunno what works, or if
> > it makes a difference).  You definitely want to be able to inline
> > a clone, of course.
> > 
> > 
> 
> We agree, Inlining does not affect IPA, but :

Unforutnately my understanding of Kenneth's vision is that it does
affect IPA in important way.  I can imagine this easilly in Java/C++
environment. For instance you have function that have object as an
argument and calls it's virtual method.  It is likely that you pass
specific inline clone specific instance of object so you might
devirutalizae the function call in the inlined clone (but can't do that
globally) opening more IPA possibilities.

If you do devirtualization before inlining, you will end up with missing
oppurtunities.  If you do it after inlining, you will introduce
posibilities for inliner that won't be taken.

One answer is probably to first do devirutalization, inlining,
devirtualization again and inlining again.  Thiss still can be done with
my scheme (if devirtualization can cope with clones right) but ineed it
is not very nice.

Other approach probably would involve making the inliner smart about
this kinds of scenarios.  I was thinking about the easier case of
function being passed as an arguemnt (for instance to for_each_rtx) and
it is not at all dificult to deal with in the inliner.  The
devirtualization is not much trickier, but of course this scheme will
restrict us to several specific cases of this more general problem.  The
question is whether catching few most common cases is enought or not.
> 
> Jan's framework seems reasonable to us for whole program:

Nice ;)

> - Gathering intraprocedural properties (on low gimple).
> - Performing IPA (recording its decisions/results).
>   Making inlining decisions.
> - Trasformation function by function, starting with materializing the 
> clones and inlining.
> 
> We'll send you pointers to relevant papers shortly.

I am looking forward for the papers.  I went across some of
implementation documents and sources I was able to find.
They seems to be mixture of both possible approaches none covering the
scalability issues of the approach requiring all functions in RAM
seriously (tought there are optimizers that do have memory management
able to swap the function body to disk again or compress it in memory
showing that they run into this kind of limitation)

Honza

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION)
  2004-10-10 15:38 Razya Ladelsky
  2004-10-10 17:10 ` Steven Bosscher
@ 2004-10-12 14:18 ` Jan Hubicka
  1 sibling, 0 replies; 8+ messages in thread
From: Jan Hubicka @ 2004-10-12 14:18 UTC (permalink / raw)
  To: Razya Ladelsky; +Cc: gcc-patches, stevenb, hubicka, Mircea Namolaru, Ayal Zaks

> Hello,
> 
> Attached is the patch modified according to the suggestions
> we received.
> 
> This patch implements function cloning.
> It also contains extension to IPCP (Interprocedural constant propagation) 
> optimization http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00998.html for 
> using function cloning.
> 
> Now in order to enable IPCP for a given application, we should use the 
> command
> gcc -fipa-cp -combine file1.c file2.c file3.c 
> where file1.c, file2.c, file3.c are the files of the application.
> 
> Bootstrapped and regression tests successfully passed on POWER4.
> 
> Comments are welcome.

Hi,
The cgraph bits looks almost fine to me now (modulo minor problems),
but I still have few questions ;)

Index: cgraph.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cgraph.c,v
retrieving revision 1.4.4.18.2.8
diff -c -3 -p -r1.4.4.18.2.8 cgraph.c
*** cgraph.c	25 Sep 2004 23:17:32 -0000	1.4.4.18.2.8
--- cgraph.c	5 Oct 2004 08:35:54 -0000
*************** cgraph_unnest_node (struct cgraph_node *
*** 706,709 ****
--- 706,737 ----
    *node2 = node->next_nested;
    node->origin = NULL;
  }
+ 
+ /* Copy collees of node FROM to node TO.  */
+ void
+ cgraph_copy_node_callees (struct cgraph_node *to,
+                           struct cgraph_node *from)
+ {
+   struct cgraph_edge *e;
+   
+   for (e = from->callees;e; e=e->next_callee)
+     cgraph_clone_edge (e, to, e->call_expr);
+   
+ }
+ 
+ 
+ void
+ cgraph_copy_local_info (struct cgraph_node *new_node, 
+ 			struct cgraph_node *orig_node)
+ {
+   new_node->local.self_insns = orig_node->local.self_insns;
+   new_node->local.local = orig_node->local.local;
+   new_node->local.finalized = orig_node->local.finalized;
+   new_node->local.disregard_inline_limits = 
+     orig_node->local.disregard_inline_limits;
+   new_node->local.inlinable = orig_node->local.inlinable;
+   new_node->local.redefined_extern_inline = 
+     orig_node->local.redefined_extern_inline;
+ }

The function lacks comment.
Why new_node->local = old_node->local won't work?  This is how the info
is copied in the cgraph_clone_node.

If you want to have special function for this, it is OK with me, but
please make it used by the clone_node function (same for
cgraph_copy_node_callees)

+ /* Get first callee of node.  */
+ #define FIRST_CALLEE(node)               ((node)->callees)
+ /* Get next callee from edge.  */
+ #define NEXT_CALLEE(edge)                ((edge)->next_callee)
+ /* Returns whether edge is last callee.  */
+ #define CALLEE_NOT_LAST(edge)            ((edge) != NULL)
+ 

Similarly if you want to have the macros, please make them used
consistently.  I tend to preffer node->callees instead of FIRST_CALLEE
tought, but since we have both schemes in GCC already, I am fine.
The CALLEE_NOT_LAST should be probably CALLEE_NOT_LAST_P as it is
predicate and I would preffer it to be CALLEE_LAST_P to avoid double
negatives.

+ struct cgraph_node *
+ cgraph_copy_node_for_versioning (struct cgraph_node *old_version,
+  				 tree new_decl, varray_type redirect_callers)
+ {
+   struct cgraph_node *new_version;
+   struct cgraph_edge *e;
+   struct cgraph_edge *next_callee;
+   unsigned i;
+   
+   if (old_version == NULL)
+     abort ();
+   
+   new_version = cgraph_node (new_decl);
+   
+   new_version->needed = false; 
+   new_version->decl = new_decl;

cgraph_node should make this job for you.  Perhaps adding "new_decl"
argument to cgraph_clone_node would kill most of the redundancy with
this function?  (Ie you will need to call cgraph_clone_node and 
then fixup the recursive edges).  When you do have recursion, why you
want the recursive call to always point to new clone (I would guess that
the function can actually change the value of operand)..
Index: cp/tree.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cp/tree.c,v
retrieving revision 1.286.2.43.2.11
diff -c -3 -p -r1.286.2.43.2.11 tree.c
*** cp/tree.c	25 Sep 2004 23:18:41 -0000	1.286.2.43.2.11
--- cp/tree.c	5 Oct 2004 08:35:58 -0000
*************** cp_cannot_inline_tree_fn (tree* fnp)
*** 2057,2062 ****
--- 2057,2088 ----
    return 0;
  }
  
+ /* Decide whether there are language-specific reasons to not version a
+    function as a tree.  */
+ 
+ int 
+ cp_cannot_version_tree_fn (tree *fnp)
+ {
+   tree fn = *fnp;
+  
+   /* ??? Is there a way to version a
+      constructor?  */ 
+   if (DECL_COMPLETE_CONSTRUCTOR_P (fn))
+     return 1;
+   if (DECL_BASE_CONSTRUCTOR_P (fn))
+     return 1;
+   if (DECL_COMPLETE_DESTRUCTOR_P (fn))
+     return 1;
+   if (DECL_BASE_DESTRUCTOR_P (fn))
+     return 1;
+   if (DECL_OVERLOADED_OPERATOR_P (fn))
+     return 1;

Why this is needed?
(ie when you have already lowered form of the program you should either
see the low level information why clonning is not possible or you should
be able to clone it).

Otherwise the patch looks almost OK to me (with the incremental plan to
work on CFG based inlining/cloning next :)
Please also incorporate the Steven's comments (especially one about
moving the function out of integrate.c - that file should die soon). I
think it is resonable to place new functions in cgraph/cgraphunit.c as
they are pretty trivial now and the main infrastructure should have
friendly way to clone the function)

Also, please, prepare small testsuite catching the cases you can deal
with now.  Since we lack the IPA directory in testsuite, it would be
nice to add one. See tree-ssa directory on how to parse the dump files,
in fact copying around it's .exp file should (I hope) mostly work for
us.

Like Steven, I would be also interested to hear about the results in
some benchmarks (at least absolute numbers about how much clonning you
do and how much arguemnts you find to be constant).

Thank you,
Honza

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION)
  2004-10-10 17:10 ` Steven Bosscher
@ 2004-10-10 21:25   ` Jan Hubicka
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Hubicka @ 2004-10-10 21:25 UTC (permalink / raw)
  To: Steven Bosscher
  Cc: Razya Ladelsky, gcc-patches, hubicka, Mircea Namolaru, Ayal Zaks

> On Sunday 10 October 2004 17:20, Razya Ladelsky wrote:
> > Comments are welcome.
> 
> [ The first part of this is turned into a bit of a rant, as I'm feeling
>   really bad about where you are implementing these optimizations. IMHO
>   the abstraction level is wrong, we are not taking things in the right
>   direction...  Skip to /rant for comments on the actual patch ;-)  ]
> 
> I don't mean to say anything bad about your work, but why do you not
> try to help implementing *proper* IPA, starting with a CFG inliner and
> early global (ie. intraprocedural) optimizations *before* IPA, like
> apparently most compilers do? 
> To quote myself from "http://gcc.gnu.org/projects/tree-profiling.html":

Actually I don't think it is bad to have one nontrivial IPA optimization
while working on implementation of the IPA infrastructure (tought I
would not recomment put too much effort into implementing more
optimizations at this shape of infrastructure or pushing the quality
of analysis of high gimple in IPA too far.

If we do it right, most of IPA optimizers won't care much about the
actual underlying function representation.  There are number of issues
that needs to be resolved with the infrastructural changes and sometimes
it is good to give it a try in practice.  Lets try to get the clones
cgraph interface right now and we can clean up the actual function
clonning and gimple analysis once we are ready for this (I hope that
this will be relatively soon too)

> 
> ==============================================================
> (...) plans exist for reorganisation of the compiler passes as follows:
> 
> 1. For each SCC in the call graph in reverse topological order
>       do simple intraprocedural optimizations in SSA form
>       (basically, run anything that is cheap and can not
>       create overlapping live ranges, but will improve the
>       results of the next step);
>       collect intraprocedural information necessary for
>       interprocedural data flow analyses;
>       store the function's intermediate representation somewhere (*)
>  
> 2. Perform interprocedural analyses and optimizations (at least 
>    constant propagation, side-effects analysis, and function cloning
>    and specialization).
> 
> 3. For each SCC in the call graph in reverse topological order 
>       do remaining intraprocedural optimizations;
>       hand over the optimized function to the RTL expanders
>       for finalization and code generation.
> ==============================================================
> 
> (*) Either in memory or in a data base.  And yes I *am* willing to
>     open that discussion again if it turns out that we can do great
>     things with IPA but storing in memory is too expennsive (d'oh!).

Yep, we really need to resolve the issues concerning the link time
optimizations.

While I personally agree with the scheme you outlined (and also
described it in the GCC Summit paper), the requirement of not loading
whole program into memory at once (so relying only on global date while
doing IPA and performing the actual transformations later) actually
brings some stress on the implementation of IPA optimizers (ie
optimizers done after inlining needs to deal with the clones and it is
dificult to iterate optimizations like we commonly do for local
optimizations).  In fact Kenneth already suggested in private mail to
not use this approach and instead load everything in the memory at once
and do kind of iteration on the IPA.  I would like to hear some opinions
here.

It is quite interesting for me to see how the cprop and Kenneth's alias
analysis are impleemnted and see how much dificulty it is to make it fit
this scheme.
> 
> This means that when we start working on this (most likely before the
> end of this year), we (at SUSE, IBM, or whereever) will have to rewrite
> for the new IPA framework *everything* that you're now sort-of hackishly
> implementing in the existing framework that clearly is unsuitable for
> proper IPA.
> The reason *why* almost everything would have to be redone is because
> your implementation does not work on lowered GIMPLE, and also because
> we'll need to clone function CFGs (ie. not function body trees) or lose
> the CFG for cloned/inlined functions, and reconstruct them later on,
> somehow (but that is not really an option IMO).

Hopefully with the new CFG inliner implementation we will get cloning
functions with CFG needed for cprop for free too (in fact following the
approach of implementing IPA optimizations at once after propagation
pass, it might be nice to make the changes while inlining the function)

Honza

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION)
  2004-10-10 15:38 Razya Ladelsky
@ 2004-10-10 17:10 ` Steven Bosscher
  2004-10-10 21:25   ` Jan Hubicka
  2004-10-12 14:18 ` Jan Hubicka
  1 sibling, 1 reply; 8+ messages in thread
From: Steven Bosscher @ 2004-10-10 17:10 UTC (permalink / raw)
  To: Razya Ladelsky, gcc-patches; +Cc: hubicka, Mircea Namolaru, Ayal Zaks

On Sunday 10 October 2004 17:20, Razya Ladelsky wrote:
> Comments are welcome.

[ The first part of this is turned into a bit of a rant, as I'm feeling
  really bad about where you are implementing these optimizations. IMHO
  the abstraction level is wrong, we are not taking things in the right
  direction...  Skip to /rant for comments on the actual patch ;-)  ]

I don't mean to say anything bad about your work, but why do you not
try to help implementing *proper* IPA, starting with a CFG inliner and
early global (ie. intraprocedural) optimizations *before* IPA, like
apparently most compilers do? 
To quote myself from "http://gcc.gnu.org/projects/tree-profiling.html":

==============================================================
(...) plans exist for reorganisation of the compiler passes as follows:

1. For each SCC in the call graph in reverse topological order
      do simple intraprocedural optimizations in SSA form
      (basically, run anything that is cheap and can not
      create overlapping live ranges, but will improve the
      results of the next step);
      collect intraprocedural information necessary for
      interprocedural data flow analyses;
      store the function's intermediate representation somewhere (*)
 
2. Perform interprocedural analyses and optimizations (at least 
   constant propagation, side-effects analysis, and function cloning
   and specialization).

3. For each SCC in the call graph in reverse topological order 
      do remaining intraprocedural optimizations;
      hand over the optimized function to the RTL expanders
      for finalization and code generation.
==============================================================

(*) Either in memory or in a data base.  And yes I *am* willing to
    open that discussion again if it turns out that we can do great
    things with IPA but storing in memory is too expennsive (d'oh!).

This means that when we start working on this (most likely before the
end of this year), we (at SUSE, IBM, or whereever) will have to rewrite
for the new IPA framework *everything* that you're now sort-of hackishly
implementing in the existing framework that clearly is unsuitable for
proper IPA.
The reason *why* almost everything would have to be redone is because
your implementation does not work on lowered GIMPLE, and also because
we'll need to clone function CFGs (ie. not function body trees) or lose
the CFG for cloned/inlined functions, and reconstruct them later on,
somehow (but that is not really an option IMO).

This is wasteful.  I would *much* rather have everyone agree on clean
IPA framework design, and work together on implementing that.

Don't get me wrong, it is really really cool that y'all want to have this
implemented in GCC.  So do I.  It is also great that you're experimenting
with these algorithms, demonstrating that it can be done for GCC.

But what is happening right now is making things only more difficult in
the long run.  We already have this lame excuse for IPA with the "Inter
Module" C front end hack that will have to be redone if GCC is ever going
to support real IPA. 
And IMHO we do not need more obfuscated pseudo-IPA/IPO implementations
like that; not now that we have the opportunity to do it properly with
the cgraph module and low-GIMPLE which is relatively easy to serialize
and dump to a data base.

If only these reasons mattered, I'd really rather not have this patch
in CVS.  Perhaps other people have a different opinion, of course ;-)

</rant>


Now, about the patch.
First off, you should probably just create a new file cgraph-clone.c or
something to put most of the new functions.  cgraph*.[ch] is already a
bit messy and we plan to clean it up.

Second, I'm still very curious about numbers.  What does this do to the
compile time?  Have you counted propagated constants and/or cloned
functions on SPEC or GCC itself (or some other benchmark)?

A few comments on the code itself:

> +       /* If the parametr's address is taken,
> +          it could be modified.  */
>         if( TREE_CODE (TREE_OPERAND (t, 0)) == PARM_DECL )
>           {
>   	  i = ipcp_method_tree_map (mt, TREE_OPERAND (t, 0));
>             ipcp_method_modify_set (mt, i, true);
>           }
>         break;

This is one of the reasons why I believe we should first work on some
framework to do early optimizations.  After the tree optimizers, we'll
have much better information about what parameters have their address
taken.
Also, funny you have a tab before "i =" and spaces almost everywhere
else.  I don't care about spaces vs. tabs much, but I believe the GCC
style requires tabs.  (I'd say, at least don't do both! ;-)


>   /* Perform reachability analysis and reclaim all unreachable nodes.
> !    If BEFORE_INLINING_P is true this function is called before inlining
> !    decisions has been made.  If BEFORE_INLINING_P is false this function also
> !    removes unneeded bodies of extern inline functions.  */
>   static bool
> ! cgraph_remove_unreachable_nodes (bool before_inlining_p)

You should really explain what the difference between BEFORE_INLINING_P
and !BEFORE_INLINING_P is: How does this function behave differently with
this flag set or not set?


> +   for (i=0; i < strlen (tmp_name); i++)
> +   {
> +      if (tmp_name[i] == '.')
> +            tmp_name[i] = '_';
> +   }

So you're going to call strlen for every loop iteration?



> --- integrate.c	5 Oct 2004 08:35:55 -0000
> *************** copy_decl_for_inlining (tree decl, tree
> *** 176,181 ****
> --- 176,248 ----
> 
>     return copy;
>   }
> +
> + /* Copy NODE (which must be a DECL).  The DECL originally was in the FROM_FN,
> +    but now it will be in the TO_FN. As apposed for to copy_decl_for_inlining ()
> +    function; we do not give a special treatment to parm_decl and result_decl.  */
> + tree
> + copy_decl_for_versioning (tree decl, tree from_fn, tree to_fn)

This function has no business in integrate.c, this file used to implement
the RTL inliner which is long dead and gone.  Do not put new code there.
This should go into the new file, or otherwise I guess tree-inline is a
more appropriate place.
Indeed, copy_decl_for_inlining should be moved out of this file also.


> +   if (TREE_CODE (copy) == LABEL_DECL)
> +     {
> +       TREE_ADDRESSABLE (copy) = 0;
> +     }

This is not GNU/GCC code style.


> ! 	    {
> ! 	      cvalue = ipcp_cval_get_cvalue (ipcp_method_cval (node, i));
> !               if (const_param == 0)
> !                 {
> !                   /* Compute how many callers node has.  */
> !                   node_callers = 0;
> !                   for (cs = node->callers; cs != NULL;
> !                        cs = cs->next_caller)
> !                     node_callers++;

The indent of "if" also is not quite GNU/GCC style conforming.
And would it be useful to cache the number of callers/callees in the
cgraph_node?  IIRC I've seen this kind of counting elsewhere, too (but
I'm not sure).


> +   /* ??? Remove the following gaurd.  */
> +   if (lang_hooks.tree_versioning.cannot_version_tree_fn (&fndecl))
> +     return false;

Typo: guard ;-)
And I actually like the idea that the language can say, "do not clone
this", why would you want to remove this?


> +   /* ??? Is there a way to version a
> +      constructor?  */

Good question.  I would hope so, I'd expect (but have no data to support
the suspicion, unfortunately) to see constants in constructors a lot, e.g.
booleans, number of items/slots/elements/etc...
Does anyone know if this has been studied and any data has been published?


> +   /* The new variable/label has no RTL, yet.  */
> +   if (!TREE_STATIC (copy) && !DECL_EXTERNAL (copy))
> +     SET_DECL_RTL (copy, NULL_RTX);

Hmm...  This somehow looks odd, looks like a remnant of the RTL inliner.
My understanding is that since we inline before expand, we should never
have RTL for local labels and variables when we get here.  Hence, this
check might very well be redundant (you could try replacing it with a
"gcc_assert (!DECL_RTL (copy))").


> + /* lang_hooks.tree_versioning.cannot_version_tree_fn is called to
> +    determine whether there are language-specific reasons for not
> +    creating a new version to a given function.  */

"for a given function".


> *************** ipcp_method_compute_modify (ipcp_method
> *** 743,749 ****
> 
>     ipcp_method_modify_init (mt);
>     decl = mt->decl;
> !   if (!tree_inlinable_function_p (decl))
>       {
>         for (j = 0; j < ipcp_method_formal_count (mt); j++)
>   	{
> --- 799,807 ----
> 
>     ipcp_method_modify_init (mt);
>     decl = mt->decl;
> !   /* ??? Handle pending sizes case. Set all parameters
> !      of the method to be modified.  */
> !   if (DECL_UNINLINABLE (decl))
>       {
>         for (j = 0; j < ipcp_method_formal_count (mt); j++)
>   	{

What is the reason for this change?  I cannot tell from the ChangeLog,
because you don't really *have* a ChangeLog entry for ipa_prop.c:

> 	* ipa_prop.c: Update IPA constant propagation.
> 	* ipa_prop.h: Same.

Please add a proper ChangeLog entry for all patches you post.  It is
required for good reasons, patch review without a ChangeLog entry is
very difficult sometimes.

It would generally be helpful if you give a short explanation for the
changes you make to existing files, especially for the places you've 
annotated with a FIXME, ???, XXX, or some other marker indicating that
the code as-is may not be perfect.

Thanks,

Gr.
Steven


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION)
@ 2004-10-10 15:38 Razya Ladelsky
  2004-10-10 17:10 ` Steven Bosscher
  2004-10-12 14:18 ` Jan Hubicka
  0 siblings, 2 replies; 8+ messages in thread
From: Razya Ladelsky @ 2004-10-10 15:38 UTC (permalink / raw)
  To: gcc-patches; +Cc: stevenb, hubicka, Mircea Namolaru, Ayal Zaks

[-- Attachment #1: Type: text/plain, Size: 608 bytes --]

Hello,

Attached is the patch modified according to the suggestions
we received.

This patch implements function cloning.
It also contains extension to IPCP (Interprocedural constant propagation) 
optimization http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00998.html for 
using function cloning.

Now in order to enable IPCP for a given application, we should use the 
command
gcc -fipa-cp -combine file1.c file2.c file3.c 
where file1.c, file2.c, file3.c are the files of the application.

Bootstrapped and regression tests successfully passed on POWER4.

Comments are welcome.

Thanks,
Razya and Revital.



[-- Attachment #2: ChangeLog5_10 --]
[-- Type: application/octet-stream, Size: 2098 bytes --]

2004-10-05  Razya Ladelsky  <razya@il.ibm.com>

	* ipa_prop.c: Update IPA constant propagation.
	* ipa_prop.h: Same.
	* cgraph.h:  Add interface for accessing existing data structures,
	Support for cloning.
	* Makefile.in (CGRAPH_H): Add dependency to varray.h.
        (ipa_prop.c): Change dependencies.
	* common.opt: Remove fipa-no-cloning flag.
	* cgraphunit.c (cgraph_optimize, cgraph_remove_unreachable_nodes):  
	Support for IPA constant propagation.
	(update_call_expr, cgraph_copy_node_for_versioning, 
	cgraph_function_versioning): New functions to support versioning.
	* cgraph.c: (cgraph_copy_node_callees, cgraph_copy_local_info): New
	functions to support versioning.
	* integrate.c: (copy_decl_for_versioning): New function. Copies decl tree
	node for the purpose of versioning.
	* integrate.h:   (copy_decl_for_versioning): New declaration.
        * langhooks-def.h: (lhd_tree_versioning_cannot_version_tree_fn): Declare.
        (LANG_HOOKS_TREE_VERSIONING_CANNOT_VERSION_TREE_FN): New definition.
	* langhooks.c: (lhd_tree_versioning_cannot_version_tree_fn):
	Check whether there are language-specific reasons for not
	creating a new version to a given function.
        * langhooks.h: (struct lang_hooks_for_tree_versioning,
	tree_versioning): New declaration.
        * gimplify.c: (create_function_name): New function.
	* tree-gimple.h: (create_function_name): New declaration.
	* tree-inline.c: (struct inline_data): New field to support versioning.
	(copy_arguments_for_versioning, version_body,
	copy_static_chain, function_versionable_p,
	tree_function_versioning): New functions to support tree
	function versioning.
	(remap_decl, copy_body_r):  Support function versioning.
	* tree-inline.h: (tree_versionable_function_p,
	tree_function_versioning): New declarations.
	* cp/cp-objcp-common.h: (LANG_HOOKS_TREE_VERSIONING_CA
	NNOT_VERSION_TREE_FN): New definition.
	* cp/cp-tree.h: (cp_cannot_version_tree_fn): New declaration.
	* cp/tree.c: (cp_cannot_version_tree_fn): Don't auto-version
	constructors/destructors and operators.

[-- Attachment #3: diff_5_10 --]
[-- Type: application/octet-stream, Size: 86534 bytes --]

Index: Makefile.in
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.903.2.179.2.18
diff -c -3 -p -r1.903.2.179.2.18 Makefile.in
*** Makefile.in	25 Sep 2004 23:17:22 -0000	1.903.2.179.2.18
--- Makefile.in	5 Oct 2004 08:35:53 -0000
*************** SCHED_INT_H = sched-int.h $(INSN_ATTR_H)
*** 703,709 ****
  INTEGRATE_H = integrate.h varray.h
  CFGLAYOUT_H = cfglayout.h $(BASIC_BLOCK_H)
  CFGLOOP_H = cfgloop.h $(BASIC_BLOCK_H) $(RTL_H)
! CGRAPH_H = cgraph.h bitmap.h tree.h $(HASHTAB_H)
  DF_H = df.h bitmap.h sbitmap.h $(BASIC_BLOCK_H)
  DDG_H = ddg.h sbitmap.h $(DF_H)
  GCC_H = gcc.h version.h
--- 703,709 ----
  INTEGRATE_H = integrate.h varray.h
  CFGLAYOUT_H = cfglayout.h $(BASIC_BLOCK_H)
  CFGLOOP_H = cfgloop.h $(BASIC_BLOCK_H) $(RTL_H)
! CGRAPH_H = cgraph.h bitmap.h tree.h varray.h $(HASHTAB_H)
  DF_H = df.h bitmap.h sbitmap.h $(BASIC_BLOCK_H)
  DDG_H = ddg.h sbitmap.h $(DF_H)
  GCC_H = gcc.h version.h
*************** simplify-rtx.o : simplify-rtx.c $(CONFIG
*** 1915,1923 ****
  cgraph.o : cgraph.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
     langhooks.h toplev.h $(FLAGS_H) $(GGC_H)  $(TARGET_H) $(CGRAPH_H) gt-cgraph.h \
     output.h intl.h
! ipa_prop.o : ipa_prop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
!    langhooks.h toplev.h $(FLAGS_H) $(GGC_H)  $(TARGET_H) cgraph.h gt-cgraph.h \
!    output.h intl.h ipa_prop.h tree.h tree-iterator.h tree-gimple.h tree-inline.h
  cgraphunit.o : cgraphunit.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
     langhooks.h tree-inline.h toplev.h $(FLAGS_H) $(GGC_H)  $(TARGET_H) $(CGRAPH_H) intl.h \
     function.h $(TREE_GIMPLE_H) $(TREE_FLOW_H) ipa_prop.h
--- 1915,1923 ----
  cgraph.o : cgraph.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
     langhooks.h toplev.h $(FLAGS_H) $(GGC_H)  $(TARGET_H) $(CGRAPH_H) gt-cgraph.h \
     output.h intl.h
! ipa_prop.o : ipa_prop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TREE_H) \
!    langhooks.h $(GGC_H) $(CGRAPH_H) output.h ipa_prop.h tree-iterator.h \
!    tree-inline.h  
  cgraphunit.o : cgraphunit.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
     langhooks.h tree-inline.h toplev.h $(FLAGS_H) $(GGC_H)  $(TARGET_H) $(CGRAPH_H) intl.h \
     function.h $(TREE_GIMPLE_H) $(TREE_FLOW_H) ipa_prop.h
Index: cgraph.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cgraph.c,v
retrieving revision 1.4.4.18.2.8
diff -c -3 -p -r1.4.4.18.2.8 cgraph.c
*** cgraph.c	25 Sep 2004 23:17:32 -0000	1.4.4.18.2.8
--- cgraph.c	5 Oct 2004 08:35:54 -0000
*************** cgraph_unnest_node (struct cgraph_node *
*** 706,709 ****
--- 706,737 ----
    *node2 = node->next_nested;
    node->origin = NULL;
  }
+ 
+ /* Copy collees of node FROM to node TO.  */
+ void
+ cgraph_copy_node_callees (struct cgraph_node *to,
+                           struct cgraph_node *from)
+ {
+   struct cgraph_edge *e;
+   
+   for (e = from->callees;e; e=e->next_callee)
+     cgraph_clone_edge (e, to, e->call_expr);
+   
+ }
+ 
+ 
+ void
+ cgraph_copy_local_info (struct cgraph_node *new_node, 
+ 			struct cgraph_node *orig_node)
+ {
+   new_node->local.self_insns = orig_node->local.self_insns;
+   new_node->local.local = orig_node->local.local;
+   new_node->local.finalized = orig_node->local.finalized;
+   new_node->local.disregard_inline_limits = 
+     orig_node->local.disregard_inline_limits;
+   new_node->local.inlinable = orig_node->local.inlinable;
+   new_node->local.redefined_extern_inline = 
+     orig_node->local.redefined_extern_inline;
+ }
+ 
  #include "gt-cgraph.h"
Index: cgraph.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cgraph.h,v
retrieving revision 1.1.4.16.2.7
diff -c -3 -p -r1.1.4.16.2.7 cgraph.h
*** cgraph.h	25 Sep 2004 23:17:32 -0000	1.1.4.16.2.7
--- cgraph.h	5 Oct 2004 08:35:54 -0000
*************** Software Foundation, 59 Temple Place - S
*** 24,29 ****
--- 24,30 ----
  #include "hashtab.h"
  #include "bitmap.h"
  #include "tree.h"
+ #include "varray.h"
  
  /* Information about the function collected locally.
     Available after function is analyzed.  */
*************** struct cgraph_varpool_node GTY(())
*** 232,237 ****
--- 233,245 ----
    bool output;
  };
  
+ /* Get first callee of node.  */
+ #define FIRST_CALLEE(node)               ((node)->callees)
+ /* Get next callee from edge.  */
+ #define NEXT_CALLEE(edge)                ((edge)->next_callee)
+ /* Returns whether edge is last callee.  */
+ #define CALLEE_NOT_LAST(edge)            ((edge) != NULL)
+ 
  extern GTY(()) struct cgraph_node *cgraph_nodes;
  extern GTY(()) int cgraph_n_nodes;
  extern GTY(()) int cgraph_max_uid;
*************** struct cgraph_rtl_info *cgraph_rtl_info 
*** 258,270 ****
  const char * cgraph_node_name (struct cgraph_node *);
  struct cgraph_edge * cgraph_clone_edge (struct cgraph_edge *, struct cgraph_node *, tree);
  struct cgraph_node * cgraph_clone_node (struct cgraph_node *);
! 
  struct cgraph_varpool_node *cgraph_varpool_node (tree decl);
  void cgraph_varpool_mark_needed_node (struct cgraph_varpool_node *);
  void cgraph_varpool_finalize_decl (tree);
  bool cgraph_varpool_assemble_pending_decls (void);
  void cgraph_redirect_edge_callee (struct cgraph_edge *, struct cgraph_node *);
! 
  bool cgraph_function_possibly_inlined_p (tree);
  void cgraph_unnest_node (struct cgraph_node *node);
  
--- 266,280 ----
  const char * cgraph_node_name (struct cgraph_node *);
  struct cgraph_edge * cgraph_clone_edge (struct cgraph_edge *, struct cgraph_node *, tree);
  struct cgraph_node * cgraph_clone_node (struct cgraph_node *);
! void cgraph_copy_node_callees (struct cgraph_node *,
!                                struct cgraph_node *);
  struct cgraph_varpool_node *cgraph_varpool_node (tree decl);
  void cgraph_varpool_mark_needed_node (struct cgraph_varpool_node *);
  void cgraph_varpool_finalize_decl (tree);
  bool cgraph_varpool_assemble_pending_decls (void);
  void cgraph_redirect_edge_callee (struct cgraph_edge *, struct cgraph_node *);
! void cgraph_copy_local_info (struct cgraph_node *,
!                              struct cgraph_node *);
  bool cgraph_function_possibly_inlined_p (tree);
  void cgraph_unnest_node (struct cgraph_node *node);
  
*************** void cgraph_build_static_cdtor (char whi
*** 286,291 ****
--- 296,308 ----
  void cgraph_reset_static_var_maps (void);
  bitmap get_global_statics_not_read (tree fn);
  bitmap get_global_statics_not_written(tree fn);
+ void update_call_expr (struct cgraph_node *, varray_type);
+ struct cgraph_node *cgraph_copy_node_for_versioning (struct cgraph_node *,
+                                                      tree, varray_type);
+ struct cgraph_node *cgraph_function_versioning (struct cgraph_node *,
+                                                 varray_type);
+ 
  void init_cgraph (void);
  
+ 
  #endif  /* GCC_CGRAPH_H  */
Index: cgraphunit.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cgraphunit.c,v
retrieving revision 1.1.4.35.2.16
diff -c -3 -p -r1.1.4.35.2.16 cgraphunit.c
*** cgraphunit.c	25 Sep 2004 23:17:32 -0000	1.1.4.35.2.16
--- cgraphunit.c	5 Oct 2004 08:35:54 -0000
*************** record_call_1 (tree *tp, int *walk_subtr
*** 670,677 ****
  		       visited_nodes);
  	    *walk_subtrees = 0;
  	  }
-         if (flag_ipa_cp && decl == NULL_TREE)
-           flag_ipa_cp = 0;
  	break;
        }
  
--- 670,675 ----
*************** cgraph_reduced_inorder (struct cgraph_no
*** 1270,1279 ****
  }
  
  /* Perform reachability analysis and reclaim all unreachable nodes.
!    This function also remove unneeded bodies of extern inline functions
!    and thus needs to be done only after inlining decisions has been made.  */
  static bool
! cgraph_remove_unreachable_nodes (void)
  {
    struct cgraph_node *first = (void *) 1;
    struct cgraph_node *node;
--- 1268,1278 ----
  }
  
  /* Perform reachability analysis and reclaim all unreachable nodes.
!    If BEFORE_INLINING_P is true this function is called before inlining
!    decisions has been made.  If BEFORE_INLINING_P is false this function also 
!    removes unneeded bodies of extern inline functions.  */
  static bool
! cgraph_remove_unreachable_nodes (bool before_inlining_p)
  {
    struct cgraph_node *first = (void *) 1;
    struct cgraph_node *node;
*************** cgraph_remove_unreachable_nodes (void)
*** 1291,1298 ****
  #endif
    for (node = cgraph_nodes; node; node = node->next)
      if (node->needed && !node->global.inlined_to
! 	&& (!DECL_EXTERNAL (node->decl) || !node->analyzed))
        {
  	node->aux = first;
  	first = node;
        }
--- 1290,1300 ----
  #endif
    for (node = cgraph_nodes; node; node = node->next)
      if (node->needed && !node->global.inlined_to
! 	&& ((!DECL_EXTERNAL (node->decl)) 
!             || !node->analyzed
!             || before_inlining_p))
        {
+         node->reachable = true;
  	node->aux = first;
  	first = node;
        }
*************** cgraph_remove_unreachable_nodes (void)
*** 1312,1320 ****
  	if (!e->callee->aux
  	    && node->analyzed
  	    && (!e->inline_failed || !e->callee->analyzed
! 		|| !DECL_EXTERNAL (e->callee->decl)))
! 	  {
  	    e->callee->aux = first;
  	    first = e->callee;
  	  }
      }
--- 1314,1324 ----
  	if (!e->callee->aux
  	    && node->analyzed
  	    && (!e->inline_failed || !e->callee->analyzed
! 		|| (!DECL_EXTERNAL (e->callee->decl))
!                 || before_inlining_p))
!         {
  	    e->callee->aux = first;
+             e->callee->reachable = true;
  	    first = e->callee;
  	  }
      }
*************** cgraph_remove_unreachable_nodes (void)
*** 1341,1347 ****
  	    local_insns = 0;
  	  if (cgraph_dump_file)
  	    fprintf (cgraph_dump_file, " %s", cgraph_node_name (node));
! 	  if (!node->analyzed || !DECL_EXTERNAL (node->decl))
  	    cgraph_remove_node (node);
  	  else
  	    {
--- 1345,1352 ----
  	    local_insns = 0;
  	  if (cgraph_dump_file)
  	    fprintf (cgraph_dump_file, " %s", cgraph_node_name (node));
! 	  if (!node->analyzed || !DECL_EXTERNAL (node->decl)
!               || before_inlining_p)
  	    cgraph_remove_node (node);
  	  else
  	    {
*************** cgraph_decide_inlining (void)
*** 1960,1966 ****
    /* We will never output extern functions we didn't inline. 
       ??? Perhaps we can prevent accounting of growth of external
       inline functions.  */
!   cgraph_remove_unreachable_nodes ();
  
    if (cgraph_dump_file)
      fprintf (cgraph_dump_file,
--- 1965,1971 ----
    /* We will never output extern functions we didn't inline. 
       ??? Perhaps we can prevent accounting of growth of external
       inline functions.  */
!   cgraph_remove_unreachable_nodes (false);
  
    if (cgraph_dump_file)
      fprintf (cgraph_dump_file,
*************** cgraph_optimize (void)
*** 2785,2792 ****
  #endif
    if (!flag_unit_at_a_time)
      return;
!   if (flag_ipa_cp && flag_ipa_no_cloning)
      ipcp_driver ();
    timevar_push (TV_CGRAPHOPT);
    if (!quiet_flag)
      fprintf (stderr, "Performing intraprocedural optimizations\n");
--- 2790,2801 ----
  #endif
    if (!flag_unit_at_a_time)
      return;
!   if (flag_ipa_cp)
!   {
      ipcp_driver ();
+     /* Clean the callgraph.  */
+     cgraph_remove_unreachable_nodes (true);
+   }
    timevar_push (TV_CGRAPHOPT);
    if (!quiet_flag)
      fprintf (stderr, "Performing intraprocedural optimizations\n");
*************** init_cgraph (void)
*** 2932,2935 ****
--- 2941,3083 ----
    cgraph_dump_file = dump_begin (TDI_cgraph, NULL);
    memory_identifier = get_identifier("memory");
  }
+ 
+  
+ /* Update the CALL_EXPR in NEW_VERSION node callers edges.  */
+ 
+ void
+ update_call_expr (struct cgraph_node *new_version, 
+  		  varray_type redirect_callers)
+ {
+   struct cgraph_edge *e;
+   unsigned i;
+   
+   if (new_version == NULL)
+     abort ();
+   
+   for (i = 0; i < VARRAY_ACTIVE_SIZE (redirect_callers); i++)
+     {
+       e = VARRAY_GENERIC_PTR (redirect_callers, i);
+       
+       /* Update the call expr on the edges
+          to the new version. */ 
+       TREE_OPERAND (TREE_OPERAND (e->call_expr, 0), 0) = new_version->decl;
+     }
+   for (e = new_version->callers; e; e = e->next_caller)
+     { 
+       /* Update the call expr on the edges
+          of recursive calls. */
+       if (e->caller == new_version)
+         TREE_OPERAND (TREE_OPERAND (e->call_expr, 0), 0) = new_version->decl;            
+     }           
+ }
+ 
+ 
+ /* Create a new cgraph node which is the new version of 
+    OLD_VERSION node.  REDIRECT_CALLERS holds the callers
+    of OLD_VERSION which should be redirected to point to  
+    NEW_VERSION.  ALL the callees edges of OLD_VERSION 
+    are cloned to the new version node.  Return the new 
+    version node.  */
+ 
+ struct cgraph_node *
+ cgraph_copy_node_for_versioning (struct cgraph_node *old_version,
+  				 tree new_decl, varray_type redirect_callers)
+ {
+   struct cgraph_node *new_version;
+   struct cgraph_edge *e;
+   struct cgraph_edge *next_callee;
+   unsigned i;
+   
+   if (old_version == NULL)
+     abort ();
+   
+   new_version = cgraph_node (new_decl);
+   
+   new_version->needed = false; 
+   new_version->decl = new_decl;
+   memset (&new_version->local, 0, sizeof (new_version->local));
+   cgraph_copy_local_info (new_version,
+  			  old_version);
+   memset (&new_version->global, 0, sizeof (new_version->global));
+   new_version->global.insns = old_version->global.insns;
+   memset (&new_version->rtl, 0, sizeof (new_version->rtl));
+   new_version->analyzed = true;
+   
+   /* Clone the old node callees.  Recursive calls are
+      also cloned.  */ 
+   cgraph_copy_node_callees (new_version, old_version);
+   
+   
+   /* Fix recursive calls. 
+      If old_version has a recursive call after the 
+      previous cloning the new version will have an edge
+      pointing to the old version which is wrong;
+      so redirect it to point to the new version. */
+   for (e = new_version->callees ; e; e = next_callee)
+     {
+       next_callee = e->next_callee;
+       if (e->callee == old_version)
+         {
+           cgraph_redirect_edge_callee (e, new_version);
+         }
+        if (!next_callee)
+          break;
+     }
+   for (i = 0; i < VARRAY_ACTIVE_SIZE (redirect_callers); i++) 
+     {
+       e = VARRAY_GENERIC_PTR (redirect_callers, i); 
+       /* Redirect calls to the old version node
+  	 to point to it's new version.  */ 
+       cgraph_redirect_edge_callee (e, new_version);
+     }
+   
+   allocate_struct_function (new_decl);
+   cfun->function_end_locus = DECL_SOURCE_LOCATION (new_decl);
+   
+   return new_version;
+ }
+ 
+ /* Perform function versioning. 
+    Function versioning includes:
+    1) Generating a new cgraph node for the new version
+    and redirect it's edges respectively. 
+    2) Copying the old version tree to the new
+    version.  
+    The function gets REDIRECT_CALLERS varray, the 
+    edges to be redirected to the new version and
+    the old version's cgraph node.  It returns the new version's 
+    cgraph node.  */ 
+ 
+ struct cgraph_node *
+ cgraph_function_versioning (struct cgraph_node *old_version_node, 
+                             varray_type redirect_callers)
+ {
+   tree old_decl = old_version_node->decl;
+   struct cgraph_node *new_version_node = NULL;
+   tree new_decl;
+   
+   if (!tree_versionable_function_p (old_decl))
+     return NULL;
+   
+   /* Make a new FUNCTION_DECL tree node for the
+      new version. */
+   new_decl = copy_node (old_decl);
+   
+   /* Create the new version's call-graph node. 
+      and update the edges of the new node. */
+   new_version_node = 
+     cgraph_copy_node_for_versioning (old_version_node, new_decl,
+                                      redirect_callers);  
+   
+   /* Copy the old version's function tree to the new
+      version.  */
+   tree_function_versioning (old_decl, new_decl);
+   /* Update the call_expr on the edges
+      to the new version node. */
+   update_call_expr (new_version_node, redirect_callers);
+   return new_version_node;
+ }
+ 
+ 
  #include "gt-cgraphunit.h"
Index: common.opt
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/common.opt,v
retrieving revision 1.14.2.20.2.13
diff -c -3 -p -r1.14.2.20.2.13 common.opt
*** common.opt	18 Sep 2004 22:46:58 -0000	1.14.2.20.2.13
--- common.opt	5 Oct 2004 08:35:54 -0000
*************** fipa-cp
*** 888,897 ****
  Common Report Var(flag_ipa_cp)
  Perform IPA constant propagation
  
- fipa-no-cloning
- Common Report Var(flag_ipa_no_cloning)
- Perform IPA cp, without cloning of methods
- 
  funroll-loops
  Common Report Var(flag_unroll_loops)
  Perform loop unrolling when iteration count is known
--- 888,893 ----
Index: gimplify.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/gimplify.c,v
retrieving revision 1.1.2.141.2.13
diff -c -3 -p -r1.1.2.141.2.13 gimplify.c
*** gimplify.c	25 Sep 2004 23:17:49 -0000	1.1.2.141.2.13
--- gimplify.c	5 Oct 2004 08:35:55 -0000
*************** create_tmp_var_name (const char *prefix)
*** 315,320 ****
--- 315,344 ----
    return get_identifier (tmp_name);
  }
  
+ /* Create a new function name with PREFIX.  Returns an identifier.  */
+ tree
+ create_function_name (const char *prefix)
+ {
+   char *tmp_name;
+   unsigned i;
+   
+   if (prefix)
+     {
+       char *preftmp = ASTRDUP (prefix);
+ 
+       remove_suffix (preftmp, strlen (preftmp));
+       prefix = preftmp;
+     }
+   ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix ? prefix : "F", tmp_var_id_num++);
+   for (i=0; i < strlen (tmp_name); i++)
+   {
+      if (tmp_name[i] == '.')
+            tmp_name[i] = '_';
+   }
+ 
+   return get_identifier (tmp_name);
+ }
+ 
  
  /* Create a new temporary variable declaration of type TYPE.
     Does NOT push it into the current binding.  */
Index: integrate.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/integrate.c,v
retrieving revision 1.197.2.31.2.7
diff -c -3 -p -r1.197.2.31.2.7 integrate.c
*** integrate.c	9 Sep 2004 10:55:04 -0000	1.197.2.31.2.7
--- integrate.c	5 Oct 2004 08:35:55 -0000
*************** copy_decl_for_inlining (tree decl, tree 
*** 176,181 ****
--- 176,248 ----
  
    return copy;
  }
+ 
+ /* Copy NODE (which must be a DECL).  The DECL originally was in the FROM_FN,
+    but now it will be in the TO_FN. As apposed for to copy_decl_for_inlining ()
+    function; we do not give a special treatment to parm_decl and result_decl.  */ 
+ tree
+ copy_decl_for_versioning (tree decl, tree from_fn, tree to_fn)
+ {
+   tree copy;
+ 
+   /* Copy the declaration.  */
+   copy = copy_node (decl);
+       
+   /* The COPY is not abstract; it will be generated in TO_FN.  */
+   DECL_ABSTRACT (copy) = 0;
+   lang_hooks.dup_lang_specific_decl (copy);
+       
+   /* TREE_ADDRESSABLE isn't used to indicate that a label's
+      address has been taken; it's for internal bookkeeping in
+      expand_goto_internal.  */
+   if (TREE_CODE (copy) == LABEL_DECL)
+     {
+       TREE_ADDRESSABLE (copy) = 0;
+     }
+   
+   /* FIXME:  For the moment debug information for the copy
+      is not supported.  */ 
+   DECL_ARTIFICIAL (copy) = 1;
+   DECL_IGNORED_P (copy) = 1;
+   /* DECL_ABSTRACT_ORIGIN (copy) = DECL_ORIGIN (decl);  */
+   
+   if (TREE_STATIC (copy))
+     {
+       /* Set the DECL_ABSTRACT_ORIGIN of copy to point to the 
+          old version decl in case it is static so it can 
+          be retrieved when needed.  */
+       DECL_ABSTRACT_ORIGIN (copy) = decl;
+       /* FIXME: Again - debug information is not supported yet.  */
+       DECL_IGNORED_P (decl) = 1;
+     }
+    
+   /* The new variable/label has no RTL, yet.  */
+   if (!TREE_STATIC (copy) && !DECL_EXTERNAL (copy))
+     SET_DECL_RTL (copy, NULL_RTX);
+  
+   /* These args would always appear unused, if not for this.  */
+   TREE_USED (copy) = 1;
+   
+   /* Set the context for the new declaration.  */
+   if (!DECL_CONTEXT (decl))
+     /* Globals stay global.  */
+     ;
+   else if (TREE_STATIC (decl))
+     /* Function-scoped static variables should stay in the original
+        function.  */
+     ;
+   else if (DECL_CONTEXT (decl) != from_fn)
+     /* Things that weren't in the scope of the function we're inlining
+        from aren't in the scope we're inlining to, either.  */
+     ;
+   else
+     /* Ordinary automatic local variables are now in the scope of the
+        new function.  */
+     DECL_CONTEXT (copy) = to_fn;
+   
+   return copy;
+ }
+ 
  \f
  /* Unfortunately, we need a global copy of const_equiv map for communication
     with a function called from note_stores.  Be *very* careful that this
Index: integrate.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/integrate.h,v
retrieving revision 1.25.2.2.6.1
diff -c -3 -p -r1.25.2.2.6.1 integrate.h
*** integrate.h	22 Feb 2004 15:35:47 -0000	1.25.2.2.6.1
--- integrate.h	5 Oct 2004 08:35:55 -0000
*************** extern void allocate_initial_values (rtx
*** 136,142 ****
  /* Copy a declaration when one function is substituted inline into
     another.  */
  extern tree copy_decl_for_inlining (tree, tree, tree);
! 
  /* Check whether there's any attribute in a function declaration that
     makes the function uninlinable.  Returns false if it finds any,
     true otherwise.  */
--- 136,142 ----
  /* Copy a declaration when one function is substituted inline into
     another.  */
  extern tree copy_decl_for_inlining (tree, tree, tree);
! extern tree copy_decl_for_versioning (tree, tree, tree);
  /* Check whether there's any attribute in a function declaration that
     makes the function uninlinable.  Returns false if it finds any,
     true otherwise.  */
Index: ipa_prop.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Attic/ipa_prop.c,v
retrieving revision 1.1.2.3
diff -c -3 -p -r1.1.2.3 ipa_prop.c
*** ipa_prop.c	12 Sep 2004 16:39:17 -0000	1.1.2.3
--- ipa_prop.c	5 Oct 2004 08:35:56 -0000
***************
*** 1,6 ****
! /* Callgraph based intraprocedural optimizations.
!    Copyright (C) 2003, 2004 Free Software Foundation, Inc.
!    Contributed by Razya Ladelsky.
  
  This file is part of GCC.
  
--- 1,6 ----
! /* Interprocedural constant propagation
!    Copyright (C) 2004 Free Software Foundation, Inc.
!    Contributed by Razya Ladelsky <RAZYA@il.ibm.com>
  
  This file is part of GCC.
  
*************** Software Foundation, 59 Temple Place - S
*** 20,27 ****
  02111-1307, USA.  */
  
  /* Interprocedural constant propagation.
! The aim of IPA constant propagation is to find which function's argument
! has the same constant value in each invocation throughout the whole program.
  For example, for an application consisting of two files, foo1.c, foo2.c:
  
  foo1.c contains :
--- 20,27 ----
  02111-1307, USA.  */
  
  /* Interprocedural constant propagation.
! The aim of interprocedural constant propagation (IPCP) is to find which function's 
! argument has the same constant value in each invocation throughout the whole program.
  For example, for an application consisting of two files, foo1.c, foo2.c:
  
  foo1.c contains :
*************** int g (int y)
*** 47,142 ****
    printf ("value is %d",y);
  }
  
! The IPCP algorithm should find that g's formal argument y
  is always called with the value 3. 
  
  The optimization is divided into three stages:
  
  First stage - intraprocedural analysis  
  =======================================
  This phase computes jump_function and modify information. 
  
! Jump functions represent the value that is passes to each actual argument 
! by this callsite.
! there are three values :
!   Formal - the caller's formal parameter is passed as a parameter.
!   Constant - a constant is passed as a parameter.
    Unknown - neither of the above.
  
! Modify is true for a formal parameter if it is modified in its function.
! 
! In the case that the formal parameter passed in the callsite is 
! modified, the jump fumction value for this parameter turns to UNKOWN.
  
! The jump function structure, ipcp_jump_func, is defined in ipa_edge
  structure (defined in ipa_prop.h and pointed to by cgraph_node->aux)
  The modify info, ipcp_modify, is defined in ipa_node structure
  (defined in ipa_prop.h and pointed to by cgraph_edge->aux).
  
! The ipcp_tree_map (a mapping from the PARAM_DECL tree to the inner
! representation of formals) and ipcp_formal (explained below) structures
! are also pointed to by cgraph_node->aux and also initialized 
! in this phase.
! 
! -ipcp_stage1() is the first stage driver.
  
  Second stage - interprocedural analysis
  ========================================
! This phase solves the interprocedural constant propagation problem.
! The algorithm is based on a paper by Challahan David, 
! Keith D Cooper, Ken Kennedy, Linda Torczon, 
! Interprocedural Constant Propagation, Comp86.
! 
! Each formal has cval info (denoted as ipcp_formal) which represents its 
! value computed so far. 
! possible values : 
!   TOP - not known yet. 
!   BOTTOM - considered as not constant. 
!   CONSTANT_TYPE - has constant value.
  
- The algoritm computes for all formal parameters in the program 
- their cval value.
  Cval of formal f will have a constant value if all callsites to this 
! function have the same constant value passed to f 
! and will have BOTTOM value otherwise.
  
! This stage propagates the constant information from formal paranmeters
! to the actual parameters.
  
! -ipcp_stage2() is the second stage driver.
  
! Third phase 
! ============
  Propagates the constant-valued formals into the function.
  
! -ipcp_stage3() is the third phase driver.
     
  */
  
  #include "config.h"
  #include "system.h"
! #include "coretypes.h"
! #include "tm.h"
  #include "tree.h"
  #include "langhooks.h"
- #include "hashtab.h"
- #include "toplev.h"
- #include "flags.h"
  #include "ggc.h"
- #include "debug.h"
- #include "target.h"  
  #include "cgraph.h"
- #include "varray.h"
  #include "output.h"  
  #include "tree-iterator.h"
- #include "tree-gimple.h"
  #include "tree-inline.h"
  #include "ipa_prop.h"
  
  typedef struct cgraph_node *ipcp_method ;
  typedef  struct cgraph_edge *ipcp_callsite;
  
! struct ipcp_methodlist
  {
    ipcp_method method_p;
    struct ipcp_methodlist *next_method;
--- 47,144 ----
    printf ("value is %d",y);
  }
  
! The IPCP algorithm will find that g's formal argument y
  is always called with the value 3. 
  
+ The algorithm used is based on "Interprocedural Constant Propagation",
+ by Challahan David, Keith D Cooper, Ken Kennedy, Linda Torczon, Comp86, pg 152-161
+ 
  The optimization is divided into three stages:
  
  First stage - intraprocedural analysis  
  =======================================
  This phase computes jump_function and modify information. 
  
! A jump function for a callsite represents the values passed as actual arguments 
! of the callsite. There are three types of values :
!   Formal - the caller's formal parameter is passed as an actual argument.
!   Constant - a constant is passed as a an actual argument.
    Unknown - neither of the above.
  
! In order to compute the jump functions, we need the modify information for the formal
! parameters of methods.
  
! The jump function info, ipcp_jump_func, is defined in ipa_edge
  structure (defined in ipa_prop.h and pointed to by cgraph_node->aux)
  The modify info, ipcp_modify, is defined in ipa_node structure
  (defined in ipa_prop.h and pointed to by cgraph_edge->aux).
  
! -ipcp_init_stage() is the first stage driver.
  
  Second stage - interprocedural analysis
  ========================================
! This phase does the interprocedural constant propagation.
! It computes for all formal parameters in the program 
! their cval value that may be:
!   TOP - unknown. 
!   BOTTOM - non constant. 
!   CONSTANT_TYPE - constant value.
  
  Cval of formal f will have a constant value if all callsites to this 
! function have the same constant value passed to f. 
  
! The cval info, ipcp_formal, is defined in ipa_node structure
! (defined in ipa_prop.h and pointed to by cgraph_edge->aux). 
  
! -ipcp_iterate_stage() is the second stage driver.
  
! Third phase - transformation of methods code
! ============================================
  Propagates the constant-valued formals into the function.
+ For each method mt, whose parameters are consts, we create a clone.
+ 
+ We insert an assignment statement 'parameter = const' at the beginning 
+ of the cloned method.
  
! We also need to modify some callsites to call to the cloned methods instead 
! of the original ones. For a callsite passing an argument found to be a constant 
! by IPCP, there are two different cases to handle:
! 1. A constant is passed as an argument.
! 2. A parameter (of the caller) passed as an argument (pass through argument).
! 
! In the first case, the callsite in the original caller should be redirected 
! to call the cloned callee.
! In the second case, both the caller and the callee have clones
! and the callsite of the cloned caller would be redirected to call to 
! the cloned callee.
! 
! The callgraph is updated accordingly.
! 
! This update is done in two stages: 
! First all cloned methods are created during a traversal of the callgraph, 
! during which all callsites are redirected to call the cloned method.
! Then the callsites are traversed and updated as described above.
! 
! -ipcp_insert_stage() is the third phase driver.
     
  */
  
  #include "config.h"
  #include "system.h"
! #include "coretypes.h" 
  #include "tree.h"
  #include "langhooks.h"
  #include "ggc.h"
  #include "cgraph.h"
  #include "output.h"  
  #include "tree-iterator.h"
  #include "tree-inline.h"
  #include "ipa_prop.h"
  
  typedef struct cgraph_node *ipcp_method ;
  typedef  struct cgraph_edge *ipcp_callsite;
  
! struct ipcp_methodlist GTY(())
  {
    ipcp_method method_p;
    struct ipcp_methodlist *next_method;
*************** struct ipcp_methodlist
*** 144,150 ****
  
  typedef struct ipcp_methodlist *ipcp_methodlist_p;
  
! /* WorlList Interface.  */
  static inline ipcp_methodlist_p ipcp_create_methodlist_node (void); 
  static inline bool ipcp_methodlist_not_empty (ipcp_methodlist_p);
  static inline ipcp_method ipcp_methodlist_method (ipcp_methodlist_p); 
--- 146,152 ----
  
  typedef struct ipcp_methodlist *ipcp_methodlist_p;
  
! /* ipcp_worlList Interface.  */
  static inline ipcp_methodlist_p ipcp_create_methodlist_node (void); 
  static inline bool ipcp_methodlist_not_empty (ipcp_methodlist_p);
  static inline ipcp_method ipcp_methodlist_method (ipcp_methodlist_p); 
*************** static void ipcp_callsite_compute_param 
*** 172,177 ****
--- 174,183 ----
  static void ipcp_callsite_compute_count (ipcp_callsite);
  
  /* ipcp_method interface.  */
+ static inline void ipa_node_create (ipcp_method);
+ static inline bool ipcp_method_is_cloned (ipcp_method);
+ static inline void ipcp_method_set_orig_node (ipcp_method, ipcp_method);
+ static inline ipcp_method ipcp_method_orig_node (ipcp_method);
  static inline void ipcp_formal_create (ipcp_method); 
  static inline int ipcp_method_formal_count (ipcp_method);
  static inline void ipcp_method_formal_count_set (ipcp_method, int); 
*************** static inline tree ipcp_method_get_tree 
*** 184,192 ****
  static inline void ipcp_method_tree_map_create (ipcp_method);
  static inline void ipcp_method_modify_create (ipcp_method); 
  static inline void ipcp_method_modify_set (ipcp_method, int, bool);
! static inline ipcp_callsite ipcp_cs_get_first (ipcp_method);
! static inline ipcp_callsite ipcp_cs_get_next (ipcp_callsite);
! static inline bool ipcp_cs_not_last (ipcp_callsite); 
  static int  ipcp_method_tree_map (ipcp_method, tree);
  static void ipcp_method_compute_tree_map (ipcp_method);
  static void ipcp_cval_meet (struct ipcp_formal *, struct ipcp_formal *,
--- 190,196 ----
  static inline void ipcp_method_tree_map_create (ipcp_method);
  static inline void ipcp_method_modify_create (ipcp_method); 
  static inline void ipcp_method_modify_set (ipcp_method, int, bool);
! static void ipa_cloned_create (ipcp_method orig_node, ipcp_method new_node);
  static int  ipcp_method_tree_map (ipcp_method, tree);
  static void ipcp_method_compute_tree_map (ipcp_method);
  static void ipcp_cval_meet (struct ipcp_formal *, struct ipcp_formal *,
*************** static inline ipcp_int *ipcp_cval_get_cv
*** 216,228 ****
  static inline enum Cvalue_type ipcp_cval_get_cvalue_type (struct ipcp_formal *);
  static inline bool ipcp_cval_equal_cvalues (ipcp_int *, ipcp_int *);
  
! static void ipcp_stage1 (void);
! static void ipcp_stage2 (void);
! static void ipcp_stage3 (void);
  static void ipcp_free (void);
  static void ipa_nodes_create (void);
  
! static bool ipcp_method_dont_propagate (ipcp_method);
  static tree ipcp_asm_walk_tree (tree *, int *, void *);
  static bool ipcp_method_contains_asm (ipcp_method);
  static void ipcp_walk_statements (ipcp_method, tree *);
--- 220,235 ----
  static inline enum Cvalue_type ipcp_cval_get_cvalue_type (struct ipcp_formal *);
  static inline bool ipcp_cval_equal_cvalues (ipcp_int *, ipcp_int *);
  
! static void ipcp_init_stage (void);
! static void ipcp_iterate_stage (void);
! static void ipcp_insert_stage (void);
! static void ipcp_update_callgraph (void);
! static bool ipcp_redirect (ipcp_callsite);
  static void ipcp_free (void);
  static void ipa_nodes_create (void);
+ static void ipa_edges_create (void);
  
! static bool ipcp_method_dont_insert_const (ipcp_method);
  static tree ipcp_asm_walk_tree (tree *, int *, void *);
  static bool ipcp_method_contains_asm (ipcp_method);
  static void ipcp_walk_statements (ipcp_method, tree *);
*************** static void ipcp_callsite_param_print (F
*** 238,273 ****
--- 245,286 ----
  
  /* WorlList Interface.  */
  
+ /* Creates the worklist for ipcp_iterate_stage().  */
  static inline ipcp_methodlist_p
  ipcp_create_methodlist_node (void) 
  {
    return ggc_alloc (sizeof (struct ipcp_methodlist));
  }
  
+ /* Checks that worklist is not empty.  */
  static inline bool
  ipcp_methodlist_not_empty (ipcp_methodlist_p wl)
  {
    return (wl != NULL);
  }
  
+ /* Gets method from worklist's node.  */
  static inline ipcp_method
  ipcp_methodlist_method (ipcp_methodlist_p wl) 
  {
    return wl->method_p;   
  }
  
+ /* Sets worklist's node to point to mt.  */
  static inline void
  ipcp_methodlist_method_set (ipcp_methodlist_p wl, ipcp_method mt)
  {
    wl->method_p = mt;
  }
  
+ /* Gets next worklist's node.  */
  static inline ipcp_methodlist_p
  ipcp_methodlist_next_method (ipcp_methodlist_p wl) 
  { 
    return wl->next_method;    
  }
  
+ /* Sets worklist node wl1 to point to  worklist node wl2.  */
  static inline void
  ipcp_methodlist_next_method_set (ipcp_methodlist_p wl1, ipcp_methodlist_p wl2) 
  { 
*************** ipcp_remove_method (ipcp_methodlist_p *w
*** 312,349 ****
    return ipcp_methodlist_method (first);
  }
  
! /* callsite interface.  */
  static inline int 
  ipcp_callsite_param_count (ipcp_callsite cs) 
  { 
    return ((struct ipa_edge *)cs->aux)->ipcp_param_num;
  }
  
  static inline void 
  ipcp_callsite_param_count_set (ipcp_callsite cs, int i) 
  {
    ((struct ipa_edge *)cs->aux)->ipcp_param_num = i;
  }
  
  static inline struct ipcp_jump_func *
  ipcp_callsite_param (ipcp_callsite cs, int i) 
  {
    return &(((struct ipa_edge *)cs->aux)->ipcp_param_map[i]);
  }
  
  static inline ipcp_method 
  ipcp_callsite_callee (ipcp_callsite cs)
  {
    return cs->callee;
  }
  
! 
  static inline void
  ipcp_callsite_param_set_type (ipcp_callsite cs, int i,  enum Jfunc_type type1) 
  {
    ((struct ipa_edge *)cs->aux)->ipcp_param_map[i].type = type1;
  }
  
  static inline void
  ipcp_callsite_param_set_info_type (ipcp_callsite cs, int i, 
  				   ipcp_int *info_type1) 
--- 325,368 ----
    return ipcp_methodlist_method (first);
  }
  
! /* ipcp_callsite interface.  */
! 
! /* Gets how many arguments the callsite has.  */
  static inline int 
  ipcp_callsite_param_count (ipcp_callsite cs) 
  { 
    return ((struct ipa_edge *)cs->aux)->ipcp_param_num;
  }
  
+ /* Sets how many arguments the callsite has.  */
  static inline void 
  ipcp_callsite_param_count_set (ipcp_callsite cs, int i) 
  {
    ((struct ipa_edge *)cs->aux)->ipcp_param_num = i;
  }
  
+ /* Gets the jump function for argument i of callsite cs.  */
  static inline struct ipcp_jump_func *
  ipcp_callsite_param (ipcp_callsite cs, int i) 
  {
    return &(((struct ipa_edge *)cs->aux)->ipcp_param_map[i]);
  }
  
+ /* Gets the callee of callsite cs.  */
  static inline ipcp_method 
  ipcp_callsite_callee (ipcp_callsite cs)
  {
    return cs->callee;
  }
  
! /* Sets the jump function's type for argument i of callsite cs.  */
  static inline void
  ipcp_callsite_param_set_type (ipcp_callsite cs, int i,  enum Jfunc_type type1) 
  {
    ((struct ipa_edge *)cs->aux)->ipcp_param_map[i].type = type1;
  }
  
+ /* Sets the jump function's info type for argument i of callsite cs.  */
  static inline void
  ipcp_callsite_param_set_info_type (ipcp_callsite cs, int i, 
  				   ipcp_int *info_type1) 
*************** ipcp_callsite_param_set_info_type (ipcp_
*** 351,356 ****
--- 370,376 ----
    ipcp_jf_set_info_type (ipcp_callsite_param (cs, i), info_type1);
  }
  
+ /* Allocates space for the jump functions.  */
  static inline void
  ipcp_callsite_param_map_create (ipcp_callsite cs) 
  {
*************** ipcp_callsite_param_map_create (ipcp_cal
*** 358,369 ****
--- 378,391 ----
      xcalloc (ipcp_callsite_param_count (cs), sizeof (struct ipcp_jump_func));   
  }
  
+ /* Returns the call expr tree realted to callsite cs.  */
  static inline tree
  ipcp_callsite_tree (ipcp_callsite cs) 
  {
    return cs->call_expr;
  }
  
+ /* Gets the caller of callsite cs.  */
  static inline ipcp_method 
  ipcp_callsite_caller (ipcp_callsite cs) 
  {
*************** ipcp_callsite_compute_param (ipcp_callsi
*** 445,450 ****
--- 467,501 ----
  
  /* ipcp_method interface.  */
  
+ /* Allocate and initialize ipa_node structure.  */
+ static inline void
+ ipa_node_create (ipcp_method node)
+ {
+    node->aux = xcalloc (1, sizeof (struct ipa_node));
+ }
+ 
+ /* Returns true if node is a cloned/versioned node.  */
+ static inline bool
+ ipcp_method_is_cloned (ipcp_method node)
+ {
+   return (ipcp_method_orig_node (node) != NULL);
+ }
+ 
+ /* Get orig node of method mt.  */
+ static inline ipcp_method
+ ipcp_method_orig_node (ipcp_method mt)
+ {
+   return ((struct ipa_node *)mt->aux)->ipcp_orig_node;
+ }
+ 
+ /* Sets orig node of method node.  */
+ static inline void
+ ipcp_method_set_orig_node (ipcp_method node, ipcp_method orig_node)
+ {
+   ((struct ipa_node *)node->aux)->ipcp_orig_node = orig_node;
+ }
+ 
+ /* Create cval structure for method mt.  */
  static inline void
  ipcp_formal_create (ipcp_method mt) 
  {
*************** ipcp_formal_create (ipcp_method mt) 
*** 452,469 ****
--- 503,523 ----
      xcalloc (ipcp_method_formal_count (mt), sizeof (struct ipcp_formal));
  }
  
+ /* Get number of formals of method mt.  */
  static inline int
  ipcp_method_formal_count (ipcp_method mt)
  {
    return ((struct ipa_node *)mt->aux)->ipcp_arg_num;
  }
  
+ /* Set number of formals of method mt.  */
  static inline void 
  ipcp_method_formal_count_set (ipcp_method mt, int i) 
  {
    ((struct ipa_node *)mt->aux)->ipcp_arg_num = i;
  }
  
+ /* Set cval structure of i-th formal of mt to cval.  */
  static inline void 
  ipcp_method_cval_set (ipcp_method mt, int i, struct ipcp_formal *cval)             
  {
*************** ipcp_method_cval_set (ipcp_method mt, in
*** 471,482 ****
--- 525,538 ----
    ipcp_cval_set_cvalue ( ipcp_method_cval (mt, i), ipcp_cval_get_cvalue (cval));     
  }
  
+ /* Get cval structure of i-th formal of mt.  */
  static inline struct ipcp_formal *
  ipcp_method_cval (ipcp_method mt, int info_type) 
  {
    return &(((struct ipa_node *)mt->aux)->ipcp_cval[info_type]);
  }
  
+ /* Set type of cval structure of formal i of mt to cval_type.  */
  static inline void 
  ipcp_method_cval_set_cvalue_type (ipcp_method mt, int i, 
  				  enum Cvalue_type cval_type)  
*************** ipcp_method_cval_set_cvalue_type (ipcp_m
*** 484,501 ****
--- 540,560 ----
    ((struct ipa_node *)mt->aux)->ipcp_cval[i].cvalue_type = cval_type;
  }
  
+ /* Returns whether i-th formal of mt is modified in mt.  */
  static inline bool 
  ipcp_method_is_modified (ipcp_method mt, int i)
  {
    return ((struct ipa_node *)mt->aux)->ipcp_mod[i].mod;
  }
  
+ /* Get param tree of i-th formal of mt.  */
  static inline tree 
  ipcp_method_get_tree (ipcp_method mt, int i)
  {
    return ((struct ipa_node *)mt->aux)->ipcp_param_tree[i].param_tree;
  }
  
+ /* Create tree map structure of mt.  */
  static inline void 
  ipcp_method_tree_map_create (ipcp_method mt)
  { 
*************** ipcp_method_tree_map_create (ipcp_method
*** 503,508 ****
--- 562,568 ----
      xcalloc (ipcp_method_formal_count (mt), sizeof (struct ipcp_tree_map));
  }
  
+ /* Create modify structure of mt.  */
  static inline void  
  ipcp_method_modify_create (ipcp_method mt) 
  { 
*************** ipcp_method_modify_create (ipcp_method m
*** 510,537 ****
      xcalloc (ipcp_method_formal_count (mt), sizeof (struct ipcp_modify));
  }
  
  static inline void  
  ipcp_method_modify_set (ipcp_method mt, int i, bool val)
  {
    ((struct ipa_node *)mt->aux)->ipcp_mod[i].mod = val;
  }
  
! static inline ipcp_callsite
! ipcp_cs_get_first (ipcp_method mt)
! {
!   return mt->callees;
! }
! 
! static inline ipcp_callsite
! ipcp_cs_get_next (ipcp_callsite cs)
  {
!   return cs->next_callee;
! }
  
- static inline bool 
- ipcp_cs_not_last (ipcp_callsite cs) 
- {
-   return (cs != NULL);
  }
  
  /* Returning the parameter index of the ptree.  */
--- 570,590 ----
      xcalloc (ipcp_method_formal_count (mt), sizeof (struct ipcp_modify));
  }
  
+ /* Set modify of i-th formal of mt.  */
  static inline void  
  ipcp_method_modify_set (ipcp_method mt, int i, bool val)
  {
    ((struct ipa_node *)mt->aux)->ipcp_mod[i].mod = val;
  }
  
! static void
! ipa_cloned_create (ipcp_method orig_node, ipcp_method new_node)
  {
!   ipa_node_create (new_node);
!   ipcp_method_set_orig_node (new_node, orig_node);
!   ipcp_method_formal_compute_count (new_node);
!   ipcp_method_compute_tree_map (new_node);
  
  }
  
  /* Returning the parameter index of the ptree.  */
*************** ipcp_modify_walk_tree (tree *tp, void *d
*** 702,714 ****
  	}
        break;
      case ADDR_EXPR:
        if( TREE_CODE (TREE_OPERAND (t, 0)) == PARM_DECL )
          {
  	  i = ipcp_method_tree_map (mt, TREE_OPERAND (t, 0));
            ipcp_method_modify_set (mt, i, true);
          }
        break;
!     case ASM_EXPR:   
        for (j = 0; j < ipcp_method_formal_count (mt); j++)
  	{
  	  ipcp_method_modify_set (mt, j, true);
--- 755,770 ----
  	}
        break;
      case ADDR_EXPR:
+       /* If the parametr's address is taken, 
+          it could be modified.  */
        if( TREE_CODE (TREE_OPERAND (t, 0)) == PARM_DECL )
          {
  	  i = ipcp_method_tree_map (mt, TREE_OPERAND (t, 0));
            ipcp_method_modify_set (mt, i, true);
          }
        break;
!     case ASM_EXPR:  
!       /* Asm code could modify any of the parameters.  */
        for (j = 0; j < ipcp_method_formal_count (mt); j++)
  	{
  	  ipcp_method_modify_set (mt, j, true);
*************** ipcp_method_compute_modify (ipcp_method 
*** 743,749 ****
  
    ipcp_method_modify_init (mt);
    decl = mt->decl;
!   if (!tree_inlinable_function_p (decl))
      {
        for (j = 0; j < ipcp_method_formal_count (mt); j++)
  	{
--- 799,807 ----
  
    ipcp_method_modify_init (mt);
    decl = mt->decl;
!   /* ??? Handle pending sizes case. Set all parameters 
!      of the method to be modified.  */  
!   if (DECL_UNINLINABLE (decl))
      {
        for (j = 0; j < ipcp_method_formal_count (mt); j++)
  	{
*************** ipcp_walk_statements (ipcp_method mt, tr
*** 784,803 ****
      }
  }
  
! /* jump function interface.  */
  
  static inline enum Jfunc_type
  get_type (struct ipcp_jump_func *jf)
  {
    return jf->type;
  }
  
  static inline ipcp_int *
  ipcp_jf_get_info_type (struct ipcp_jump_func *jf) 
  {
    return &(jf->info_type);
  }
  
  static inline void 
  ipcp_jf_set_info_type (struct ipcp_jump_func *jf, ipcp_int *info_type)
  {
--- 842,864 ----
      }
  }
  
! /* ipcp_jump_func interface.  */
  
+ /* Get type of jump function jf.  */
  static inline enum Jfunc_type
  get_type (struct ipcp_jump_func *jf)
  {
    return jf->type;
  }
  
+ /* Get info type of jump function jf.  */
  static inline ipcp_int *
  ipcp_jf_get_info_type (struct ipcp_jump_func *jf) 
  {
    return &(jf->info_type);
  }
  
+ /* Set info type of jump function jf.  */
  static inline void 
  ipcp_jf_set_info_type (struct ipcp_jump_func *jf, ipcp_int *info_type)
  {
*************** ipcp_jf_set_info_type (struct ipcp_jump_
*** 805,810 ****
--- 866,872 ----
    jf->info_type.high = info_type->high;
  }
  
+ /* Transform const_val to an ipcp_int type.  */
  static inline void 
  ipcp_int2ipcp_int (ipcp_int *info_type, int const_val)
  {
*************** ipcp_int2ipcp_int (ipcp_int *info_type, 
*** 812,817 ****
--- 874,880 ----
    info_type->high = 0;
  }
  
+ /* Returns low part of info_type as an int.  */
  static inline int
  ipcp_get_int (ipcp_int *info_type)
  {
*************** ipcp_get_int (ipcp_int *info_type)
*** 820,831 ****
--- 883,896 ----
  
  /* cval interface.  */
  
+ /* Sets type to cval.  */
  static inline void 
  ipcp_cval_set_cvalue_type (struct ipcp_formal *cval, enum Cvalue_type type)
  {
    cval->cvalue_type = type;
  }
  
+ /* Sets value to cval.  */
  static inline void
  ipcp_cval_set_cvalue (struct ipcp_formal *cval, ipcp_int *value) 
  {
*************** ipcp_cval_set_cvalue (struct ipcp_formal
*** 833,850 ****
--- 898,918 ----
    cval->cvalue.high = value->high;
  }
  
+ /* Returns value part of cval.  */
  static inline ipcp_int *
  ipcp_cval_get_cvalue (struct ipcp_formal *cval)
  {
    return &(cval->cvalue);
  }
  
+ /* Returns type part of cval.  */
  static inline enum Cvalue_type
  ipcp_cval_get_cvalue_type (struct ipcp_formal *cval)
  {
    return cval->cvalue_type;
  }
  
+ /* Returns true if const_val1 and const_val2 are equal.  */
  static inline bool 
  ipcp_cval_equal_cvalues (ipcp_int *const_val1, ipcp_int *const_val2)
  {
*************** ipcp_cval_equal_cvalues (ipcp_int *const
*** 858,864 ****
     it the first statemant in the function FN
     parse tree.
     PARM1 is the lhs of the assignment and
!    is the rhs. */
  static void 
  constant_val_insert (tree fn, tree parm1, tree val)
  {
--- 926,932 ----
     it the first statemant in the function FN
     parse tree.
     PARM1 is the lhs of the assignment and
!    val is the rhs. */
  static void 
  constant_val_insert (tree fn, tree parm1, tree val)
  {
*************** constant_val_insert (tree fn, tree parm1
*** 868,874 ****
    tree stmt_list_body;
    tree body;
    
!   tree rhs = convert (TREE_TYPE (parm1), val);
    if (rhs == error_mark_node)
      return;
    init_stmt = build (MODIFY_EXPR, TREE_TYPE (parm1), parm1, rhs);
--- 936,942 ----
    tree stmt_list_body;
    tree body;
    
!   tree rhs = fold_convert (TREE_TYPE (parm1), val);
    if (rhs == error_mark_node)
      return;
    init_stmt = build (MODIFY_EXPR, TREE_TYPE (parm1), parm1, rhs);
*************** ipcp_propagate_const (ipcp_method mt, in
*** 898,941 ****
    parm_tree = ipcp_method_get_tree (mt, param);
    const_val = build_int_cst_wide (NULL_TREE, cvalue->low, cvalue->high);   
    if (parm_tree != NULL)
!     if (const_val != NULL)
!       if(fndecl != NULL)
! 	constant_val_insert (fndecl, parm_tree, const_val);   
! }
! 
! /* Propagates the constant parameters found by ipcp_stage2()
!    to the function's code.  */
! static void 
! ipcp_stage3 (void)
! {
!   ipcp_method node;
!   int i;
!   ipcp_int *cvalue;
!  
!   for (node = cgraph_nodes; node; node = node->next)
!     {  
!       /* Propagation of the constant is forbidden in 
!          certain conditions.  */
!       if (ipcp_method_dont_propagate (node))
!         continue;
!       for(i=0; i < ipcp_method_formal_count (node); i++)
! 	{
! 	  if (ipcp_cval_get_cvalue_type (ipcp_method_cval (node, i)) 
! 	      == CONST_VALUE) 
! 	    {
! 	      cvalue = ipcp_cval_get_cvalue (ipcp_method_cval (node, i));
!               ipcp_propagate_const (node, i, cvalue);                       
! 	    }
! 	}
!     }
  }
  
! /* Initialization and computation of IPA Constant
!    Propagation data structures. It is an IntraProcedural
     analysis of methods,which gathers information to be propagated
     later on.  */
  static void 
! ipcp_stage1 (void)
  {
     ipcp_method node;
     ipcp_callsite cs;
--- 966,981 ----
    parm_tree = ipcp_method_get_tree (mt, param);
    const_val = build_int_cst_wide (NULL_TREE, cvalue->low, cvalue->high);   
    if (parm_tree != NULL)
!     if(fndecl != NULL)
!       constant_val_insert (fndecl, parm_tree, const_val);   
  }
  
! /* Initialization and computation of IPCP data structures. 
!    It is an intraprocedural
     analysis of methods,which gathers information to be propagated
     later on.  */
  static void 
! ipcp_init_stage (void)
  {
     ipcp_method node;
     ipcp_callsite cs;
*************** ipcp_stage1 (void)
*** 950,962 ****
     for (node = cgraph_nodes; node; node = node->next)
       {
         /* building jump functions  */
!        for (cs = ipcp_cs_get_first (node); ipcp_cs_not_last (cs);
!             cs = ipcp_cs_get_next (cs))
           {
             ipcp_callsite_compute_count (cs);
             if (ipcp_callsite_param_count (cs) 
  	       != ipcp_method_formal_count (cs->callee))
               {
                 ipcp_callsite_param_count_set (cs, 0);
                 ipcp_method_formal_count_set (cs->callee, 0);
               }
--- 990,1004 ----
     for (node = cgraph_nodes; node; node = node->next)
       {
         /* building jump functions  */
!        for (cs =  FIRST_CALLEE (node); CALLEE_NOT_LAST (cs);
!             cs = NEXT_CALLEE (cs))
           {
             ipcp_callsite_compute_count (cs);
             if (ipcp_callsite_param_count (cs) 
  	       != ipcp_method_formal_count (cs->callee))
               {
+                /* Handles the cases of functions with 
+                   a variable number of parameters.  */
                 ipcp_callsite_param_count_set (cs, 0);
                 ipcp_method_formal_count_set (cs->callee, 0);
               }
*************** ipcp_stage1 (void)
*** 969,975 ****
  /* Interprocedural analysis. The algorithm propagates constants from
     the caller's parameters to the callee's arguments.  */
  static void 
! ipcp_stage2 (void)
  {
    int i;
    struct ipcp_formal cval1 = {0, {0, 0}}, cval = {0, {0, 0}};
--- 1011,1017 ----
  /* Interprocedural analysis. The algorithm propagates constants from
     the caller's parameters to the callee's arguments.  */
  static void 
! ipcp_iterate_stage (void)
  {
    int i;
    struct ipcp_formal cval1 = {0, {0, 0}}, cval = {0, {0, 0}};
*************** ipcp_stage2 (void)
*** 985,992 ****
    while (ipcp_methodlist_not_empty (wl))
      {
        mt = ipcp_remove_method (&wl);
!       for (cs = ipcp_cs_get_first (mt); ipcp_cs_not_last (cs); 
! 	   cs = ipcp_cs_get_next (cs))
  	{
  	  callee = ipcp_callsite_callee (cs);
  	  for (i = 0; i < ipcp_callsite_param_count (cs); i++)
--- 1027,1034 ----
    while (ipcp_methodlist_not_empty (wl))
      {
        mt = ipcp_remove_method (&wl);
!       for (cs = FIRST_CALLEE (mt); CALLEE_NOT_LAST (cs); 
! 	   cs = NEXT_CALLEE (cs))
  	{
  	  callee = ipcp_callsite_callee (cs);
  	  for (i = 0; i < ipcp_callsite_param_count (cs); i++)
*************** ipcp_stage2 (void)
*** 1008,1014 ****
      }
  }
  
! /* Allocate and initialize ipa_node structure.  */
  static void
  ipa_nodes_create (void)
  {
--- 1050,1110 ----
      }
  }
  
! /* Propagates the constant parameters found by ipcp_iterate_stage()
!    to the function's code.  */
! static void 
! ipcp_insert_stage (void)
! {
!   ipcp_method node, node1 = NULL;
!   int i, const_param;
!   ipcp_int *cvalue;
!   varray_type redirect_callers;
!   ipcp_callsite cs;
!   int node_callers;
!    
!   for (node = cgraph_nodes; node; node = node->next)
!     {  
!       /* Propagation of the constant is forbidden in 
!          certain conditions.  */
!       if (ipcp_method_dont_insert_const (node))
!         continue;
!       const_param=0;
!       for(i = 0; i < ipcp_method_formal_count (node); i++)
! 	{
! 	  if (ipcp_cval_get_cvalue_type (ipcp_method_cval (node, i)) 
! 	      == CONST_VALUE) 
! 	    {
! 	      cvalue = ipcp_cval_get_cvalue (ipcp_method_cval (node, i));             
!               if (const_param == 0)
!                 {
!                   /* Compute how many callers node has.  */
!                   node_callers = 0;
!                   for (cs = node->callers; cs != NULL;
!                        cs = cs->next_caller)
!                     node_callers++;  
!                   VARRAY_GENERIC_PTR_INIT (redirect_callers, node_callers, "redirect_callers");
!                   for (cs = node->callers; cs != NULL;
!                        cs = cs->next_caller)
!                     VARRAY_PUSH_GENERIC_PTR (redirect_callers, cs);
!                   /* Redirecting all the callers of the node to the 
!                      new versioned node.  */
!                   node1 = cgraph_function_versioning (node, redirect_callers); 
!                   VARRAY_CLEAR (redirect_callers);
!                   if (node1 == NULL)
!                     break;
!                   ipa_cloned_create (node, node1);
!                 }
!               ipcp_propagate_const (node1, i, cvalue);
!               const_param++;           
!              
! 	    }
! 	}
!     }
!   ipcp_update_callgraph ();
! }
! 
! /* Allocate and initialize ipa_node structure for all
!    nodes in callgraph.  */
  static void
  ipa_nodes_create (void)
  {
*************** ipa_nodes_create (void)
*** 1016,1026 ****
    
    for (node = cgraph_nodes; node; node = node->next)
       {
!        node->aux = xcalloc (1, sizeof (struct ipa_node));     
!        ((struct ipa_node *)node->aux)->ipcp_cval = NULL;
!        ((struct ipa_node *)node->aux)->ipcp_arg_num =0;
!        ((struct ipa_node *)node->aux)-> ipcp_param_tree = NULL;
!        ((struct ipa_node *)node->aux)->ipcp_mod = NULL;
       }
  }
  
--- 1112,1118 ----
    
    for (node = cgraph_nodes; node; node = node->next)
       {
!        ipa_node_create (node);
       }
  }
  
*************** ipa_edges_create (void)
*** 1033,1044 ****
    
    for (node = cgraph_nodes; node; node = node->next)
       {
!         for (cs = ipcp_cs_get_first (node); ipcp_cs_not_last (cs);
! 	     cs = ipcp_cs_get_next (cs))
  	  {
  	    cs->aux = xcalloc (1, sizeof (struct ipa_edge));
- 	    ((struct ipa_edge *)cs->aux)->ipcp_param_map = NULL;
- 	    ((struct ipa_edge *)cs->aux)->ipcp_param_num = 0;
  	  }
       
       }
--- 1125,1134 ----
    
    for (node = cgraph_nodes; node; node = node->next)
       {
!         for (cs = FIRST_CALLEE (node); CALLEE_NOT_LAST (cs);
! 	     cs = NEXT_CALLEE (cs))
  	  {
  	    cs->aux = xcalloc (1, sizeof (struct ipa_edge));
  	  }
       
       }
*************** ipa_edges_free (void)
*** 1066,1073 ****
    
    for (node = cgraph_nodes; node; node = node->next)
       {
!         for (cs = ipcp_cs_get_first (node); ipcp_cs_not_last (cs);
! 	     cs = ipcp_cs_get_next (cs))
  	  {
  	   
  	    free (cs->aux);
--- 1156,1163 ----
    
    for (node = cgraph_nodes; node; node = node->next)
       {
!         for (cs = FIRST_CALLEE (node); CALLEE_NOT_LAST (cs);
! 	     cs = NEXT_CALLEE (cs))
  	  {
  	   
  	    free (cs->aux);
*************** ipa_edges_free (void)
*** 1077,1083 ****
       }
  }
  
! /* The IPA constant propagation driver.  */
  void 
  ipcp_driver (void)
  {
--- 1167,1173 ----
       }
  }
  
! /* The IPCP driver.  */
  void 
  ipcp_driver (void)
  {
*************** ipcp_driver (void)
*** 1087,1105 ****
       fprintf (cgraph_dump_file, "\nIPA constant propagation start:\n");
    ipa_nodes_create ();
    ipa_edges_create ();
!   ipcp_stage1 ();
    if (cgraph_dump_file)
      {
        fprintf (cgraph_dump_file, "\nIPA structures before propagation:\n");
        ipcp_structures_print (cgraph_dump_file); 
      }
!   ipcp_stage2 ();
    if (cgraph_dump_file)
      {
        fprintf (cgraph_dump_file, "\nIPA structures after propagation:\n");
        ipcp_structures_print (cgraph_dump_file);      
      }
!   ipcp_stage3 ();  
    ipcp_free (); 
    ipa_nodes_free ();
    ipa_edges_free ();
--- 1177,1200 ----
       fprintf (cgraph_dump_file, "\nIPA constant propagation start:\n");
    ipa_nodes_create ();
    ipa_edges_create ();
!   /* 1. Call the init stage to initialize 
!      the ipa_node and ipa_edge structures.  */
!   ipcp_init_stage ();
    if (cgraph_dump_file)
      {
        fprintf (cgraph_dump_file, "\nIPA structures before propagation:\n");
        ipcp_structures_print (cgraph_dump_file); 
      }
!   /* 2. Do the interprocedural propagation.  */
!   ipcp_iterate_stage ();
    if (cgraph_dump_file)
      {
        fprintf (cgraph_dump_file, "\nIPA structures after propagation:\n");
        ipcp_structures_print (cgraph_dump_file);      
      }
!   /* 3. Insert the constants found to the functions.  */
!   ipcp_insert_stage ();
!   /* Free all IPCP structures.  */
    ipcp_free (); 
    ipa_nodes_free ();
    ipa_edges_free ();
*************** ipcp_driver (void)
*** 1107,1113 ****
      fprintf (cgraph_dump_file, "\nIPA constant propagation end\n");
  }
  
! /* Frees the ipcp_method's IPA data structures.  */
  static void 
  ipcp_free (void)
  {
--- 1202,1208 ----
      fprintf (cgraph_dump_file, "\nIPA constant propagation end\n");
  }
  
! /* Frees the ipcp_method's IPCP data structures.  */
  static void 
  ipcp_free (void)
  {
*************** ipcp_free (void)
*** 1116,1142 ****
  
    for (node = cgraph_nodes; node; node = node->next)
       {
!        free (((struct ipa_node *)node->aux)->ipcp_cval);
!        free (((struct ipa_node *)node->aux)->ipcp_param_tree);
!        free (((struct ipa_node *)node->aux)->ipcp_mod); 
!        for (cs = ipcp_cs_get_first (node); ipcp_cs_not_last (cs);
!             cs = ipcp_cs_get_next (cs))
!          free (((struct ipa_edge *)cs->aux)->ipcp_param_map);
       }
  }
  
! /* Check conditions to forbid constant propagation.  */
  static bool
! ipcp_method_dont_propagate (ipcp_method mt)
  {
!   if (!tree_inlinable_function_p (mt->decl))
      return true;
    if (ipcp_method_contains_asm (mt))
      return true;
    return false;
  }
  
! /* Called by walk_tree to find. Returns true if an asm expr was found  */
  static tree 
  ipcp_asm_walk_tree (tree *tp, int *walk_subtrees ATTRIBUTE_UNUSED, 
  		    void *data ATTRIBUTE_UNUSED)
--- 1211,1248 ----
  
    for (node = cgraph_nodes; node; node = node->next)
       {
!        if (node->aux == NULL)
!          continue;
!        if (((struct ipa_node *)node->aux)->ipcp_cval)
!          free (((struct ipa_node *)node->aux)->ipcp_cval);      
!        if (((struct ipa_node *)node->aux)->ipcp_param_tree)
!          free (((struct ipa_node *)node->aux)->ipcp_param_tree);      
!        if (((struct ipa_node *)node->aux)->ipcp_mod)
!          free (((struct ipa_node *)node->aux)->ipcp_mod); 
!        for (cs = FIRST_CALLEE (node); CALLEE_NOT_LAST (cs);
!             cs = NEXT_CALLEE (cs))
!          {        
!            if (cs->aux)
!              if (((struct ipa_edge *)cs->aux)->ipcp_param_map)
!                free (((struct ipa_edge *)cs->aux)->ipcp_param_map);
!          }
       }
  }
  
! /* Check conditions to forbid constant insertion to the function.  */
  static bool
! ipcp_method_dont_insert_const (ipcp_method mt)
  {
!   /* ??? Handle pending sizes case.  */  
!   if (DECL_UNINLINABLE (mt->decl))
      return true;
+   /* Prevent propagation in the case of ASM calls.  */  
    if (ipcp_method_contains_asm (mt))
      return true;
    return false;
  }
  
! /* Called by walk_tree to find. Returns true if an asm expr was found.  */
  static tree 
  ipcp_asm_walk_tree (tree *tp, int *walk_subtrees ATTRIBUTE_UNUSED, 
  		    void *data ATTRIBUTE_UNUSED)
*************** ipcp_method_contains_asm (ipcp_method mt
*** 1169,1177 ****
    return false;
  }
  
! /* Debuffing interface.  */
  
! /* Prints all IPA constant propagation data structures.  */
  static void 
  ipcp_structures_print (FILE *f)
  {
--- 1275,1283 ----
    return false;
  }
  
! /* Debugging interface.  */
  
! /* Prints all IPCP data structures.  */
  static void 
  ipcp_structures_print (FILE *f)
  {
*************** ipcp_structures_print (FILE *f)
*** 1181,1186 ****
--- 1287,1293 ----
    ipcp_callsite_param_print (f);
  }
  
+ /* Prints ipcp_cval data structures.  */
  static void 
  ipcp_method_cval_print (FILE *f)
  {
*************** ipcp_method_cval_print (FILE *f)
*** 1200,1207 ****
                 fprintf (f, " param [%d]: ", i);
  	       fprintf (f, "type is CONST ");
                 cvalue = ipcp_cval_get_cvalue (ipcp_method_cval (node, i));
! 	       fprintf (f, "value is  %d %d\n",(int)cvalue->high, 
!                         (int)cvalue->low);
  	     }
  	   else if (ipcp_method_cval (node, i)->cvalue_type == TOP)
  	     fprintf(f, "param [%d]: type is TOP  \n", i);           
--- 1307,1315 ----
                 fprintf (f, " param [%d]: ", i);
  	       fprintf (f, "type is CONST ");
                 cvalue = ipcp_cval_get_cvalue (ipcp_method_cval (node, i));
!                fprintf (f, "value is  "HOST_WIDE_INT_PRINT_DEC
!                         " "HOST_WIDE_INT_PRINT_DEC" \n", 
!                         cvalue->high, cvalue->low);
  	     }
  	   else if (ipcp_method_cval (node, i)->cvalue_type == TOP)
  	     fprintf(f, "param [%d]: type is TOP  \n", i);           
*************** ipcp_method_cval_print (FILE *f)
*** 1211,1216 ****
--- 1319,1325 ----
       }
  }
  
+ /* Prints ipcp_tree_map data structures.  */
  static void  
  ipcp_method_tree_print (FILE *f) 
  {
*************** ipcp_method_tree_print (FILE *f) 
*** 1238,1243 ****
--- 1347,1353 ----
      }
  }
  
+ /* Printf ipcp_modify data structures.  */
  static void  
  ipcp_method_modify_print (FILE *f) 
  {
*************** ipcp_method_modify_print (FILE *f) 
*** 1260,1266 ****
        
      }
  }
!  
  static void  
  ipcp_callsite_param_print (FILE *f)
  {
--- 1370,1376 ----
        
      }
  }
! /* Prints ipcp_jump_func data structures.  */ 
  static void  
  ipcp_callsite_param_print (FILE *f)
  {
*************** ipcp_callsite_param_print (FILE *f)
*** 1274,1281 ****
    fprintf(f, "\nCALLSITE PARAM PRINT\n");
    for (node = cgraph_nodes; node; node = node->next)
      {    
!       for (cs = ipcp_cs_get_first (node); ipcp_cs_not_last (cs);
!             cs = ipcp_cs_get_next (cs))
           {
             fprintf (f, "callsite  %s ", cgraph_node_name (node));
             fprintf (f, "-> %s :: \n", cgraph_node_name (cs->callee));
--- 1384,1391 ----
    fprintf(f, "\nCALLSITE PARAM PRINT\n");
    for (node = cgraph_nodes; node; node = node->next)
      {    
!       for (cs = FIRST_CALLEE (node); CALLEE_NOT_LAST (cs);
!             cs =  NEXT_CALLEE (cs))
           {
             fprintf (f, "callsite  %s ", cgraph_node_name (node));
             fprintf (f, "-> %s :: \n", cgraph_node_name (cs->callee));
*************** ipcp_callsite_param_print (FILE *f)
*** 1290,1297 ****
                else if (type == CONST_IPATYPE)
                  {
                    fprintf (f, "CONST : ");
!                   fprintf (f, "high: %d low: %d\n",(int)info_type->high, 
!                            (int)info_type->low);
                  }
                else if (type == FORMAL_IPATYPE)
                  {
--- 1400,1408 ----
                else if (type == CONST_IPATYPE)
                  {
                    fprintf (f, "CONST : ");
!                   fprintf (f, " "HOST_WIDE_INT_PRINT_DEC
!                            " "HOST_WIDE_INT_PRINT_DEC" \n", 
!                            info_type->high, info_type->low);
                  }
                else if (type == FORMAL_IPATYPE)
                  {
*************** ipcp_callsite_param_print (FILE *f)
*** 1302,1304 ****
--- 1413,1473 ----
           }
      }
  }
+ 
+ /* Fix the callsites and the callgraph after function cloning was done.  */
+ static void 
+ ipcp_update_callgraph (void)
+ {
+   ipcp_method node, orig_callee;
+   ipcp_callsite cs;
+ 
+    for (node = cgraph_nodes; node; node = node->next)
+      {
+        /* want to fix only original nodes  */
+        if (ipcp_method_is_cloned (node))
+          continue;
+        for (cs = FIRST_CALLEE (node);  CALLEE_NOT_LAST (cs);
+             cs =  NEXT_CALLEE (cs))
+          {
+            if (ipcp_method_is_cloned (cs->callee))
+              {
+                /* Callee is a cloned node  */
+                orig_callee = ipcp_method_orig_node (cs->callee);
+                if (ipcp_redirect (cs))
+                  {
+                    cgraph_redirect_edge_callee (cs, orig_callee);
+                    TREE_OPERAND (TREE_OPERAND (cs->call_expr, 0), 0) = orig_callee->decl;
+                  }
+              }
+          }
+      }
+ }
+ 
+ /* Retruns true if this callsite should be redirected to 
+    the orig callee (instead of the cloned one).  */
+ static bool
+ ipcp_redirect (ipcp_callsite  cs)
+ {
+   ipcp_method caller, callee, orig_callee;
+   int i;
+   struct ipcp_jump_func *jump_func;
+   enum Jfunc_type type;
+   
+   caller = cs->caller;
+   callee = cs->callee;
+   orig_callee =  ipcp_method_orig_node (callee);
+   
+   for (i = 0; i < ipcp_method_formal_count (orig_callee); i++)
+      {
+        if (ipcp_cval_get_cvalue_type (ipcp_method_cval (orig_callee, i)) 
+            == CONST_VALUE) 
+          {
+            jump_func = ipcp_callsite_param (cs, i);
+            type = get_type (jump_func);
+            if (type != CONST_IPATYPE)
+              return true;
+          }
+      }
+   return false;
+ }
+ 
Index: ipa_prop.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Attic/ipa_prop.h,v
retrieving revision 1.1.2.1
diff -c -3 -p -r1.1.2.1 ipa_prop.h
*** ipa_prop.h	28 Aug 2004 17:26:34 -0000	1.1.2.1
--- ipa_prop.h	5 Oct 2004 08:35:56 -0000
***************
*** 1,6 ****
! /* Callgraph handling code.
!    Copyright (C) 2003, 2004 Free Software Foundation, Inc.
!    Contributed by Razya Ladelsky.
  
  This file is part of GCC.
  
--- 1,6 ----
! /* Interprocedural constant propagation
!    Copyright (C) 2004 Free Software Foundation, Inc.
!    Contributed by Razya Ladelsky <RAZYA@il.ibm.com>
  
  This file is part of GCC.
  
*************** struct ipcp_modify
*** 57,75 ****
  
  struct ipa_node
  {
    /* Array of cvals.  */
    struct ipcp_formal* GTY ((skip (""))) ipcp_cval;
-   int ipcp_arg_num;
    /* Mapping each parameter to its PARM_DECL tree.  */
    struct ipcp_tree_map* GTY ((skip (""))) ipcp_param_tree;
    /* Indicating which parameter is modified in its method.  */
    struct ipcp_modify* GTY ((skip (""))) ipcp_mod;
  };
  
  struct ipa_edge
  {
!   struct ipcp_jump_func* GTY ((skip (""))) ipcp_param_map;
    int ipcp_param_num;
  };
  void ipcp_driver (void);
  
--- 57,85 ----
  
  struct ipa_node
  {
+   /* Number of formal parameters of this method.  When set to 0,
+    this method's parameters would not be analyzed by the different 
+    stages of IPA CP.  */
+   int ipcp_arg_num;
    /* Array of cvals.  */
    struct ipcp_formal* GTY ((skip (""))) ipcp_cval;
    /* Mapping each parameter to its PARM_DECL tree.  */
    struct ipcp_tree_map* GTY ((skip (""))) ipcp_param_tree;
    /* Indicating which parameter is modified in its method.  */
    struct ipcp_modify* GTY ((skip (""))) ipcp_mod;
+    /* Only for cloned nodes this field would not be NULL, 
+   it points to the node that IPA cp cloned from.  */ 
+   struct cgraph_node *ipcp_orig_node;
  };
  
  struct ipa_edge
  {
!   /* Number of actual arguments in this callsite.  When set to 0,
!    this callsite's parameters would not be analyzed by the different 
!    stages of IPA CP.  */
    int ipcp_param_num;
+   /* Array of the callsite's jump function of each parameter.  */
+   struct ipcp_jump_func* GTY ((skip (""))) ipcp_param_map; 
  };
  void ipcp_driver (void);
  
Index: langhooks-def.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/langhooks-def.h,v
retrieving revision 1.34.2.29.2.7
diff -c -3 -p -r1.34.2.29.2.7 langhooks-def.h
*** langhooks-def.h	18 Sep 2004 22:47:14 -0000	1.34.2.29.2.7
--- langhooks-def.h	5 Oct 2004 08:35:56 -0000
*************** extern void lhd_tree_inlining_end_inlini
*** 84,90 ****
  extern tree lhd_tree_inlining_convert_parm_for_inlining (tree, tree, tree, int);
  extern void lhd_initialize_diagnostics (struct diagnostic_context *);
  extern tree lhd_callgraph_analyze_expr (tree *, int *, tree);
! 
  
  /* Declarations for tree gimplification hooks.  */
  extern int lhd_gimplify_expr (tree *, tree *, tree *);
--- 84,90 ----
  extern tree lhd_tree_inlining_convert_parm_for_inlining (tree, tree, tree, int);
  extern void lhd_initialize_diagnostics (struct diagnostic_context *);
  extern tree lhd_callgraph_analyze_expr (tree *, int *, tree);
! extern int lhd_tree_versioning_cannot_version_tree_fn (tree *);
  
  /* Declarations for tree gimplification hooks.  */
  extern int lhd_gimplify_expr (tree *, tree *, tree *);
*************** extern int lhd_gimplify_expr (tree *, tr
*** 170,175 ****
--- 170,184 ----
    LANG_HOOKS_TREE_INLINING_CONVERT_PARM_FOR_INLINING \
  }
  
+ /* Tree versioning hooks.  */
+ #define LANG_HOOKS_TREE_VERSIONING_CANNOT_VERSION_TREE_FN \
+   lhd_tree_versioning_cannot_version_tree_fn
+ 
+ #define LANG_HOOKS_TREE_VERSIONING_INITIALIZER { \
+ LANG_HOOKS_TREE_VERSIONING_CANNOT_VERSION_TREE_FN \
+ }
+ 
+ 
  #define LANG_HOOKS_CALLGRAPH_ANALYZE_EXPR lhd_callgraph_analyze_expr
  #define LANG_HOOKS_CALLGRAPH_EXPAND_FUNCTION NULL
  
*************** extern tree lhd_make_node (enum tree_cod
*** 292,297 ****
--- 301,307 ----
    LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE, \
    LANG_HOOKS_FUNCTION_INITIALIZER, \
    LANG_HOOKS_TREE_INLINING_INITIALIZER, \
+   LANG_HOOKS_TREE_VERSIONING_INITIALIZER, \
    LANG_HOOKS_CALLGRAPH_INITIALIZER, \
    LANG_HOOKS_TREE_DUMP_INITIALIZER, \
    LANG_HOOKS_DECLS, \
Index: langhooks.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/langhooks.c,v
retrieving revision 1.31.2.22.2.7
diff -c -3 -p -r1.31.2.22.2.7 langhooks.c
*** langhooks.c	18 Sep 2004 22:47:14 -0000	1.31.2.22.2.7
--- langhooks.c	5 Oct 2004 08:35:56 -0000
*************** lhd_tree_inlining_cannot_inline_tree_fn 
*** 317,322 ****
--- 317,332 ----
    return 0;
  }
  
+ /* lang_hooks.tree_versioning.cannot_version_tree_fn is called to
+    determine whether there are language-specific reasons for not
+    creating a new version to a given function.  */
+ 
+ int
+ lhd_tree_versioning_cannot_version_tree_fn (tree *fnp ATTRIBUTE_UNUSED)
+ {
+   return 0;
+ }
+ 
  /* lang_hooks.tree_inlining.disregard_inline_limits is called to
     determine whether a function should be considered for inlining even
     if it would exceed inlining limits.  */
Index: langhooks.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/langhooks.h,v
retrieving revision 1.42.2.30.2.6
diff -c -3 -p -r1.42.2.30.2.6 langhooks.h
*** langhooks.h	18 Sep 2004 22:47:14 -0000	1.42.2.30.2.6
--- langhooks.h	5 Oct 2004 08:35:56 -0000
*************** struct lang_hooks_for_tree_inlining
*** 47,52 ****
--- 47,59 ----
    tree (*convert_parm_for_inlining) (tree, tree, tree, int);
  };
  
+ 
+ struct lang_hooks_for_tree_versioning
+ {
+   int (*cannot_version_tree_fn) (tree *);
+ };
+ 
+ 
  struct lang_hooks_for_callgraph
  {
    /* The node passed is a language-specific tree node.  If its contents
*************** struct lang_hooks
*** 387,392 ****
--- 394,401 ----
    struct lang_hooks_for_functions function;
  
    struct lang_hooks_for_tree_inlining tree_inlining;
+   
+   struct lang_hooks_for_tree_versioning tree_versioning;  
  
    struct lang_hooks_for_callgraph callgraph;
  
Index: tree-gimple.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/tree-gimple.h,v
retrieving revision 2.1.2.4
diff -c -3 -p -r2.1.2.4 tree-gimple.h
*** tree-gimple.h	9 Sep 2004 10:55:42 -0000	2.1.2.4
--- tree-gimple.h	5 Oct 2004 08:35:56 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 28,33 ****
--- 28,34 ----
  extern tree create_tmp_var_raw (tree, const char *);
  extern tree create_tmp_var_name (const char *);
  extern tree create_tmp_var (tree, const char *);
+ extern tree create_function_name (const char *);
  extern tree get_initialized_tmp_var (tree, tree *, tree *);
  extern tree get_formal_tmp_var (tree, tree *);
  extern void declare_tmp_vars (tree, tree);
Index: tree-inline.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/tree-inline.c,v
retrieving revision 1.26.2.83.2.12
diff -c -3 -p -r1.26.2.83.2.12 tree-inline.c
*** tree-inline.c	25 Sep 2004 23:17:57 -0000	1.26.2.83.2.12
--- tree-inline.c	5 Oct 2004 08:35:56 -0000
*************** typedef struct inline_data
*** 102,107 ****
--- 102,109 ----
    bool cloning_p;
    /* Similarly for saving function body.  */
    bool saving_p;
+   /* Versioning function is slightly different from inlining. */
+   bool versioning_p;
    /* Hash table used to prevent walk_tree from visiting the same node
       umpteen million times.  */
    htab_t tree_pruner;
*************** static tree mark_local_for_remap_r (tree
*** 138,143 ****
--- 140,149 ----
  static void unsave_expr_1 (tree);
  static tree unsave_r (tree *, int *, void *);
  static void declare_inline_vars (tree bind_expr, tree vars);
+ static tree copy_arguments_for_versioning (tree, inline_data *);
+ static tree version_body (tree, inline_data *);
+ static tree copy_static_chain (tree, inline_data *);
+ static bool function_versionable_p (tree);
  
  /* Insert a tree->tree mapping for ID.  Despite the name suggests
     that the trees should be variables, it is used for more than that.  */
*************** remap_decl (tree decl, inline_data *id)
*** 169,176 ****
    if (!n)
      {
        /* Make a copy of the variable or label.  */
!       tree t = copy_decl_for_inlining (decl, fn, VARRAY_TREE (id->fns, 0));
! 
        /* Remap types, if necessary.  */
        TREE_TYPE (t) = remap_type (TREE_TYPE (t), id);
        if (TREE_CODE (t) == TYPE_DECL)
--- 175,185 ----
    if (!n)
      {
        /* Make a copy of the variable or label.  */
!       tree t;
!       if (!id->versioning_p)
!         t = copy_decl_for_inlining (decl, fn, VARRAY_TREE (id->fns, 0));
!       else
!         t = copy_decl_for_versioning (decl, fn, VARRAY_TREE (id->fns, 0));
        /* Remap types, if necessary.  */
        TREE_TYPE (t) = remap_type (TREE_TYPE (t), id);
        if (TREE_CODE (t) == TYPE_DECL)
*************** copy_body_r (tree *tp, int *walk_subtree
*** 460,468 ****
      gcc_assert (DECL_EXTERNAL (*tp) || TREE_STATIC (*tp));
  #endif
  
!   /* If this is a RETURN_EXPR, change it into a MODIFY_EXPR and a
!      GOTO_EXPR with the RET_LABEL as its target.  */
!   if (TREE_CODE (*tp) == RETURN_EXPR && id->ret_label)
      {
        tree return_stmt = *tp;
        tree goto_stmt;
--- 469,479 ----
      gcc_assert (DECL_EXTERNAL (*tp) || TREE_STATIC (*tp));
  #endif
  
!   /* If this is a RETURN_STMT, change it into an EXPR_STMT and a
!      GOTO_STMT with the RET_LABEL as its target.  */
!   if (TREE_CODE (*tp) == RETURN_EXPR 
!       && id->ret_label
!       && !id->versioning_p)
      {
        tree return_stmt = *tp;
        tree goto_stmt;
*************** copy_body_r (tree *tp, int *walk_subtree
*** 562,568 ****
  		}
  	    }
  	}
!       else if (TREE_CODE (*tp) == INDIRECT_REF)
  	{
  	  /* Get rid of *& from inline substitutions that can happen when a
  	     pointer argument is an ADDR_EXPR.  */
--- 573,580 ----
  		}
  	    }
  	}
!       else if (TREE_CODE (*tp) == INDIRECT_REF
!                && !id->versioning_p)
  	{
  	  /* Get rid of *& from inline substitutions that can happen when a
  	     pointer argument is an ADDR_EXPR.  */
*************** copy_body_r (tree *tp, int *walk_subtree
*** 602,613 ****
  	    }
  	  else
  	    {
!               struct cgraph_edge *edge
! 		= cgraph_edge (id->current_node, old_node);
! 
! 	      if (edge)
! 	        cgraph_clone_edge (edge, id->node, *tp);
! 	    }
  	}
  
        TREE_TYPE (*tp) = remap_type (TREE_TYPE (*tp), id);
--- 614,639 ----
  	    }
  	  else
  	    {
!               struct cgraph_edge *edge;
!               
!               if (!id->versioning_p)
!                 {
!                   edge = cgraph_edge (id->current_node, old_node);
!                   if (edge)
!                     cgraph_clone_edge (edge, id->node, *tp);
!                 }
!             }
!           if (id->versioning_p)
!             {
!               /* Update the call_expr on the edges from the new versions
!                  to it's callees. */
!               struct cgraph_edge *edge;
!               tree callee;
!               callee = get_callee_fndecl (*tp);
!               edge = cgraph_edge (id->node, old_node);
!               if (edge)
!                 edge->call_expr = *tp;
!             }
  	}
  
        TREE_TYPE (*tp) = remap_type (TREE_TYPE (*tp), id);
*************** declare_inline_vars (tree bind_expr, tre
*** 2470,2472 ****
--- 2496,2655 ----
  
    add_var_to_bind_expr (bind_expr, vars);
  }
+ 
+ 
+ /* Return a copy of the function argument tree.  */
+ static tree 
+ copy_arguments_for_versioning (tree orig_parm, inline_data *id)
+ {
+   tree *arg_copy, *parg;
+   
+   arg_copy = &orig_parm;
+   for (parg = arg_copy; *parg; parg = &TREE_CHAIN (*parg))
+     {
+       tree new = remap_decl (*parg, id);
+       lang_hooks.dup_lang_specific_decl (new);
+       TREE_CHAIN (new) = TREE_CHAIN (*parg);
+       *parg = new;
+     }
+   return orig_parm;
+ }
+ 
+ 
+ /* return a copy of the function tree body.  */
+ static tree
+ version_body (tree body, inline_data *id)
+ {
+   tree new_body = body;
+   
+   walk_tree (&new_body, copy_body_r, id, NULL);
+   return new_body;
+ }
+ 
+ 
+ /* Return a copy of the function static chain.  */
+ static tree
+ copy_static_chain (tree static_chain, inline_data *id)
+ {
+   tree *chain_copy, *pvar;
+   
+   chain_copy = &static_chain;
+   for (pvar = chain_copy; *pvar; pvar = &TREE_CHAIN (*pvar))
+     {
+       tree new = remap_decl (*pvar, id);
+       lang_hooks.dup_lang_specific_decl (new);
+       TREE_CHAIN (new) = TREE_CHAIN (*pvar);
+       *pvar = new;
+     }
+   return static_chain;
+ }
+ 
+ bool
+ tree_versionable_function_p (tree fndecl)
+ {
+   return function_versionable_p (fndecl);
+ }
+ 
+ /* Return true if the function can have a new version.
+    This is a guard for the versioning functionality.  */
+ static bool
+ function_versionable_p (tree fndecl)
+ {
+   if (fndecl == NULL_TREE)
+     return false;
+   /* ??? There are cases where a function is
+      uninlinable but can have a new versions.  */
+   if (!tree_inlinable_function_p (fndecl))  
+     return false;
+   /* ??? Remove the following gaurd.  */
+   if (lang_hooks.tree_versioning.cannot_version_tree_fn (&fndecl))
+     return false;
+ 
+   return true;
+ }
+ 
+ 
+ /* Copy the function tree. 
+    OLD_DECL and NEW_DECL are FUNCTION_DECL tree nodes 
+    of the original function and the new copied function
+    respectively.  This cloning supports function 
+    versioning utility.  */
+ void
+ tree_function_versioning (tree old_decl, tree new_decl) 
+ {
+   struct cgraph_node *old_version_node;
+   struct cgraph_node *new_version_node;
+   inline_data id;
+   tree p;
+   
+   if (TREE_CODE (old_decl) != FUNCTION_DECL
+       || TREE_CODE (new_decl) != FUNCTION_DECL)
+     abort ();
+  
+   old_version_node = cgraph_node (old_decl);
+   new_version_node = cgraph_node (new_decl);
+   
+   /* FIXME: For the moment debug information for the new
+      version is not supported.  */
+   DECL_ARTIFICIAL (new_decl) = 1;
+   DECL_IGNORED_P (new_decl) = 1;
+   
+   /* Generate a new name for the new version. */
+   DECL_NAME (new_decl) = 
+     create_function_name (IDENTIFIER_POINTER (DECL_NAME (old_decl)));
+   /* Create a new SYMBOL_REF rtx for the new name. */
+   if (DECL_RTL (old_decl) != NULL)
+     {
+       SET_DECL_RTL (new_decl, copy_rtx (DECL_RTL (old_decl)));
+       XEXP (DECL_RTL (new_decl), 0) =
+         gen_rtx_SYMBOL_REF (GET_MODE (XEXP (DECL_RTL (old_decl), 0)),
+                             IDENTIFIER_POINTER (DECL_NAME (new_decl)));
+     }
+   
+   /* Prepare the data-structures for
+      coping the tree of the new version. */
+   
+   memset (&id, 0, sizeof (id));
+   /* The new version. */
+   id.node = new_version_node;
+   /* The old version. */
+   id.current_node =  cgraph_node (old_decl);
+   id.versioning_p = true;
+   id.decl_map = splay_tree_new (splay_tree_compare_pointers, 
+                                 NULL, NULL);
+   id.tree_pruner = htab_create (37, htab_hash_pointer,
+ 				htab_eq_pointer, NULL);
+   
+   VARRAY_TREE_INIT (id.fns, 1, "fns");
+   VARRAY_PUSH_TREE (id.fns, new_decl);
+   VARRAY_PUSH_TREE (id.fns, old_decl);
+   
+   /* Copy the function's static chain. */
+   p = DECL_STRUCT_FUNCTION (old_decl)->static_chain_decl;
+   if (p)
+     {
+       DECL_STRUCT_FUNCTION (new_decl)->static_chain_decl =
+         copy_static_chain (DECL_STRUCT_FUNCTION (old_decl)->static_chain_decl, &id);
+     }  
+   
+   /* Copy the function's arguments. */
+   if (DECL_ARGUMENTS (old_decl) != NULL_TREE)
+     {
+       DECL_ARGUMENTS (new_decl) = 
+         copy_arguments_for_versioning (DECL_ARGUMENTS (old_decl), &id);
+     }
+   /* Copy the Function's body. */
+   DECL_SAVED_TREE (new_decl) =
+     version_body (DECL_SAVED_TREE (old_decl), &id);
+   if (DECL_RESULT (old_decl) != NULL_TREE)
+     {
+       tree *res_decl = &DECL_RESULT (old_decl);
+       DECL_RESULT (new_decl) = remap_decl (*res_decl, &id);
+       lang_hooks.dup_lang_specific_decl (DECL_RESULT (new_decl));
+     }
+   
+   /* Clean up.  */
+   htab_delete (id.tree_pruner);
+   splay_tree_delete (id.decl_map);
+   return;
+ }
Index: tree-inline.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/tree-inline.h,v
retrieving revision 1.4.2.6.4.1
diff -c -3 -p -r1.4.2.6.4.1 tree-inline.h
*** tree-inline.h	2 Aug 2004 23:12:02 -0000	1.4.2.6.4.1
--- tree-inline.h	5 Oct 2004 08:35:56 -0000
*************** void clone_body (tree, tree, void *);
*** 31,36 ****
--- 31,38 ----
  tree save_body (tree, tree *, tree *);
  void remap_save_expr (tree *, void *, int *);
  int estimate_num_insns (tree expr);
+ bool tree_versionable_function_p (tree);
+ void tree_function_versioning (tree, tree);
  
  /* 0 if we should not perform inlining.
     1 if we should expand functions calls inline at the tree level.
Index: cp/cp-objcp-common.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cp/cp-objcp-common.h,v
retrieving revision 1.2.8.3
diff -c -3 -p -r1.2.8.3 cp-objcp-common.h
*** cp/cp-objcp-common.h	25 Sep 2004 23:18:37 -0000	1.2.8.3
--- cp/cp-objcp-common.h	5 Oct 2004 08:35:57 -0000
*************** extern tree objcp_tsubst_copy_and_build 
*** 138,143 ****
--- 138,148 ----
  #undef LANG_HOOKS_CALLGRAPH_EXPAND_FUNCTION
  #define LANG_HOOKS_CALLGRAPH_EXPAND_FUNCTION expand_body
  
+ #undef LANG_HOOKS_TREE_VERSIONING_CANNOT_VERSION_TREE_FN
+ #define LANG_HOOKS_TREE_VERSIONING_CANNOT_VERSION_TREE_FN \
+    cp_cannot_version_tree_fn
+  
+ 
  #undef LANG_HOOKS_MAKE_TYPE
  #define LANG_HOOKS_MAKE_TYPE cxx_make_type
  #undef LANG_HOOKS_TYPE_FOR_MODE
Index: cp/cp-tree.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cp/cp-tree.h,v
retrieving revision 1.719.2.51.2.12
diff -c -3 -p -r1.719.2.51.2.12 cp-tree.h
*** cp/cp-tree.h	25 Sep 2004 23:18:37 -0000	1.719.2.51.2.12
--- cp/cp-tree.h	5 Oct 2004 08:35:58 -0000
*************** extern linkage_kind decl_linkage        
*** 4203,4208 ****
--- 4203,4209 ----
  extern tree cp_walk_subtrees (tree*, int*, walk_tree_fn,
  				      void*, void*);
  extern int cp_cannot_inline_tree_fn (tree*);
+ extern int cp_cannot_version_tree_fn (tree*);
  extern tree cp_add_pending_fn_decls (void*,tree);
  extern int cp_is_overload_p (tree);
  extern int cp_auto_var_in_fn_p (tree,tree);
Index: cp/tree.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cp/tree.c,v
retrieving revision 1.286.2.43.2.11
diff -c -3 -p -r1.286.2.43.2.11 tree.c
*** cp/tree.c	25 Sep 2004 23:18:41 -0000	1.286.2.43.2.11
--- cp/tree.c	5 Oct 2004 08:35:58 -0000
*************** cp_cannot_inline_tree_fn (tree* fnp)
*** 2057,2062 ****
--- 2057,2088 ----
    return 0;
  }
  
+ /* Decide whether there are language-specific reasons to not version a
+    function as a tree.  */
+ 
+ int 
+ cp_cannot_version_tree_fn (tree *fnp)
+ {
+   tree fn = *fnp;
+  
+   /* ??? Is there a way to version a
+      constructor?  */ 
+   if (DECL_COMPLETE_CONSTRUCTOR_P (fn))
+     return 1;
+   if (DECL_BASE_CONSTRUCTOR_P (fn))
+     return 1;
+   if (DECL_COMPLETE_DESTRUCTOR_P (fn))
+     return 1;
+   if (DECL_BASE_DESTRUCTOR_P (fn))
+     return 1;
+   if (DECL_OVERLOADED_OPERATOR_P (fn))
+     return 1;
+  
+   return 0;
+ }
+  
+ 
+ 
  /* Add any pending functions other than the current function (already
     handled by the caller), that thus cannot be inlined, to FNS_P, then
     return the latest function added to the array, PREV_FN.  */

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2004-10-14 16:57 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-10-11 14:34 [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION) Razya Ladelsky
2004-10-11 16:13 ` Steven Bosscher
2004-10-14 16:34   ` Razya Ladelsky
2004-10-14 17:00     ` Jan Hubicka
  -- strict thread matches above, loose matches on Subject: below --
2004-10-10 15:38 Razya Ladelsky
2004-10-10 17:10 ` Steven Bosscher
2004-10-10 21:25   ` Jan Hubicka
2004-10-12 14:18 ` Jan Hubicka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).