public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION)
@ 2004-10-10 15:38 Razya Ladelsky
  2004-10-10 17:10 ` Steven Bosscher
  2004-10-12 14:18 ` [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION) Jan Hubicka
  0 siblings, 2 replies; 13+ messages in thread
From: Razya Ladelsky @ 2004-10-10 15:38 UTC (permalink / raw)
  To: gcc-patches; +Cc: stevenb, hubicka, Mircea Namolaru, Ayal Zaks

[-- Attachment #1: Type: text/plain, Size: 608 bytes --]

Hello,

Attached is the patch modified according to the suggestions
we received.

This patch implements function cloning.
It also contains extension to IPCP (Interprocedural constant propagation) 
optimization http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00998.html for 
using function cloning.

Now in order to enable IPCP for a given application, we should use the 
command
gcc -fipa-cp -combine file1.c file2.c file3.c 
where file1.c, file2.c, file3.c are the files of the application.

Bootstrapped and regression tests successfully passed on POWER4.

Comments are welcome.

Thanks,
Razya and Revital.



[-- Attachment #2: ChangeLog5_10 --]
[-- Type: application/octet-stream, Size: 2098 bytes --]

2004-10-05  Razya Ladelsky  <razya@il.ibm.com>

	* ipa_prop.c: Update IPA constant propagation.
	* ipa_prop.h: Same.
	* cgraph.h:  Add interface for accessing existing data structures,
	Support for cloning.
	* Makefile.in (CGRAPH_H): Add dependency to varray.h.
        (ipa_prop.c): Change dependencies.
	* common.opt: Remove fipa-no-cloning flag.
	* cgraphunit.c (cgraph_optimize, cgraph_remove_unreachable_nodes):  
	Support for IPA constant propagation.
	(update_call_expr, cgraph_copy_node_for_versioning, 
	cgraph_function_versioning): New functions to support versioning.
	* cgraph.c: (cgraph_copy_node_callees, cgraph_copy_local_info): New
	functions to support versioning.
	* integrate.c: (copy_decl_for_versioning): New function. Copies decl tree
	node for the purpose of versioning.
	* integrate.h:   (copy_decl_for_versioning): New declaration.
        * langhooks-def.h: (lhd_tree_versioning_cannot_version_tree_fn): Declare.
        (LANG_HOOKS_TREE_VERSIONING_CANNOT_VERSION_TREE_FN): New definition.
	* langhooks.c: (lhd_tree_versioning_cannot_version_tree_fn):
	Check whether there are language-specific reasons for not
	creating a new version to a given function.
        * langhooks.h: (struct lang_hooks_for_tree_versioning,
	tree_versioning): New declaration.
        * gimplify.c: (create_function_name): New function.
	* tree-gimple.h: (create_function_name): New declaration.
	* tree-inline.c: (struct inline_data): New field to support versioning.
	(copy_arguments_for_versioning, version_body,
	copy_static_chain, function_versionable_p,
	tree_function_versioning): New functions to support tree
	function versioning.
	(remap_decl, copy_body_r):  Support function versioning.
	* tree-inline.h: (tree_versionable_function_p,
	tree_function_versioning): New declarations.
	* cp/cp-objcp-common.h: (LANG_HOOKS_TREE_VERSIONING_CA
	NNOT_VERSION_TREE_FN): New definition.
	* cp/cp-tree.h: (cp_cannot_version_tree_fn): New declaration.
	* cp/tree.c: (cp_cannot_version_tree_fn): Don't auto-version
	constructors/destructors and operators.

[-- Attachment #3: diff_5_10 --]
[-- Type: application/octet-stream, Size: 86534 bytes --]

Index: Makefile.in
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.903.2.179.2.18
diff -c -3 -p -r1.903.2.179.2.18 Makefile.in
*** Makefile.in	25 Sep 2004 23:17:22 -0000	1.903.2.179.2.18
--- Makefile.in	5 Oct 2004 08:35:53 -0000
*************** SCHED_INT_H = sched-int.h $(INSN_ATTR_H)
*** 703,709 ****
  INTEGRATE_H = integrate.h varray.h
  CFGLAYOUT_H = cfglayout.h $(BASIC_BLOCK_H)
  CFGLOOP_H = cfgloop.h $(BASIC_BLOCK_H) $(RTL_H)
! CGRAPH_H = cgraph.h bitmap.h tree.h $(HASHTAB_H)
  DF_H = df.h bitmap.h sbitmap.h $(BASIC_BLOCK_H)
  DDG_H = ddg.h sbitmap.h $(DF_H)
  GCC_H = gcc.h version.h
--- 703,709 ----
  INTEGRATE_H = integrate.h varray.h
  CFGLAYOUT_H = cfglayout.h $(BASIC_BLOCK_H)
  CFGLOOP_H = cfgloop.h $(BASIC_BLOCK_H) $(RTL_H)
! CGRAPH_H = cgraph.h bitmap.h tree.h varray.h $(HASHTAB_H)
  DF_H = df.h bitmap.h sbitmap.h $(BASIC_BLOCK_H)
  DDG_H = ddg.h sbitmap.h $(DF_H)
  GCC_H = gcc.h version.h
*************** simplify-rtx.o : simplify-rtx.c $(CONFIG
*** 1915,1923 ****
  cgraph.o : cgraph.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
     langhooks.h toplev.h $(FLAGS_H) $(GGC_H)  $(TARGET_H) $(CGRAPH_H) gt-cgraph.h \
     output.h intl.h
! ipa_prop.o : ipa_prop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
!    langhooks.h toplev.h $(FLAGS_H) $(GGC_H)  $(TARGET_H) cgraph.h gt-cgraph.h \
!    output.h intl.h ipa_prop.h tree.h tree-iterator.h tree-gimple.h tree-inline.h
  cgraphunit.o : cgraphunit.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
     langhooks.h tree-inline.h toplev.h $(FLAGS_H) $(GGC_H)  $(TARGET_H) $(CGRAPH_H) intl.h \
     function.h $(TREE_GIMPLE_H) $(TREE_FLOW_H) ipa_prop.h
--- 1915,1923 ----
  cgraph.o : cgraph.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
     langhooks.h toplev.h $(FLAGS_H) $(GGC_H)  $(TARGET_H) $(CGRAPH_H) gt-cgraph.h \
     output.h intl.h
! ipa_prop.o : ipa_prop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TREE_H) \
!    langhooks.h $(GGC_H) $(CGRAPH_H) output.h ipa_prop.h tree-iterator.h \
!    tree-inline.h  
  cgraphunit.o : cgraphunit.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
     langhooks.h tree-inline.h toplev.h $(FLAGS_H) $(GGC_H)  $(TARGET_H) $(CGRAPH_H) intl.h \
     function.h $(TREE_GIMPLE_H) $(TREE_FLOW_H) ipa_prop.h
Index: cgraph.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cgraph.c,v
retrieving revision 1.4.4.18.2.8
diff -c -3 -p -r1.4.4.18.2.8 cgraph.c
*** cgraph.c	25 Sep 2004 23:17:32 -0000	1.4.4.18.2.8
--- cgraph.c	5 Oct 2004 08:35:54 -0000
*************** cgraph_unnest_node (struct cgraph_node *
*** 706,709 ****
--- 706,737 ----
    *node2 = node->next_nested;
    node->origin = NULL;
  }
+ 
+ /* Copy collees of node FROM to node TO.  */
+ void
+ cgraph_copy_node_callees (struct cgraph_node *to,
+                           struct cgraph_node *from)
+ {
+   struct cgraph_edge *e;
+   
+   for (e = from->callees;e; e=e->next_callee)
+     cgraph_clone_edge (e, to, e->call_expr);
+   
+ }
+ 
+ 
+ void
+ cgraph_copy_local_info (struct cgraph_node *new_node, 
+ 			struct cgraph_node *orig_node)
+ {
+   new_node->local.self_insns = orig_node->local.self_insns;
+   new_node->local.local = orig_node->local.local;
+   new_node->local.finalized = orig_node->local.finalized;
+   new_node->local.disregard_inline_limits = 
+     orig_node->local.disregard_inline_limits;
+   new_node->local.inlinable = orig_node->local.inlinable;
+   new_node->local.redefined_extern_inline = 
+     orig_node->local.redefined_extern_inline;
+ }
+ 
  #include "gt-cgraph.h"
Index: cgraph.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cgraph.h,v
retrieving revision 1.1.4.16.2.7
diff -c -3 -p -r1.1.4.16.2.7 cgraph.h
*** cgraph.h	25 Sep 2004 23:17:32 -0000	1.1.4.16.2.7
--- cgraph.h	5 Oct 2004 08:35:54 -0000
*************** Software Foundation, 59 Temple Place - S
*** 24,29 ****
--- 24,30 ----
  #include "hashtab.h"
  #include "bitmap.h"
  #include "tree.h"
+ #include "varray.h"
  
  /* Information about the function collected locally.
     Available after function is analyzed.  */
*************** struct cgraph_varpool_node GTY(())
*** 232,237 ****
--- 233,245 ----
    bool output;
  };
  
+ /* Get first callee of node.  */
+ #define FIRST_CALLEE(node)               ((node)->callees)
+ /* Get next callee from edge.  */
+ #define NEXT_CALLEE(edge)                ((edge)->next_callee)
+ /* Returns whether edge is last callee.  */
+ #define CALLEE_NOT_LAST(edge)            ((edge) != NULL)
+ 
  extern GTY(()) struct cgraph_node *cgraph_nodes;
  extern GTY(()) int cgraph_n_nodes;
  extern GTY(()) int cgraph_max_uid;
*************** struct cgraph_rtl_info *cgraph_rtl_info 
*** 258,270 ****
  const char * cgraph_node_name (struct cgraph_node *);
  struct cgraph_edge * cgraph_clone_edge (struct cgraph_edge *, struct cgraph_node *, tree);
  struct cgraph_node * cgraph_clone_node (struct cgraph_node *);
! 
  struct cgraph_varpool_node *cgraph_varpool_node (tree decl);
  void cgraph_varpool_mark_needed_node (struct cgraph_varpool_node *);
  void cgraph_varpool_finalize_decl (tree);
  bool cgraph_varpool_assemble_pending_decls (void);
  void cgraph_redirect_edge_callee (struct cgraph_edge *, struct cgraph_node *);
! 
  bool cgraph_function_possibly_inlined_p (tree);
  void cgraph_unnest_node (struct cgraph_node *node);
  
--- 266,280 ----
  const char * cgraph_node_name (struct cgraph_node *);
  struct cgraph_edge * cgraph_clone_edge (struct cgraph_edge *, struct cgraph_node *, tree);
  struct cgraph_node * cgraph_clone_node (struct cgraph_node *);
! void cgraph_copy_node_callees (struct cgraph_node *,
!                                struct cgraph_node *);
  struct cgraph_varpool_node *cgraph_varpool_node (tree decl);
  void cgraph_varpool_mark_needed_node (struct cgraph_varpool_node *);
  void cgraph_varpool_finalize_decl (tree);
  bool cgraph_varpool_assemble_pending_decls (void);
  void cgraph_redirect_edge_callee (struct cgraph_edge *, struct cgraph_node *);
! void cgraph_copy_local_info (struct cgraph_node *,
!                              struct cgraph_node *);
  bool cgraph_function_possibly_inlined_p (tree);
  void cgraph_unnest_node (struct cgraph_node *node);
  
*************** void cgraph_build_static_cdtor (char whi
*** 286,291 ****
--- 296,308 ----
  void cgraph_reset_static_var_maps (void);
  bitmap get_global_statics_not_read (tree fn);
  bitmap get_global_statics_not_written(tree fn);
+ void update_call_expr (struct cgraph_node *, varray_type);
+ struct cgraph_node *cgraph_copy_node_for_versioning (struct cgraph_node *,
+                                                      tree, varray_type);
+ struct cgraph_node *cgraph_function_versioning (struct cgraph_node *,
+                                                 varray_type);
+ 
  void init_cgraph (void);
  
+ 
  #endif  /* GCC_CGRAPH_H  */
Index: cgraphunit.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cgraphunit.c,v
retrieving revision 1.1.4.35.2.16
diff -c -3 -p -r1.1.4.35.2.16 cgraphunit.c
*** cgraphunit.c	25 Sep 2004 23:17:32 -0000	1.1.4.35.2.16
--- cgraphunit.c	5 Oct 2004 08:35:54 -0000
*************** record_call_1 (tree *tp, int *walk_subtr
*** 670,677 ****
  		       visited_nodes);
  	    *walk_subtrees = 0;
  	  }
-         if (flag_ipa_cp && decl == NULL_TREE)
-           flag_ipa_cp = 0;
  	break;
        }
  
--- 670,675 ----
*************** cgraph_reduced_inorder (struct cgraph_no
*** 1270,1279 ****
  }
  
  /* Perform reachability analysis and reclaim all unreachable nodes.
!    This function also remove unneeded bodies of extern inline functions
!    and thus needs to be done only after inlining decisions has been made.  */
  static bool
! cgraph_remove_unreachable_nodes (void)
  {
    struct cgraph_node *first = (void *) 1;
    struct cgraph_node *node;
--- 1268,1278 ----
  }
  
  /* Perform reachability analysis and reclaim all unreachable nodes.
!    If BEFORE_INLINING_P is true this function is called before inlining
!    decisions has been made.  If BEFORE_INLINING_P is false this function also 
!    removes unneeded bodies of extern inline functions.  */
  static bool
! cgraph_remove_unreachable_nodes (bool before_inlining_p)
  {
    struct cgraph_node *first = (void *) 1;
    struct cgraph_node *node;
*************** cgraph_remove_unreachable_nodes (void)
*** 1291,1298 ****
  #endif
    for (node = cgraph_nodes; node; node = node->next)
      if (node->needed && !node->global.inlined_to
! 	&& (!DECL_EXTERNAL (node->decl) || !node->analyzed))
        {
  	node->aux = first;
  	first = node;
        }
--- 1290,1300 ----
  #endif
    for (node = cgraph_nodes; node; node = node->next)
      if (node->needed && !node->global.inlined_to
! 	&& ((!DECL_EXTERNAL (node->decl)) 
!             || !node->analyzed
!             || before_inlining_p))
        {
+         node->reachable = true;
  	node->aux = first;
  	first = node;
        }
*************** cgraph_remove_unreachable_nodes (void)
*** 1312,1320 ****
  	if (!e->callee->aux
  	    && node->analyzed
  	    && (!e->inline_failed || !e->callee->analyzed
! 		|| !DECL_EXTERNAL (e->callee->decl)))
! 	  {
  	    e->callee->aux = first;
  	    first = e->callee;
  	  }
      }
--- 1314,1324 ----
  	if (!e->callee->aux
  	    && node->analyzed
  	    && (!e->inline_failed || !e->callee->analyzed
! 		|| (!DECL_EXTERNAL (e->callee->decl))
!                 || before_inlining_p))
!         {
  	    e->callee->aux = first;
+             e->callee->reachable = true;
  	    first = e->callee;
  	  }
      }
*************** cgraph_remove_unreachable_nodes (void)
*** 1341,1347 ****
  	    local_insns = 0;
  	  if (cgraph_dump_file)
  	    fprintf (cgraph_dump_file, " %s", cgraph_node_name (node));
! 	  if (!node->analyzed || !DECL_EXTERNAL (node->decl))
  	    cgraph_remove_node (node);
  	  else
  	    {
--- 1345,1352 ----
  	    local_insns = 0;
  	  if (cgraph_dump_file)
  	    fprintf (cgraph_dump_file, " %s", cgraph_node_name (node));
! 	  if (!node->analyzed || !DECL_EXTERNAL (node->decl)
!               || before_inlining_p)
  	    cgraph_remove_node (node);
  	  else
  	    {
*************** cgraph_decide_inlining (void)
*** 1960,1966 ****
    /* We will never output extern functions we didn't inline. 
       ??? Perhaps we can prevent accounting of growth of external
       inline functions.  */
!   cgraph_remove_unreachable_nodes ();
  
    if (cgraph_dump_file)
      fprintf (cgraph_dump_file,
--- 1965,1971 ----
    /* We will never output extern functions we didn't inline. 
       ??? Perhaps we can prevent accounting of growth of external
       inline functions.  */
!   cgraph_remove_unreachable_nodes (false);
  
    if (cgraph_dump_file)
      fprintf (cgraph_dump_file,
*************** cgraph_optimize (void)
*** 2785,2792 ****
  #endif
    if (!flag_unit_at_a_time)
      return;
!   if (flag_ipa_cp && flag_ipa_no_cloning)
      ipcp_driver ();
    timevar_push (TV_CGRAPHOPT);
    if (!quiet_flag)
      fprintf (stderr, "Performing intraprocedural optimizations\n");
--- 2790,2801 ----
  #endif
    if (!flag_unit_at_a_time)
      return;
!   if (flag_ipa_cp)
!   {
      ipcp_driver ();
+     /* Clean the callgraph.  */
+     cgraph_remove_unreachable_nodes (true);
+   }
    timevar_push (TV_CGRAPHOPT);
    if (!quiet_flag)
      fprintf (stderr, "Performing intraprocedural optimizations\n");
*************** init_cgraph (void)
*** 2932,2935 ****
--- 2941,3083 ----
    cgraph_dump_file = dump_begin (TDI_cgraph, NULL);
    memory_identifier = get_identifier("memory");
  }
+ 
+  
+ /* Update the CALL_EXPR in NEW_VERSION node callers edges.  */
+ 
+ void
+ update_call_expr (struct cgraph_node *new_version, 
+  		  varray_type redirect_callers)
+ {
+   struct cgraph_edge *e;
+   unsigned i;
+   
+   if (new_version == NULL)
+     abort ();
+   
+   for (i = 0; i < VARRAY_ACTIVE_SIZE (redirect_callers); i++)
+     {
+       e = VARRAY_GENERIC_PTR (redirect_callers, i);
+       
+       /* Update the call expr on the edges
+          to the new version. */ 
+       TREE_OPERAND (TREE_OPERAND (e->call_expr, 0), 0) = new_version->decl;
+     }
+   for (e = new_version->callers; e; e = e->next_caller)
+     { 
+       /* Update the call expr on the edges
+          of recursive calls. */
+       if (e->caller == new_version)
+         TREE_OPERAND (TREE_OPERAND (e->call_expr, 0), 0) = new_version->decl;            
+     }           
+ }
+ 
+ 
+ /* Create a new cgraph node which is the new version of 
+    OLD_VERSION node.  REDIRECT_CALLERS holds the callers
+    of OLD_VERSION which should be redirected to point to  
+    NEW_VERSION.  ALL the callees edges of OLD_VERSION 
+    are cloned to the new version node.  Return the new 
+    version node.  */
+ 
+ struct cgraph_node *
+ cgraph_copy_node_for_versioning (struct cgraph_node *old_version,
+  				 tree new_decl, varray_type redirect_callers)
+ {
+   struct cgraph_node *new_version;
+   struct cgraph_edge *e;
+   struct cgraph_edge *next_callee;
+   unsigned i;
+   
+   if (old_version == NULL)
+     abort ();
+   
+   new_version = cgraph_node (new_decl);
+   
+   new_version->needed = false; 
+   new_version->decl = new_decl;
+   memset (&new_version->local, 0, sizeof (new_version->local));
+   cgraph_copy_local_info (new_version,
+  			  old_version);
+   memset (&new_version->global, 0, sizeof (new_version->global));
+   new_version->global.insns = old_version->global.insns;
+   memset (&new_version->rtl, 0, sizeof (new_version->rtl));
+   new_version->analyzed = true;
+   
+   /* Clone the old node callees.  Recursive calls are
+      also cloned.  */ 
+   cgraph_copy_node_callees (new_version, old_version);
+   
+   
+   /* Fix recursive calls. 
+      If old_version has a recursive call after the 
+      previous cloning the new version will have an edge
+      pointing to the old version which is wrong;
+      so redirect it to point to the new version. */
+   for (e = new_version->callees ; e; e = next_callee)
+     {
+       next_callee = e->next_callee;
+       if (e->callee == old_version)
+         {
+           cgraph_redirect_edge_callee (e, new_version);
+         }
+        if (!next_callee)
+          break;
+     }
+   for (i = 0; i < VARRAY_ACTIVE_SIZE (redirect_callers); i++) 
+     {
+       e = VARRAY_GENERIC_PTR (redirect_callers, i); 
+       /* Redirect calls to the old version node
+  	 to point to it's new version.  */ 
+       cgraph_redirect_edge_callee (e, new_version);
+     }
+   
+   allocate_struct_function (new_decl);
+   cfun->function_end_locus = DECL_SOURCE_LOCATION (new_decl);
+   
+   return new_version;
+ }
+ 
+ /* Perform function versioning. 
+    Function versioning includes:
+    1) Generating a new cgraph node for the new version
+    and redirect it's edges respectively. 
+    2) Copying the old version tree to the new
+    version.  
+    The function gets REDIRECT_CALLERS varray, the 
+    edges to be redirected to the new version and
+    the old version's cgraph node.  It returns the new version's 
+    cgraph node.  */ 
+ 
+ struct cgraph_node *
+ cgraph_function_versioning (struct cgraph_node *old_version_node, 
+                             varray_type redirect_callers)
+ {
+   tree old_decl = old_version_node->decl;
+   struct cgraph_node *new_version_node = NULL;
+   tree new_decl;
+   
+   if (!tree_versionable_function_p (old_decl))
+     return NULL;
+   
+   /* Make a new FUNCTION_DECL tree node for the
+      new version. */
+   new_decl = copy_node (old_decl);
+   
+   /* Create the new version's call-graph node. 
+      and update the edges of the new node. */
+   new_version_node = 
+     cgraph_copy_node_for_versioning (old_version_node, new_decl,
+                                      redirect_callers);  
+   
+   /* Copy the old version's function tree to the new
+      version.  */
+   tree_function_versioning (old_decl, new_decl);
+   /* Update the call_expr on the edges
+      to the new version node. */
+   update_call_expr (new_version_node, redirect_callers);
+   return new_version_node;
+ }
+ 
+ 
  #include "gt-cgraphunit.h"
Index: common.opt
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/common.opt,v
retrieving revision 1.14.2.20.2.13
diff -c -3 -p -r1.14.2.20.2.13 common.opt
*** common.opt	18 Sep 2004 22:46:58 -0000	1.14.2.20.2.13
--- common.opt	5 Oct 2004 08:35:54 -0000
*************** fipa-cp
*** 888,897 ****
  Common Report Var(flag_ipa_cp)
  Perform IPA constant propagation
  
- fipa-no-cloning
- Common Report Var(flag_ipa_no_cloning)
- Perform IPA cp, without cloning of methods
- 
  funroll-loops
  Common Report Var(flag_unroll_loops)
  Perform loop unrolling when iteration count is known
--- 888,893 ----
Index: gimplify.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/gimplify.c,v
retrieving revision 1.1.2.141.2.13
diff -c -3 -p -r1.1.2.141.2.13 gimplify.c
*** gimplify.c	25 Sep 2004 23:17:49 -0000	1.1.2.141.2.13
--- gimplify.c	5 Oct 2004 08:35:55 -0000
*************** create_tmp_var_name (const char *prefix)
*** 315,320 ****
--- 315,344 ----
    return get_identifier (tmp_name);
  }
  
+ /* Create a new function name with PREFIX.  Returns an identifier.  */
+ tree
+ create_function_name (const char *prefix)
+ {
+   char *tmp_name;
+   unsigned i;
+   
+   if (prefix)
+     {
+       char *preftmp = ASTRDUP (prefix);
+ 
+       remove_suffix (preftmp, strlen (preftmp));
+       prefix = preftmp;
+     }
+   ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix ? prefix : "F", tmp_var_id_num++);
+   for (i=0; i < strlen (tmp_name); i++)
+   {
+      if (tmp_name[i] == '.')
+            tmp_name[i] = '_';
+   }
+ 
+   return get_identifier (tmp_name);
+ }
+ 
  
  /* Create a new temporary variable declaration of type TYPE.
     Does NOT push it into the current binding.  */
Index: integrate.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/integrate.c,v
retrieving revision 1.197.2.31.2.7
diff -c -3 -p -r1.197.2.31.2.7 integrate.c
*** integrate.c	9 Sep 2004 10:55:04 -0000	1.197.2.31.2.7
--- integrate.c	5 Oct 2004 08:35:55 -0000
*************** copy_decl_for_inlining (tree decl, tree 
*** 176,181 ****
--- 176,248 ----
  
    return copy;
  }
+ 
+ /* Copy NODE (which must be a DECL).  The DECL originally was in the FROM_FN,
+    but now it will be in the TO_FN. As apposed for to copy_decl_for_inlining ()
+    function; we do not give a special treatment to parm_decl and result_decl.  */ 
+ tree
+ copy_decl_for_versioning (tree decl, tree from_fn, tree to_fn)
+ {
+   tree copy;
+ 
+   /* Copy the declaration.  */
+   copy = copy_node (decl);
+       
+   /* The COPY is not abstract; it will be generated in TO_FN.  */
+   DECL_ABSTRACT (copy) = 0;
+   lang_hooks.dup_lang_specific_decl (copy);
+       
+   /* TREE_ADDRESSABLE isn't used to indicate that a label's
+      address has been taken; it's for internal bookkeeping in
+      expand_goto_internal.  */
+   if (TREE_CODE (copy) == LABEL_DECL)
+     {
+       TREE_ADDRESSABLE (copy) = 0;
+     }
+   
+   /* FIXME:  For the moment debug information for the copy
+      is not supported.  */ 
+   DECL_ARTIFICIAL (copy) = 1;
+   DECL_IGNORED_P (copy) = 1;
+   /* DECL_ABSTRACT_ORIGIN (copy) = DECL_ORIGIN (decl);  */
+   
+   if (TREE_STATIC (copy))
+     {
+       /* Set the DECL_ABSTRACT_ORIGIN of copy to point to the 
+          old version decl in case it is static so it can 
+          be retrieved when needed.  */
+       DECL_ABSTRACT_ORIGIN (copy) = decl;
+       /* FIXME: Again - debug information is not supported yet.  */
+       DECL_IGNORED_P (decl) = 1;
+     }
+    
+   /* The new variable/label has no RTL, yet.  */
+   if (!TREE_STATIC (copy) && !DECL_EXTERNAL (copy))
+     SET_DECL_RTL (copy, NULL_RTX);
+  
+   /* These args would always appear unused, if not for this.  */
+   TREE_USED (copy) = 1;
+   
+   /* Set the context for the new declaration.  */
+   if (!DECL_CONTEXT (decl))
+     /* Globals stay global.  */
+     ;
+   else if (TREE_STATIC (decl))
+     /* Function-scoped static variables should stay in the original
+        function.  */
+     ;
+   else if (DECL_CONTEXT (decl) != from_fn)
+     /* Things that weren't in the scope of the function we're inlining
+        from aren't in the scope we're inlining to, either.  */
+     ;
+   else
+     /* Ordinary automatic local variables are now in the scope of the
+        new function.  */
+     DECL_CONTEXT (copy) = to_fn;
+   
+   return copy;
+ }
+ 
  \f
  /* Unfortunately, we need a global copy of const_equiv map for communication
     with a function called from note_stores.  Be *very* careful that this
Index: integrate.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/integrate.h,v
retrieving revision 1.25.2.2.6.1
diff -c -3 -p -r1.25.2.2.6.1 integrate.h
*** integrate.h	22 Feb 2004 15:35:47 -0000	1.25.2.2.6.1
--- integrate.h	5 Oct 2004 08:35:55 -0000
*************** extern void allocate_initial_values (rtx
*** 136,142 ****
  /* Copy a declaration when one function is substituted inline into
     another.  */
  extern tree copy_decl_for_inlining (tree, tree, tree);
! 
  /* Check whether there's any attribute in a function declaration that
     makes the function uninlinable.  Returns false if it finds any,
     true otherwise.  */
--- 136,142 ----
  /* Copy a declaration when one function is substituted inline into
     another.  */
  extern tree copy_decl_for_inlining (tree, tree, tree);
! extern tree copy_decl_for_versioning (tree, tree, tree);
  /* Check whether there's any attribute in a function declaration that
     makes the function uninlinable.  Returns false if it finds any,
     true otherwise.  */
Index: ipa_prop.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Attic/ipa_prop.c,v
retrieving revision 1.1.2.3
diff -c -3 -p -r1.1.2.3 ipa_prop.c
*** ipa_prop.c	12 Sep 2004 16:39:17 -0000	1.1.2.3
--- ipa_prop.c	5 Oct 2004 08:35:56 -0000
***************
*** 1,6 ****
! /* Callgraph based intraprocedural optimizations.
!    Copyright (C) 2003, 2004 Free Software Foundation, Inc.
!    Contributed by Razya Ladelsky.
  
  This file is part of GCC.
  
--- 1,6 ----
! /* Interprocedural constant propagation
!    Copyright (C) 2004 Free Software Foundation, Inc.
!    Contributed by Razya Ladelsky <RAZYA@il.ibm.com>
  
  This file is part of GCC.
  
*************** Software Foundation, 59 Temple Place - S
*** 20,27 ****
  02111-1307, USA.  */
  
  /* Interprocedural constant propagation.
! The aim of IPA constant propagation is to find which function's argument
! has the same constant value in each invocation throughout the whole program.
  For example, for an application consisting of two files, foo1.c, foo2.c:
  
  foo1.c contains :
--- 20,27 ----
  02111-1307, USA.  */
  
  /* Interprocedural constant propagation.
! The aim of interprocedural constant propagation (IPCP) is to find which function's 
! argument has the same constant value in each invocation throughout the whole program.
  For example, for an application consisting of two files, foo1.c, foo2.c:
  
  foo1.c contains :
*************** int g (int y)
*** 47,142 ****
    printf ("value is %d",y);
  }
  
! The IPCP algorithm should find that g's formal argument y
  is always called with the value 3. 
  
  The optimization is divided into three stages:
  
  First stage - intraprocedural analysis  
  =======================================
  This phase computes jump_function and modify information. 
  
! Jump functions represent the value that is passes to each actual argument 
! by this callsite.
! there are three values :
!   Formal - the caller's formal parameter is passed as a parameter.
!   Constant - a constant is passed as a parameter.
    Unknown - neither of the above.
  
! Modify is true for a formal parameter if it is modified in its function.
! 
! In the case that the formal parameter passed in the callsite is 
! modified, the jump fumction value for this parameter turns to UNKOWN.
  
! The jump function structure, ipcp_jump_func, is defined in ipa_edge
  structure (defined in ipa_prop.h and pointed to by cgraph_node->aux)
  The modify info, ipcp_modify, is defined in ipa_node structure
  (defined in ipa_prop.h and pointed to by cgraph_edge->aux).
  
! The ipcp_tree_map (a mapping from the PARAM_DECL tree to the inner
! representation of formals) and ipcp_formal (explained below) structures
! are also pointed to by cgraph_node->aux and also initialized 
! in this phase.
! 
! -ipcp_stage1() is the first stage driver.
  
  Second stage - interprocedural analysis
  ========================================
! This phase solves the interprocedural constant propagation problem.
! The algorithm is based on a paper by Challahan David, 
! Keith D Cooper, Ken Kennedy, Linda Torczon, 
! Interprocedural Constant Propagation, Comp86.
! 
! Each formal has cval info (denoted as ipcp_formal) which represents its 
! value computed so far. 
! possible values : 
!   TOP - not known yet. 
!   BOTTOM - considered as not constant. 
!   CONSTANT_TYPE - has constant value.
  
- The algoritm computes for all formal parameters in the program 
- their cval value.
  Cval of formal f will have a constant value if all callsites to this 
! function have the same constant value passed to f 
! and will have BOTTOM value otherwise.
  
! This stage propagates the constant information from formal paranmeters
! to the actual parameters.
  
! -ipcp_stage2() is the second stage driver.
  
! Third phase 
! ============
  Propagates the constant-valued formals into the function.
  
! -ipcp_stage3() is the third phase driver.
     
  */
  
  #include "config.h"
  #include "system.h"
! #include "coretypes.h"
! #include "tm.h"
  #include "tree.h"
  #include "langhooks.h"
- #include "hashtab.h"
- #include "toplev.h"
- #include "flags.h"
  #include "ggc.h"
- #include "debug.h"
- #include "target.h"  
  #include "cgraph.h"
- #include "varray.h"
  #include "output.h"  
  #include "tree-iterator.h"
- #include "tree-gimple.h"
  #include "tree-inline.h"
  #include "ipa_prop.h"
  
  typedef struct cgraph_node *ipcp_method ;
  typedef  struct cgraph_edge *ipcp_callsite;
  
! struct ipcp_methodlist
  {
    ipcp_method method_p;
    struct ipcp_methodlist *next_method;
--- 47,144 ----
    printf ("value is %d",y);
  }
  
! The IPCP algorithm will find that g's formal argument y
  is always called with the value 3. 
  
+ The algorithm used is based on "Interprocedural Constant Propagation",
+ by Challahan David, Keith D Cooper, Ken Kennedy, Linda Torczon, Comp86, pg 152-161
+ 
  The optimization is divided into three stages:
  
  First stage - intraprocedural analysis  
  =======================================
  This phase computes jump_function and modify information. 
  
! A jump function for a callsite represents the values passed as actual arguments 
! of the callsite. There are three types of values :
!   Formal - the caller's formal parameter is passed as an actual argument.
!   Constant - a constant is passed as a an actual argument.
    Unknown - neither of the above.
  
! In order to compute the jump functions, we need the modify information for the formal
! parameters of methods.
  
! The jump function info, ipcp_jump_func, is defined in ipa_edge
  structure (defined in ipa_prop.h and pointed to by cgraph_node->aux)
  The modify info, ipcp_modify, is defined in ipa_node structure
  (defined in ipa_prop.h and pointed to by cgraph_edge->aux).
  
! -ipcp_init_stage() is the first stage driver.
  
  Second stage - interprocedural analysis
  ========================================
! This phase does the interprocedural constant propagation.
! It computes for all formal parameters in the program 
! their cval value that may be:
!   TOP - unknown. 
!   BOTTOM - non constant. 
!   CONSTANT_TYPE - constant value.
  
  Cval of formal f will have a constant value if all callsites to this 
! function have the same constant value passed to f. 
  
! The cval info, ipcp_formal, is defined in ipa_node structure
! (defined in ipa_prop.h and pointed to by cgraph_edge->aux). 
  
! -ipcp_iterate_stage() is the second stage driver.
  
! Third phase - transformation of methods code
! ============================================
  Propagates the constant-valued formals into the function.
+ For each method mt, whose parameters are consts, we create a clone.
+ 
+ We insert an assignment statement 'parameter = const' at the beginning 
+ of the cloned method.
  
! We also need to modify some callsites to call to the cloned methods instead 
! of the original ones. For a callsite passing an argument found to be a constant 
! by IPCP, there are two different cases to handle:
! 1. A constant is passed as an argument.
! 2. A parameter (of the caller) passed as an argument (pass through argument).
! 
! In the first case, the callsite in the original caller should be redirected 
! to call the cloned callee.
! In the second case, both the caller and the callee have clones
! and the callsite of the cloned caller would be redirected to call to 
! the cloned callee.
! 
! The callgraph is updated accordingly.
! 
! This update is done in two stages: 
! First all cloned methods are created during a traversal of the callgraph, 
! during which all callsites are redirected to call the cloned method.
! Then the callsites are traversed and updated as described above.
! 
! -ipcp_insert_stage() is the third phase driver.
     
  */
  
  #include "config.h"
  #include "system.h"
! #include "coretypes.h" 
  #include "tree.h"
  #include "langhooks.h"
  #include "ggc.h"
  #include "cgraph.h"
  #include "output.h"  
  #include "tree-iterator.h"
  #include "tree-inline.h"
  #include "ipa_prop.h"
  
  typedef struct cgraph_node *ipcp_method ;
  typedef  struct cgraph_edge *ipcp_callsite;
  
! struct ipcp_methodlist GTY(())
  {
    ipcp_method method_p;
    struct ipcp_methodlist *next_method;
*************** struct ipcp_methodlist
*** 144,150 ****
  
  typedef struct ipcp_methodlist *ipcp_methodlist_p;
  
! /* WorlList Interface.  */
  static inline ipcp_methodlist_p ipcp_create_methodlist_node (void); 
  static inline bool ipcp_methodlist_not_empty (ipcp_methodlist_p);
  static inline ipcp_method ipcp_methodlist_method (ipcp_methodlist_p); 
--- 146,152 ----
  
  typedef struct ipcp_methodlist *ipcp_methodlist_p;
  
! /* ipcp_worlList Interface.  */
  static inline ipcp_methodlist_p ipcp_create_methodlist_node (void); 
  static inline bool ipcp_methodlist_not_empty (ipcp_methodlist_p);
  static inline ipcp_method ipcp_methodlist_method (ipcp_methodlist_p); 
*************** static void ipcp_callsite_compute_param 
*** 172,177 ****
--- 174,183 ----
  static void ipcp_callsite_compute_count (ipcp_callsite);
  
  /* ipcp_method interface.  */
+ static inline void ipa_node_create (ipcp_method);
+ static inline bool ipcp_method_is_cloned (ipcp_method);
+ static inline void ipcp_method_set_orig_node (ipcp_method, ipcp_method);
+ static inline ipcp_method ipcp_method_orig_node (ipcp_method);
  static inline void ipcp_formal_create (ipcp_method); 
  static inline int ipcp_method_formal_count (ipcp_method);
  static inline void ipcp_method_formal_count_set (ipcp_method, int); 
*************** static inline tree ipcp_method_get_tree 
*** 184,192 ****
  static inline void ipcp_method_tree_map_create (ipcp_method);
  static inline void ipcp_method_modify_create (ipcp_method); 
  static inline void ipcp_method_modify_set (ipcp_method, int, bool);
! static inline ipcp_callsite ipcp_cs_get_first (ipcp_method);
! static inline ipcp_callsite ipcp_cs_get_next (ipcp_callsite);
! static inline bool ipcp_cs_not_last (ipcp_callsite); 
  static int  ipcp_method_tree_map (ipcp_method, tree);
  static void ipcp_method_compute_tree_map (ipcp_method);
  static void ipcp_cval_meet (struct ipcp_formal *, struct ipcp_formal *,
--- 190,196 ----
  static inline void ipcp_method_tree_map_create (ipcp_method);
  static inline void ipcp_method_modify_create (ipcp_method); 
  static inline void ipcp_method_modify_set (ipcp_method, int, bool);
! static void ipa_cloned_create (ipcp_method orig_node, ipcp_method new_node);
  static int  ipcp_method_tree_map (ipcp_method, tree);
  static void ipcp_method_compute_tree_map (ipcp_method);
  static void ipcp_cval_meet (struct ipcp_formal *, struct ipcp_formal *,
*************** static inline ipcp_int *ipcp_cval_get_cv
*** 216,228 ****
  static inline enum Cvalue_type ipcp_cval_get_cvalue_type (struct ipcp_formal *);
  static inline bool ipcp_cval_equal_cvalues (ipcp_int *, ipcp_int *);
  
! static void ipcp_stage1 (void);
! static void ipcp_stage2 (void);
! static void ipcp_stage3 (void);
  static void ipcp_free (void);
  static void ipa_nodes_create (void);
  
! static bool ipcp_method_dont_propagate (ipcp_method);
  static tree ipcp_asm_walk_tree (tree *, int *, void *);
  static bool ipcp_method_contains_asm (ipcp_method);
  static void ipcp_walk_statements (ipcp_method, tree *);
--- 220,235 ----
  static inline enum Cvalue_type ipcp_cval_get_cvalue_type (struct ipcp_formal *);
  static inline bool ipcp_cval_equal_cvalues (ipcp_int *, ipcp_int *);
  
! static void ipcp_init_stage (void);
! static void ipcp_iterate_stage (void);
! static void ipcp_insert_stage (void);
! static void ipcp_update_callgraph (void);
! static bool ipcp_redirect (ipcp_callsite);
  static void ipcp_free (void);
  static void ipa_nodes_create (void);
+ static void ipa_edges_create (void);
  
! static bool ipcp_method_dont_insert_const (ipcp_method);
  static tree ipcp_asm_walk_tree (tree *, int *, void *);
  static bool ipcp_method_contains_asm (ipcp_method);
  static void ipcp_walk_statements (ipcp_method, tree *);
*************** static void ipcp_callsite_param_print (F
*** 238,273 ****
--- 245,286 ----
  
  /* WorlList Interface.  */
  
+ /* Creates the worklist for ipcp_iterate_stage().  */
  static inline ipcp_methodlist_p
  ipcp_create_methodlist_node (void) 
  {
    return ggc_alloc (sizeof (struct ipcp_methodlist));
  }
  
+ /* Checks that worklist is not empty.  */
  static inline bool
  ipcp_methodlist_not_empty (ipcp_methodlist_p wl)
  {
    return (wl != NULL);
  }
  
+ /* Gets method from worklist's node.  */
  static inline ipcp_method
  ipcp_methodlist_method (ipcp_methodlist_p wl) 
  {
    return wl->method_p;   
  }
  
+ /* Sets worklist's node to point to mt.  */
  static inline void
  ipcp_methodlist_method_set (ipcp_methodlist_p wl, ipcp_method mt)
  {
    wl->method_p = mt;
  }
  
+ /* Gets next worklist's node.  */
  static inline ipcp_methodlist_p
  ipcp_methodlist_next_method (ipcp_methodlist_p wl) 
  { 
    return wl->next_method;    
  }
  
+ /* Sets worklist node wl1 to point to  worklist node wl2.  */
  static inline void
  ipcp_methodlist_next_method_set (ipcp_methodlist_p wl1, ipcp_methodlist_p wl2) 
  { 
*************** ipcp_remove_method (ipcp_methodlist_p *w
*** 312,349 ****
    return ipcp_methodlist_method (first);
  }
  
! /* callsite interface.  */
  static inline int 
  ipcp_callsite_param_count (ipcp_callsite cs) 
  { 
    return ((struct ipa_edge *)cs->aux)->ipcp_param_num;
  }
  
  static inline void 
  ipcp_callsite_param_count_set (ipcp_callsite cs, int i) 
  {
    ((struct ipa_edge *)cs->aux)->ipcp_param_num = i;
  }
  
  static inline struct ipcp_jump_func *
  ipcp_callsite_param (ipcp_callsite cs, int i) 
  {
    return &(((struct ipa_edge *)cs->aux)->ipcp_param_map[i]);
  }
  
  static inline ipcp_method 
  ipcp_callsite_callee (ipcp_callsite cs)
  {
    return cs->callee;
  }
  
! 
  static inline void
  ipcp_callsite_param_set_type (ipcp_callsite cs, int i,  enum Jfunc_type type1) 
  {
    ((struct ipa_edge *)cs->aux)->ipcp_param_map[i].type = type1;
  }
  
  static inline void
  ipcp_callsite_param_set_info_type (ipcp_callsite cs, int i, 
  				   ipcp_int *info_type1) 
--- 325,368 ----
    return ipcp_methodlist_method (first);
  }
  
! /* ipcp_callsite interface.  */
! 
! /* Gets how many arguments the callsite has.  */
  static inline int 
  ipcp_callsite_param_count (ipcp_callsite cs) 
  { 
    return ((struct ipa_edge *)cs->aux)->ipcp_param_num;
  }
  
+ /* Sets how many arguments the callsite has.  */
  static inline void 
  ipcp_callsite_param_count_set (ipcp_callsite cs, int i) 
  {
    ((struct ipa_edge *)cs->aux)->ipcp_param_num = i;
  }
  
+ /* Gets the jump function for argument i of callsite cs.  */
  static inline struct ipcp_jump_func *
  ipcp_callsite_param (ipcp_callsite cs, int i) 
  {
    return &(((struct ipa_edge *)cs->aux)->ipcp_param_map[i]);
  }
  
+ /* Gets the callee of callsite cs.  */
  static inline ipcp_method 
  ipcp_callsite_callee (ipcp_callsite cs)
  {
    return cs->callee;
  }
  
! /* Sets the jump function's type for argument i of callsite cs.  */
  static inline void
  ipcp_callsite_param_set_type (ipcp_callsite cs, int i,  enum Jfunc_type type1) 
  {
    ((struct ipa_edge *)cs->aux)->ipcp_param_map[i].type = type1;
  }
  
+ /* Sets the jump function's info type for argument i of callsite cs.  */
  static inline void
  ipcp_callsite_param_set_info_type (ipcp_callsite cs, int i, 
  				   ipcp_int *info_type1) 
*************** ipcp_callsite_param_set_info_type (ipcp_
*** 351,356 ****
--- 370,376 ----
    ipcp_jf_set_info_type (ipcp_callsite_param (cs, i), info_type1);
  }
  
+ /* Allocates space for the jump functions.  */
  static inline void
  ipcp_callsite_param_map_create (ipcp_callsite cs) 
  {
*************** ipcp_callsite_param_map_create (ipcp_cal
*** 358,369 ****
--- 378,391 ----
      xcalloc (ipcp_callsite_param_count (cs), sizeof (struct ipcp_jump_func));   
  }
  
+ /* Returns the call expr tree realted to callsite cs.  */
  static inline tree
  ipcp_callsite_tree (ipcp_callsite cs) 
  {
    return cs->call_expr;
  }
  
+ /* Gets the caller of callsite cs.  */
  static inline ipcp_method 
  ipcp_callsite_caller (ipcp_callsite cs) 
  {
*************** ipcp_callsite_compute_param (ipcp_callsi
*** 445,450 ****
--- 467,501 ----
  
  /* ipcp_method interface.  */
  
+ /* Allocate and initialize ipa_node structure.  */
+ static inline void
+ ipa_node_create (ipcp_method node)
+ {
+    node->aux = xcalloc (1, sizeof (struct ipa_node));
+ }
+ 
+ /* Returns true if node is a cloned/versioned node.  */
+ static inline bool
+ ipcp_method_is_cloned (ipcp_method node)
+ {
+   return (ipcp_method_orig_node (node) != NULL);
+ }
+ 
+ /* Get orig node of method mt.  */
+ static inline ipcp_method
+ ipcp_method_orig_node (ipcp_method mt)
+ {
+   return ((struct ipa_node *)mt->aux)->ipcp_orig_node;
+ }
+ 
+ /* Sets orig node of method node.  */
+ static inline void
+ ipcp_method_set_orig_node (ipcp_method node, ipcp_method orig_node)
+ {
+   ((struct ipa_node *)node->aux)->ipcp_orig_node = orig_node;
+ }
+ 
+ /* Create cval structure for method mt.  */
  static inline void
  ipcp_formal_create (ipcp_method mt) 
  {
*************** ipcp_formal_create (ipcp_method mt) 
*** 452,469 ****
--- 503,523 ----
      xcalloc (ipcp_method_formal_count (mt), sizeof (struct ipcp_formal));
  }
  
+ /* Get number of formals of method mt.  */
  static inline int
  ipcp_method_formal_count (ipcp_method mt)
  {
    return ((struct ipa_node *)mt->aux)->ipcp_arg_num;
  }
  
+ /* Set number of formals of method mt.  */
  static inline void 
  ipcp_method_formal_count_set (ipcp_method mt, int i) 
  {
    ((struct ipa_node *)mt->aux)->ipcp_arg_num = i;
  }
  
+ /* Set cval structure of i-th formal of mt to cval.  */
  static inline void 
  ipcp_method_cval_set (ipcp_method mt, int i, struct ipcp_formal *cval)             
  {
*************** ipcp_method_cval_set (ipcp_method mt, in
*** 471,482 ****
--- 525,538 ----
    ipcp_cval_set_cvalue ( ipcp_method_cval (mt, i), ipcp_cval_get_cvalue (cval));     
  }
  
+ /* Get cval structure of i-th formal of mt.  */
  static inline struct ipcp_formal *
  ipcp_method_cval (ipcp_method mt, int info_type) 
  {
    return &(((struct ipa_node *)mt->aux)->ipcp_cval[info_type]);
  }
  
+ /* Set type of cval structure of formal i of mt to cval_type.  */
  static inline void 
  ipcp_method_cval_set_cvalue_type (ipcp_method mt, int i, 
  				  enum Cvalue_type cval_type)  
*************** ipcp_method_cval_set_cvalue_type (ipcp_m
*** 484,501 ****
--- 540,560 ----
    ((struct ipa_node *)mt->aux)->ipcp_cval[i].cvalue_type = cval_type;
  }
  
+ /* Returns whether i-th formal of mt is modified in mt.  */
  static inline bool 
  ipcp_method_is_modified (ipcp_method mt, int i)
  {
    return ((struct ipa_node *)mt->aux)->ipcp_mod[i].mod;
  }
  
+ /* Get param tree of i-th formal of mt.  */
  static inline tree 
  ipcp_method_get_tree (ipcp_method mt, int i)
  {
    return ((struct ipa_node *)mt->aux)->ipcp_param_tree[i].param_tree;
  }
  
+ /* Create tree map structure of mt.  */
  static inline void 
  ipcp_method_tree_map_create (ipcp_method mt)
  { 
*************** ipcp_method_tree_map_create (ipcp_method
*** 503,508 ****
--- 562,568 ----
      xcalloc (ipcp_method_formal_count (mt), sizeof (struct ipcp_tree_map));
  }
  
+ /* Create modify structure of mt.  */
  static inline void  
  ipcp_method_modify_create (ipcp_method mt) 
  { 
*************** ipcp_method_modify_create (ipcp_method m
*** 510,537 ****
      xcalloc (ipcp_method_formal_count (mt), sizeof (struct ipcp_modify));
  }
  
  static inline void  
  ipcp_method_modify_set (ipcp_method mt, int i, bool val)
  {
    ((struct ipa_node *)mt->aux)->ipcp_mod[i].mod = val;
  }
  
! static inline ipcp_callsite
! ipcp_cs_get_first (ipcp_method mt)
! {
!   return mt->callees;
! }
! 
! static inline ipcp_callsite
! ipcp_cs_get_next (ipcp_callsite cs)
  {
!   return cs->next_callee;
! }
  
- static inline bool 
- ipcp_cs_not_last (ipcp_callsite cs) 
- {
-   return (cs != NULL);
  }
  
  /* Returning the parameter index of the ptree.  */
--- 570,590 ----
      xcalloc (ipcp_method_formal_count (mt), sizeof (struct ipcp_modify));
  }
  
+ /* Set modify of i-th formal of mt.  */
  static inline void  
  ipcp_method_modify_set (ipcp_method mt, int i, bool val)
  {
    ((struct ipa_node *)mt->aux)->ipcp_mod[i].mod = val;
  }
  
! static void
! ipa_cloned_create (ipcp_method orig_node, ipcp_method new_node)
  {
!   ipa_node_create (new_node);
!   ipcp_method_set_orig_node (new_node, orig_node);
!   ipcp_method_formal_compute_count (new_node);
!   ipcp_method_compute_tree_map (new_node);
  
  }
  
  /* Returning the parameter index of the ptree.  */
*************** ipcp_modify_walk_tree (tree *tp, void *d
*** 702,714 ****
  	}
        break;
      case ADDR_EXPR:
        if( TREE_CODE (TREE_OPERAND (t, 0)) == PARM_DECL )
          {
  	  i = ipcp_method_tree_map (mt, TREE_OPERAND (t, 0));
            ipcp_method_modify_set (mt, i, true);
          }
        break;
!     case ASM_EXPR:   
        for (j = 0; j < ipcp_method_formal_count (mt); j++)
  	{
  	  ipcp_method_modify_set (mt, j, true);
--- 755,770 ----
  	}
        break;
      case ADDR_EXPR:
+       /* If the parametr's address is taken, 
+          it could be modified.  */
        if( TREE_CODE (TREE_OPERAND (t, 0)) == PARM_DECL )
          {
  	  i = ipcp_method_tree_map (mt, TREE_OPERAND (t, 0));
            ipcp_method_modify_set (mt, i, true);
          }
        break;
!     case ASM_EXPR:  
!       /* Asm code could modify any of the parameters.  */
        for (j = 0; j < ipcp_method_formal_count (mt); j++)
  	{
  	  ipcp_method_modify_set (mt, j, true);
*************** ipcp_method_compute_modify (ipcp_method 
*** 743,749 ****
  
    ipcp_method_modify_init (mt);
    decl = mt->decl;
!   if (!tree_inlinable_function_p (decl))
      {
        for (j = 0; j < ipcp_method_formal_count (mt); j++)
  	{
--- 799,807 ----
  
    ipcp_method_modify_init (mt);
    decl = mt->decl;
!   /* ??? Handle pending sizes case. Set all parameters 
!      of the method to be modified.  */  
!   if (DECL_UNINLINABLE (decl))
      {
        for (j = 0; j < ipcp_method_formal_count (mt); j++)
  	{
*************** ipcp_walk_statements (ipcp_method mt, tr
*** 784,803 ****
      }
  }
  
! /* jump function interface.  */
  
  static inline enum Jfunc_type
  get_type (struct ipcp_jump_func *jf)
  {
    return jf->type;
  }
  
  static inline ipcp_int *
  ipcp_jf_get_info_type (struct ipcp_jump_func *jf) 
  {
    return &(jf->info_type);
  }
  
  static inline void 
  ipcp_jf_set_info_type (struct ipcp_jump_func *jf, ipcp_int *info_type)
  {
--- 842,864 ----
      }
  }
  
! /* ipcp_jump_func interface.  */
  
+ /* Get type of jump function jf.  */
  static inline enum Jfunc_type
  get_type (struct ipcp_jump_func *jf)
  {
    return jf->type;
  }
  
+ /* Get info type of jump function jf.  */
  static inline ipcp_int *
  ipcp_jf_get_info_type (struct ipcp_jump_func *jf) 
  {
    return &(jf->info_type);
  }
  
+ /* Set info type of jump function jf.  */
  static inline void 
  ipcp_jf_set_info_type (struct ipcp_jump_func *jf, ipcp_int *info_type)
  {
*************** ipcp_jf_set_info_type (struct ipcp_jump_
*** 805,810 ****
--- 866,872 ----
    jf->info_type.high = info_type->high;
  }
  
+ /* Transform const_val to an ipcp_int type.  */
  static inline void 
  ipcp_int2ipcp_int (ipcp_int *info_type, int const_val)
  {
*************** ipcp_int2ipcp_int (ipcp_int *info_type, 
*** 812,817 ****
--- 874,880 ----
    info_type->high = 0;
  }
  
+ /* Returns low part of info_type as an int.  */
  static inline int
  ipcp_get_int (ipcp_int *info_type)
  {
*************** ipcp_get_int (ipcp_int *info_type)
*** 820,831 ****
--- 883,896 ----
  
  /* cval interface.  */
  
+ /* Sets type to cval.  */
  static inline void 
  ipcp_cval_set_cvalue_type (struct ipcp_formal *cval, enum Cvalue_type type)
  {
    cval->cvalue_type = type;
  }
  
+ /* Sets value to cval.  */
  static inline void
  ipcp_cval_set_cvalue (struct ipcp_formal *cval, ipcp_int *value) 
  {
*************** ipcp_cval_set_cvalue (struct ipcp_formal
*** 833,850 ****
--- 898,918 ----
    cval->cvalue.high = value->high;
  }
  
+ /* Returns value part of cval.  */
  static inline ipcp_int *
  ipcp_cval_get_cvalue (struct ipcp_formal *cval)
  {
    return &(cval->cvalue);
  }
  
+ /* Returns type part of cval.  */
  static inline enum Cvalue_type
  ipcp_cval_get_cvalue_type (struct ipcp_formal *cval)
  {
    return cval->cvalue_type;
  }
  
+ /* Returns true if const_val1 and const_val2 are equal.  */
  static inline bool 
  ipcp_cval_equal_cvalues (ipcp_int *const_val1, ipcp_int *const_val2)
  {
*************** ipcp_cval_equal_cvalues (ipcp_int *const
*** 858,864 ****
     it the first statemant in the function FN
     parse tree.
     PARM1 is the lhs of the assignment and
!    is the rhs. */
  static void 
  constant_val_insert (tree fn, tree parm1, tree val)
  {
--- 926,932 ----
     it the first statemant in the function FN
     parse tree.
     PARM1 is the lhs of the assignment and
!    val is the rhs. */
  static void 
  constant_val_insert (tree fn, tree parm1, tree val)
  {
*************** constant_val_insert (tree fn, tree parm1
*** 868,874 ****
    tree stmt_list_body;
    tree body;
    
!   tree rhs = convert (TREE_TYPE (parm1), val);
    if (rhs == error_mark_node)
      return;
    init_stmt = build (MODIFY_EXPR, TREE_TYPE (parm1), parm1, rhs);
--- 936,942 ----
    tree stmt_list_body;
    tree body;
    
!   tree rhs = fold_convert (TREE_TYPE (parm1), val);
    if (rhs == error_mark_node)
      return;
    init_stmt = build (MODIFY_EXPR, TREE_TYPE (parm1), parm1, rhs);
*************** ipcp_propagate_const (ipcp_method mt, in
*** 898,941 ****
    parm_tree = ipcp_method_get_tree (mt, param);
    const_val = build_int_cst_wide (NULL_TREE, cvalue->low, cvalue->high);   
    if (parm_tree != NULL)
!     if (const_val != NULL)
!       if(fndecl != NULL)
! 	constant_val_insert (fndecl, parm_tree, const_val);   
! }
! 
! /* Propagates the constant parameters found by ipcp_stage2()
!    to the function's code.  */
! static void 
! ipcp_stage3 (void)
! {
!   ipcp_method node;
!   int i;
!   ipcp_int *cvalue;
!  
!   for (node = cgraph_nodes; node; node = node->next)
!     {  
!       /* Propagation of the constant is forbidden in 
!          certain conditions.  */
!       if (ipcp_method_dont_propagate (node))
!         continue;
!       for(i=0; i < ipcp_method_formal_count (node); i++)
! 	{
! 	  if (ipcp_cval_get_cvalue_type (ipcp_method_cval (node, i)) 
! 	      == CONST_VALUE) 
! 	    {
! 	      cvalue = ipcp_cval_get_cvalue (ipcp_method_cval (node, i));
!               ipcp_propagate_const (node, i, cvalue);                       
! 	    }
! 	}
!     }
  }
  
! /* Initialization and computation of IPA Constant
!    Propagation data structures. It is an IntraProcedural
     analysis of methods,which gathers information to be propagated
     later on.  */
  static void 
! ipcp_stage1 (void)
  {
     ipcp_method node;
     ipcp_callsite cs;
--- 966,981 ----
    parm_tree = ipcp_method_get_tree (mt, param);
    const_val = build_int_cst_wide (NULL_TREE, cvalue->low, cvalue->high);   
    if (parm_tree != NULL)
!     if(fndecl != NULL)
!       constant_val_insert (fndecl, parm_tree, const_val);   
  }
  
! /* Initialization and computation of IPCP data structures. 
!    It is an intraprocedural
     analysis of methods,which gathers information to be propagated
     later on.  */
  static void 
! ipcp_init_stage (void)
  {
     ipcp_method node;
     ipcp_callsite cs;
*************** ipcp_stage1 (void)
*** 950,962 ****
     for (node = cgraph_nodes; node; node = node->next)
       {
         /* building jump functions  */
!        for (cs = ipcp_cs_get_first (node); ipcp_cs_not_last (cs);
!             cs = ipcp_cs_get_next (cs))
           {
             ipcp_callsite_compute_count (cs);
             if (ipcp_callsite_param_count (cs) 
  	       != ipcp_method_formal_count (cs->callee))
               {
                 ipcp_callsite_param_count_set (cs, 0);
                 ipcp_method_formal_count_set (cs->callee, 0);
               }
--- 990,1004 ----
     for (node = cgraph_nodes; node; node = node->next)
       {
         /* building jump functions  */
!        for (cs =  FIRST_CALLEE (node); CALLEE_NOT_LAST (cs);
!             cs = NEXT_CALLEE (cs))
           {
             ipcp_callsite_compute_count (cs);
             if (ipcp_callsite_param_count (cs) 
  	       != ipcp_method_formal_count (cs->callee))
               {
+                /* Handles the cases of functions with 
+                   a variable number of parameters.  */
                 ipcp_callsite_param_count_set (cs, 0);
                 ipcp_method_formal_count_set (cs->callee, 0);
               }
*************** ipcp_stage1 (void)
*** 969,975 ****
  /* Interprocedural analysis. The algorithm propagates constants from
     the caller's parameters to the callee's arguments.  */
  static void 
! ipcp_stage2 (void)
  {
    int i;
    struct ipcp_formal cval1 = {0, {0, 0}}, cval = {0, {0, 0}};
--- 1011,1017 ----
  /* Interprocedural analysis. The algorithm propagates constants from
     the caller's parameters to the callee's arguments.  */
  static void 
! ipcp_iterate_stage (void)
  {
    int i;
    struct ipcp_formal cval1 = {0, {0, 0}}, cval = {0, {0, 0}};
*************** ipcp_stage2 (void)
*** 985,992 ****
    while (ipcp_methodlist_not_empty (wl))
      {
        mt = ipcp_remove_method (&wl);
!       for (cs = ipcp_cs_get_first (mt); ipcp_cs_not_last (cs); 
! 	   cs = ipcp_cs_get_next (cs))
  	{
  	  callee = ipcp_callsite_callee (cs);
  	  for (i = 0; i < ipcp_callsite_param_count (cs); i++)
--- 1027,1034 ----
    while (ipcp_methodlist_not_empty (wl))
      {
        mt = ipcp_remove_method (&wl);
!       for (cs = FIRST_CALLEE (mt); CALLEE_NOT_LAST (cs); 
! 	   cs = NEXT_CALLEE (cs))
  	{
  	  callee = ipcp_callsite_callee (cs);
  	  for (i = 0; i < ipcp_callsite_param_count (cs); i++)
*************** ipcp_stage2 (void)
*** 1008,1014 ****
      }
  }
  
! /* Allocate and initialize ipa_node structure.  */
  static void
  ipa_nodes_create (void)
  {
--- 1050,1110 ----
      }
  }
  
! /* Propagates the constant parameters found by ipcp_iterate_stage()
!    to the function's code.  */
! static void 
! ipcp_insert_stage (void)
! {
!   ipcp_method node, node1 = NULL;
!   int i, const_param;
!   ipcp_int *cvalue;
!   varray_type redirect_callers;
!   ipcp_callsite cs;
!   int node_callers;
!    
!   for (node = cgraph_nodes; node; node = node->next)
!     {  
!       /* Propagation of the constant is forbidden in 
!          certain conditions.  */
!       if (ipcp_method_dont_insert_const (node))
!         continue;
!       const_param=0;
!       for(i = 0; i < ipcp_method_formal_count (node); i++)
! 	{
! 	  if (ipcp_cval_get_cvalue_type (ipcp_method_cval (node, i)) 
! 	      == CONST_VALUE) 
! 	    {
! 	      cvalue = ipcp_cval_get_cvalue (ipcp_method_cval (node, i));             
!               if (const_param == 0)
!                 {
!                   /* Compute how many callers node has.  */
!                   node_callers = 0;
!                   for (cs = node->callers; cs != NULL;
!                        cs = cs->next_caller)
!                     node_callers++;  
!                   VARRAY_GENERIC_PTR_INIT (redirect_callers, node_callers, "redirect_callers");
!                   for (cs = node->callers; cs != NULL;
!                        cs = cs->next_caller)
!                     VARRAY_PUSH_GENERIC_PTR (redirect_callers, cs);
!                   /* Redirecting all the callers of the node to the 
!                      new versioned node.  */
!                   node1 = cgraph_function_versioning (node, redirect_callers); 
!                   VARRAY_CLEAR (redirect_callers);
!                   if (node1 == NULL)
!                     break;
!                   ipa_cloned_create (node, node1);
!                 }
!               ipcp_propagate_const (node1, i, cvalue);
!               const_param++;           
!              
! 	    }
! 	}
!     }
!   ipcp_update_callgraph ();
! }
! 
! /* Allocate and initialize ipa_node structure for all
!    nodes in callgraph.  */
  static void
  ipa_nodes_create (void)
  {
*************** ipa_nodes_create (void)
*** 1016,1026 ****
    
    for (node = cgraph_nodes; node; node = node->next)
       {
!        node->aux = xcalloc (1, sizeof (struct ipa_node));     
!        ((struct ipa_node *)node->aux)->ipcp_cval = NULL;
!        ((struct ipa_node *)node->aux)->ipcp_arg_num =0;
!        ((struct ipa_node *)node->aux)-> ipcp_param_tree = NULL;
!        ((struct ipa_node *)node->aux)->ipcp_mod = NULL;
       }
  }
  
--- 1112,1118 ----
    
    for (node = cgraph_nodes; node; node = node->next)
       {
!        ipa_node_create (node);
       }
  }
  
*************** ipa_edges_create (void)
*** 1033,1044 ****
    
    for (node = cgraph_nodes; node; node = node->next)
       {
!         for (cs = ipcp_cs_get_first (node); ipcp_cs_not_last (cs);
! 	     cs = ipcp_cs_get_next (cs))
  	  {
  	    cs->aux = xcalloc (1, sizeof (struct ipa_edge));
- 	    ((struct ipa_edge *)cs->aux)->ipcp_param_map = NULL;
- 	    ((struct ipa_edge *)cs->aux)->ipcp_param_num = 0;
  	  }
       
       }
--- 1125,1134 ----
    
    for (node = cgraph_nodes; node; node = node->next)
       {
!         for (cs = FIRST_CALLEE (node); CALLEE_NOT_LAST (cs);
! 	     cs = NEXT_CALLEE (cs))
  	  {
  	    cs->aux = xcalloc (1, sizeof (struct ipa_edge));
  	  }
       
       }
*************** ipa_edges_free (void)
*** 1066,1073 ****
    
    for (node = cgraph_nodes; node; node = node->next)
       {
!         for (cs = ipcp_cs_get_first (node); ipcp_cs_not_last (cs);
! 	     cs = ipcp_cs_get_next (cs))
  	  {
  	   
  	    free (cs->aux);
--- 1156,1163 ----
    
    for (node = cgraph_nodes; node; node = node->next)
       {
!         for (cs = FIRST_CALLEE (node); CALLEE_NOT_LAST (cs);
! 	     cs = NEXT_CALLEE (cs))
  	  {
  	   
  	    free (cs->aux);
*************** ipa_edges_free (void)
*** 1077,1083 ****
       }
  }
  
! /* The IPA constant propagation driver.  */
  void 
  ipcp_driver (void)
  {
--- 1167,1173 ----
       }
  }
  
! /* The IPCP driver.  */
  void 
  ipcp_driver (void)
  {
*************** ipcp_driver (void)
*** 1087,1105 ****
       fprintf (cgraph_dump_file, "\nIPA constant propagation start:\n");
    ipa_nodes_create ();
    ipa_edges_create ();
!   ipcp_stage1 ();
    if (cgraph_dump_file)
      {
        fprintf (cgraph_dump_file, "\nIPA structures before propagation:\n");
        ipcp_structures_print (cgraph_dump_file); 
      }
!   ipcp_stage2 ();
    if (cgraph_dump_file)
      {
        fprintf (cgraph_dump_file, "\nIPA structures after propagation:\n");
        ipcp_structures_print (cgraph_dump_file);      
      }
!   ipcp_stage3 ();  
    ipcp_free (); 
    ipa_nodes_free ();
    ipa_edges_free ();
--- 1177,1200 ----
       fprintf (cgraph_dump_file, "\nIPA constant propagation start:\n");
    ipa_nodes_create ();
    ipa_edges_create ();
!   /* 1. Call the init stage to initialize 
!      the ipa_node and ipa_edge structures.  */
!   ipcp_init_stage ();
    if (cgraph_dump_file)
      {
        fprintf (cgraph_dump_file, "\nIPA structures before propagation:\n");
        ipcp_structures_print (cgraph_dump_file); 
      }
!   /* 2. Do the interprocedural propagation.  */
!   ipcp_iterate_stage ();
    if (cgraph_dump_file)
      {
        fprintf (cgraph_dump_file, "\nIPA structures after propagation:\n");
        ipcp_structures_print (cgraph_dump_file);      
      }
!   /* 3. Insert the constants found to the functions.  */
!   ipcp_insert_stage ();
!   /* Free all IPCP structures.  */
    ipcp_free (); 
    ipa_nodes_free ();
    ipa_edges_free ();
*************** ipcp_driver (void)
*** 1107,1113 ****
      fprintf (cgraph_dump_file, "\nIPA constant propagation end\n");
  }
  
! /* Frees the ipcp_method's IPA data structures.  */
  static void 
  ipcp_free (void)
  {
--- 1202,1208 ----
      fprintf (cgraph_dump_file, "\nIPA constant propagation end\n");
  }
  
! /* Frees the ipcp_method's IPCP data structures.  */
  static void 
  ipcp_free (void)
  {
*************** ipcp_free (void)
*** 1116,1142 ****
  
    for (node = cgraph_nodes; node; node = node->next)
       {
!        free (((struct ipa_node *)node->aux)->ipcp_cval);
!        free (((struct ipa_node *)node->aux)->ipcp_param_tree);
!        free (((struct ipa_node *)node->aux)->ipcp_mod); 
!        for (cs = ipcp_cs_get_first (node); ipcp_cs_not_last (cs);
!             cs = ipcp_cs_get_next (cs))
!          free (((struct ipa_edge *)cs->aux)->ipcp_param_map);
       }
  }
  
! /* Check conditions to forbid constant propagation.  */
  static bool
! ipcp_method_dont_propagate (ipcp_method mt)
  {
!   if (!tree_inlinable_function_p (mt->decl))
      return true;
    if (ipcp_method_contains_asm (mt))
      return true;
    return false;
  }
  
! /* Called by walk_tree to find. Returns true if an asm expr was found  */
  static tree 
  ipcp_asm_walk_tree (tree *tp, int *walk_subtrees ATTRIBUTE_UNUSED, 
  		    void *data ATTRIBUTE_UNUSED)
--- 1211,1248 ----
  
    for (node = cgraph_nodes; node; node = node->next)
       {
!        if (node->aux == NULL)
!          continue;
!        if (((struct ipa_node *)node->aux)->ipcp_cval)
!          free (((struct ipa_node *)node->aux)->ipcp_cval);      
!        if (((struct ipa_node *)node->aux)->ipcp_param_tree)
!          free (((struct ipa_node *)node->aux)->ipcp_param_tree);      
!        if (((struct ipa_node *)node->aux)->ipcp_mod)
!          free (((struct ipa_node *)node->aux)->ipcp_mod); 
!        for (cs = FIRST_CALLEE (node); CALLEE_NOT_LAST (cs);
!             cs = NEXT_CALLEE (cs))
!          {        
!            if (cs->aux)
!              if (((struct ipa_edge *)cs->aux)->ipcp_param_map)
!                free (((struct ipa_edge *)cs->aux)->ipcp_param_map);
!          }
       }
  }
  
! /* Check conditions to forbid constant insertion to the function.  */
  static bool
! ipcp_method_dont_insert_const (ipcp_method mt)
  {
!   /* ??? Handle pending sizes case.  */  
!   if (DECL_UNINLINABLE (mt->decl))
      return true;
+   /* Prevent propagation in the case of ASM calls.  */  
    if (ipcp_method_contains_asm (mt))
      return true;
    return false;
  }
  
! /* Called by walk_tree to find. Returns true if an asm expr was found.  */
  static tree 
  ipcp_asm_walk_tree (tree *tp, int *walk_subtrees ATTRIBUTE_UNUSED, 
  		    void *data ATTRIBUTE_UNUSED)
*************** ipcp_method_contains_asm (ipcp_method mt
*** 1169,1177 ****
    return false;
  }
  
! /* Debuffing interface.  */
  
! /* Prints all IPA constant propagation data structures.  */
  static void 
  ipcp_structures_print (FILE *f)
  {
--- 1275,1283 ----
    return false;
  }
  
! /* Debugging interface.  */
  
! /* Prints all IPCP data structures.  */
  static void 
  ipcp_structures_print (FILE *f)
  {
*************** ipcp_structures_print (FILE *f)
*** 1181,1186 ****
--- 1287,1293 ----
    ipcp_callsite_param_print (f);
  }
  
+ /* Prints ipcp_cval data structures.  */
  static void 
  ipcp_method_cval_print (FILE *f)
  {
*************** ipcp_method_cval_print (FILE *f)
*** 1200,1207 ****
                 fprintf (f, " param [%d]: ", i);
  	       fprintf (f, "type is CONST ");
                 cvalue = ipcp_cval_get_cvalue (ipcp_method_cval (node, i));
! 	       fprintf (f, "value is  %d %d\n",(int)cvalue->high, 
!                         (int)cvalue->low);
  	     }
  	   else if (ipcp_method_cval (node, i)->cvalue_type == TOP)
  	     fprintf(f, "param [%d]: type is TOP  \n", i);           
--- 1307,1315 ----
                 fprintf (f, " param [%d]: ", i);
  	       fprintf (f, "type is CONST ");
                 cvalue = ipcp_cval_get_cvalue (ipcp_method_cval (node, i));
!                fprintf (f, "value is  "HOST_WIDE_INT_PRINT_DEC
!                         " "HOST_WIDE_INT_PRINT_DEC" \n", 
!                         cvalue->high, cvalue->low);
  	     }
  	   else if (ipcp_method_cval (node, i)->cvalue_type == TOP)
  	     fprintf(f, "param [%d]: type is TOP  \n", i);           
*************** ipcp_method_cval_print (FILE *f)
*** 1211,1216 ****
--- 1319,1325 ----
       }
  }
  
+ /* Prints ipcp_tree_map data structures.  */
  static void  
  ipcp_method_tree_print (FILE *f) 
  {
*************** ipcp_method_tree_print (FILE *f) 
*** 1238,1243 ****
--- 1347,1353 ----
      }
  }
  
+ /* Printf ipcp_modify data structures.  */
  static void  
  ipcp_method_modify_print (FILE *f) 
  {
*************** ipcp_method_modify_print (FILE *f) 
*** 1260,1266 ****
        
      }
  }
!  
  static void  
  ipcp_callsite_param_print (FILE *f)
  {
--- 1370,1376 ----
        
      }
  }
! /* Prints ipcp_jump_func data structures.  */ 
  static void  
  ipcp_callsite_param_print (FILE *f)
  {
*************** ipcp_callsite_param_print (FILE *f)
*** 1274,1281 ****
    fprintf(f, "\nCALLSITE PARAM PRINT\n");
    for (node = cgraph_nodes; node; node = node->next)
      {    
!       for (cs = ipcp_cs_get_first (node); ipcp_cs_not_last (cs);
!             cs = ipcp_cs_get_next (cs))
           {
             fprintf (f, "callsite  %s ", cgraph_node_name (node));
             fprintf (f, "-> %s :: \n", cgraph_node_name (cs->callee));
--- 1384,1391 ----
    fprintf(f, "\nCALLSITE PARAM PRINT\n");
    for (node = cgraph_nodes; node; node = node->next)
      {    
!       for (cs = FIRST_CALLEE (node); CALLEE_NOT_LAST (cs);
!             cs =  NEXT_CALLEE (cs))
           {
             fprintf (f, "callsite  %s ", cgraph_node_name (node));
             fprintf (f, "-> %s :: \n", cgraph_node_name (cs->callee));
*************** ipcp_callsite_param_print (FILE *f)
*** 1290,1297 ****
                else if (type == CONST_IPATYPE)
                  {
                    fprintf (f, "CONST : ");
!                   fprintf (f, "high: %d low: %d\n",(int)info_type->high, 
!                            (int)info_type->low);
                  }
                else if (type == FORMAL_IPATYPE)
                  {
--- 1400,1408 ----
                else if (type == CONST_IPATYPE)
                  {
                    fprintf (f, "CONST : ");
!                   fprintf (f, " "HOST_WIDE_INT_PRINT_DEC
!                            " "HOST_WIDE_INT_PRINT_DEC" \n", 
!                            info_type->high, info_type->low);
                  }
                else if (type == FORMAL_IPATYPE)
                  {
*************** ipcp_callsite_param_print (FILE *f)
*** 1302,1304 ****
--- 1413,1473 ----
           }
      }
  }
+ 
+ /* Fix the callsites and the callgraph after function cloning was done.  */
+ static void 
+ ipcp_update_callgraph (void)
+ {
+   ipcp_method node, orig_callee;
+   ipcp_callsite cs;
+ 
+    for (node = cgraph_nodes; node; node = node->next)
+      {
+        /* want to fix only original nodes  */
+        if (ipcp_method_is_cloned (node))
+          continue;
+        for (cs = FIRST_CALLEE (node);  CALLEE_NOT_LAST (cs);
+             cs =  NEXT_CALLEE (cs))
+          {
+            if (ipcp_method_is_cloned (cs->callee))
+              {
+                /* Callee is a cloned node  */
+                orig_callee = ipcp_method_orig_node (cs->callee);
+                if (ipcp_redirect (cs))
+                  {
+                    cgraph_redirect_edge_callee (cs, orig_callee);
+                    TREE_OPERAND (TREE_OPERAND (cs->call_expr, 0), 0) = orig_callee->decl;
+                  }
+              }
+          }
+      }
+ }
+ 
+ /* Retruns true if this callsite should be redirected to 
+    the orig callee (instead of the cloned one).  */
+ static bool
+ ipcp_redirect (ipcp_callsite  cs)
+ {
+   ipcp_method caller, callee, orig_callee;
+   int i;
+   struct ipcp_jump_func *jump_func;
+   enum Jfunc_type type;
+   
+   caller = cs->caller;
+   callee = cs->callee;
+   orig_callee =  ipcp_method_orig_node (callee);
+   
+   for (i = 0; i < ipcp_method_formal_count (orig_callee); i++)
+      {
+        if (ipcp_cval_get_cvalue_type (ipcp_method_cval (orig_callee, i)) 
+            == CONST_VALUE) 
+          {
+            jump_func = ipcp_callsite_param (cs, i);
+            type = get_type (jump_func);
+            if (type != CONST_IPATYPE)
+              return true;
+          }
+      }
+   return false;
+ }
+ 
Index: ipa_prop.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/Attic/ipa_prop.h,v
retrieving revision 1.1.2.1
diff -c -3 -p -r1.1.2.1 ipa_prop.h
*** ipa_prop.h	28 Aug 2004 17:26:34 -0000	1.1.2.1
--- ipa_prop.h	5 Oct 2004 08:35:56 -0000
***************
*** 1,6 ****
! /* Callgraph handling code.
!    Copyright (C) 2003, 2004 Free Software Foundation, Inc.
!    Contributed by Razya Ladelsky.
  
  This file is part of GCC.
  
--- 1,6 ----
! /* Interprocedural constant propagation
!    Copyright (C) 2004 Free Software Foundation, Inc.
!    Contributed by Razya Ladelsky <RAZYA@il.ibm.com>
  
  This file is part of GCC.
  
*************** struct ipcp_modify
*** 57,75 ****
  
  struct ipa_node
  {
    /* Array of cvals.  */
    struct ipcp_formal* GTY ((skip (""))) ipcp_cval;
-   int ipcp_arg_num;
    /* Mapping each parameter to its PARM_DECL tree.  */
    struct ipcp_tree_map* GTY ((skip (""))) ipcp_param_tree;
    /* Indicating which parameter is modified in its method.  */
    struct ipcp_modify* GTY ((skip (""))) ipcp_mod;
  };
  
  struct ipa_edge
  {
!   struct ipcp_jump_func* GTY ((skip (""))) ipcp_param_map;
    int ipcp_param_num;
  };
  void ipcp_driver (void);
  
--- 57,85 ----
  
  struct ipa_node
  {
+   /* Number of formal parameters of this method.  When set to 0,
+    this method's parameters would not be analyzed by the different 
+    stages of IPA CP.  */
+   int ipcp_arg_num;
    /* Array of cvals.  */
    struct ipcp_formal* GTY ((skip (""))) ipcp_cval;
    /* Mapping each parameter to its PARM_DECL tree.  */
    struct ipcp_tree_map* GTY ((skip (""))) ipcp_param_tree;
    /* Indicating which parameter is modified in its method.  */
    struct ipcp_modify* GTY ((skip (""))) ipcp_mod;
+    /* Only for cloned nodes this field would not be NULL, 
+   it points to the node that IPA cp cloned from.  */ 
+   struct cgraph_node *ipcp_orig_node;
  };
  
  struct ipa_edge
  {
!   /* Number of actual arguments in this callsite.  When set to 0,
!    this callsite's parameters would not be analyzed by the different 
!    stages of IPA CP.  */
    int ipcp_param_num;
+   /* Array of the callsite's jump function of each parameter.  */
+   struct ipcp_jump_func* GTY ((skip (""))) ipcp_param_map; 
  };
  void ipcp_driver (void);
  
Index: langhooks-def.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/langhooks-def.h,v
retrieving revision 1.34.2.29.2.7
diff -c -3 -p -r1.34.2.29.2.7 langhooks-def.h
*** langhooks-def.h	18 Sep 2004 22:47:14 -0000	1.34.2.29.2.7
--- langhooks-def.h	5 Oct 2004 08:35:56 -0000
*************** extern void lhd_tree_inlining_end_inlini
*** 84,90 ****
  extern tree lhd_tree_inlining_convert_parm_for_inlining (tree, tree, tree, int);
  extern void lhd_initialize_diagnostics (struct diagnostic_context *);
  extern tree lhd_callgraph_analyze_expr (tree *, int *, tree);
! 
  
  /* Declarations for tree gimplification hooks.  */
  extern int lhd_gimplify_expr (tree *, tree *, tree *);
--- 84,90 ----
  extern tree lhd_tree_inlining_convert_parm_for_inlining (tree, tree, tree, int);
  extern void lhd_initialize_diagnostics (struct diagnostic_context *);
  extern tree lhd_callgraph_analyze_expr (tree *, int *, tree);
! extern int lhd_tree_versioning_cannot_version_tree_fn (tree *);
  
  /* Declarations for tree gimplification hooks.  */
  extern int lhd_gimplify_expr (tree *, tree *, tree *);
*************** extern int lhd_gimplify_expr (tree *, tr
*** 170,175 ****
--- 170,184 ----
    LANG_HOOKS_TREE_INLINING_CONVERT_PARM_FOR_INLINING \
  }
  
+ /* Tree versioning hooks.  */
+ #define LANG_HOOKS_TREE_VERSIONING_CANNOT_VERSION_TREE_FN \
+   lhd_tree_versioning_cannot_version_tree_fn
+ 
+ #define LANG_HOOKS_TREE_VERSIONING_INITIALIZER { \
+ LANG_HOOKS_TREE_VERSIONING_CANNOT_VERSION_TREE_FN \
+ }
+ 
+ 
  #define LANG_HOOKS_CALLGRAPH_ANALYZE_EXPR lhd_callgraph_analyze_expr
  #define LANG_HOOKS_CALLGRAPH_EXPAND_FUNCTION NULL
  
*************** extern tree lhd_make_node (enum tree_cod
*** 292,297 ****
--- 301,307 ----
    LANG_HOOKS_FORMAT_ATTRIBUTE_TABLE, \
    LANG_HOOKS_FUNCTION_INITIALIZER, \
    LANG_HOOKS_TREE_INLINING_INITIALIZER, \
+   LANG_HOOKS_TREE_VERSIONING_INITIALIZER, \
    LANG_HOOKS_CALLGRAPH_INITIALIZER, \
    LANG_HOOKS_TREE_DUMP_INITIALIZER, \
    LANG_HOOKS_DECLS, \
Index: langhooks.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/langhooks.c,v
retrieving revision 1.31.2.22.2.7
diff -c -3 -p -r1.31.2.22.2.7 langhooks.c
*** langhooks.c	18 Sep 2004 22:47:14 -0000	1.31.2.22.2.7
--- langhooks.c	5 Oct 2004 08:35:56 -0000
*************** lhd_tree_inlining_cannot_inline_tree_fn 
*** 317,322 ****
--- 317,332 ----
    return 0;
  }
  
+ /* lang_hooks.tree_versioning.cannot_version_tree_fn is called to
+    determine whether there are language-specific reasons for not
+    creating a new version to a given function.  */
+ 
+ int
+ lhd_tree_versioning_cannot_version_tree_fn (tree *fnp ATTRIBUTE_UNUSED)
+ {
+   return 0;
+ }
+ 
  /* lang_hooks.tree_inlining.disregard_inline_limits is called to
     determine whether a function should be considered for inlining even
     if it would exceed inlining limits.  */
Index: langhooks.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/langhooks.h,v
retrieving revision 1.42.2.30.2.6
diff -c -3 -p -r1.42.2.30.2.6 langhooks.h
*** langhooks.h	18 Sep 2004 22:47:14 -0000	1.42.2.30.2.6
--- langhooks.h	5 Oct 2004 08:35:56 -0000
*************** struct lang_hooks_for_tree_inlining
*** 47,52 ****
--- 47,59 ----
    tree (*convert_parm_for_inlining) (tree, tree, tree, int);
  };
  
+ 
+ struct lang_hooks_for_tree_versioning
+ {
+   int (*cannot_version_tree_fn) (tree *);
+ };
+ 
+ 
  struct lang_hooks_for_callgraph
  {
    /* The node passed is a language-specific tree node.  If its contents
*************** struct lang_hooks
*** 387,392 ****
--- 394,401 ----
    struct lang_hooks_for_functions function;
  
    struct lang_hooks_for_tree_inlining tree_inlining;
+   
+   struct lang_hooks_for_tree_versioning tree_versioning;  
  
    struct lang_hooks_for_callgraph callgraph;
  
Index: tree-gimple.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/tree-gimple.h,v
retrieving revision 2.1.2.4
diff -c -3 -p -r2.1.2.4 tree-gimple.h
*** tree-gimple.h	9 Sep 2004 10:55:42 -0000	2.1.2.4
--- tree-gimple.h	5 Oct 2004 08:35:56 -0000
*************** Boston, MA 02111-1307, USA.  */
*** 28,33 ****
--- 28,34 ----
  extern tree create_tmp_var_raw (tree, const char *);
  extern tree create_tmp_var_name (const char *);
  extern tree create_tmp_var (tree, const char *);
+ extern tree create_function_name (const char *);
  extern tree get_initialized_tmp_var (tree, tree *, tree *);
  extern tree get_formal_tmp_var (tree, tree *);
  extern void declare_tmp_vars (tree, tree);
Index: tree-inline.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/tree-inline.c,v
retrieving revision 1.26.2.83.2.12
diff -c -3 -p -r1.26.2.83.2.12 tree-inline.c
*** tree-inline.c	25 Sep 2004 23:17:57 -0000	1.26.2.83.2.12
--- tree-inline.c	5 Oct 2004 08:35:56 -0000
*************** typedef struct inline_data
*** 102,107 ****
--- 102,109 ----
    bool cloning_p;
    /* Similarly for saving function body.  */
    bool saving_p;
+   /* Versioning function is slightly different from inlining. */
+   bool versioning_p;
    /* Hash table used to prevent walk_tree from visiting the same node
       umpteen million times.  */
    htab_t tree_pruner;
*************** static tree mark_local_for_remap_r (tree
*** 138,143 ****
--- 140,149 ----
  static void unsave_expr_1 (tree);
  static tree unsave_r (tree *, int *, void *);
  static void declare_inline_vars (tree bind_expr, tree vars);
+ static tree copy_arguments_for_versioning (tree, inline_data *);
+ static tree version_body (tree, inline_data *);
+ static tree copy_static_chain (tree, inline_data *);
+ static bool function_versionable_p (tree);
  
  /* Insert a tree->tree mapping for ID.  Despite the name suggests
     that the trees should be variables, it is used for more than that.  */
*************** remap_decl (tree decl, inline_data *id)
*** 169,176 ****
    if (!n)
      {
        /* Make a copy of the variable or label.  */
!       tree t = copy_decl_for_inlining (decl, fn, VARRAY_TREE (id->fns, 0));
! 
        /* Remap types, if necessary.  */
        TREE_TYPE (t) = remap_type (TREE_TYPE (t), id);
        if (TREE_CODE (t) == TYPE_DECL)
--- 175,185 ----
    if (!n)
      {
        /* Make a copy of the variable or label.  */
!       tree t;
!       if (!id->versioning_p)
!         t = copy_decl_for_inlining (decl, fn, VARRAY_TREE (id->fns, 0));
!       else
!         t = copy_decl_for_versioning (decl, fn, VARRAY_TREE (id->fns, 0));
        /* Remap types, if necessary.  */
        TREE_TYPE (t) = remap_type (TREE_TYPE (t), id);
        if (TREE_CODE (t) == TYPE_DECL)
*************** copy_body_r (tree *tp, int *walk_subtree
*** 460,468 ****
      gcc_assert (DECL_EXTERNAL (*tp) || TREE_STATIC (*tp));
  #endif
  
!   /* If this is a RETURN_EXPR, change it into a MODIFY_EXPR and a
!      GOTO_EXPR with the RET_LABEL as its target.  */
!   if (TREE_CODE (*tp) == RETURN_EXPR && id->ret_label)
      {
        tree return_stmt = *tp;
        tree goto_stmt;
--- 469,479 ----
      gcc_assert (DECL_EXTERNAL (*tp) || TREE_STATIC (*tp));
  #endif
  
!   /* If this is a RETURN_STMT, change it into an EXPR_STMT and a
!      GOTO_STMT with the RET_LABEL as its target.  */
!   if (TREE_CODE (*tp) == RETURN_EXPR 
!       && id->ret_label
!       && !id->versioning_p)
      {
        tree return_stmt = *tp;
        tree goto_stmt;
*************** copy_body_r (tree *tp, int *walk_subtree
*** 562,568 ****
  		}
  	    }
  	}
!       else if (TREE_CODE (*tp) == INDIRECT_REF)
  	{
  	  /* Get rid of *& from inline substitutions that can happen when a
  	     pointer argument is an ADDR_EXPR.  */
--- 573,580 ----
  		}
  	    }
  	}
!       else if (TREE_CODE (*tp) == INDIRECT_REF
!                && !id->versioning_p)
  	{
  	  /* Get rid of *& from inline substitutions that can happen when a
  	     pointer argument is an ADDR_EXPR.  */
*************** copy_body_r (tree *tp, int *walk_subtree
*** 602,613 ****
  	    }
  	  else
  	    {
!               struct cgraph_edge *edge
! 		= cgraph_edge (id->current_node, old_node);
! 
! 	      if (edge)
! 	        cgraph_clone_edge (edge, id->node, *tp);
! 	    }
  	}
  
        TREE_TYPE (*tp) = remap_type (TREE_TYPE (*tp), id);
--- 614,639 ----
  	    }
  	  else
  	    {
!               struct cgraph_edge *edge;
!               
!               if (!id->versioning_p)
!                 {
!                   edge = cgraph_edge (id->current_node, old_node);
!                   if (edge)
!                     cgraph_clone_edge (edge, id->node, *tp);
!                 }
!             }
!           if (id->versioning_p)
!             {
!               /* Update the call_expr on the edges from the new versions
!                  to it's callees. */
!               struct cgraph_edge *edge;
!               tree callee;
!               callee = get_callee_fndecl (*tp);
!               edge = cgraph_edge (id->node, old_node);
!               if (edge)
!                 edge->call_expr = *tp;
!             }
  	}
  
        TREE_TYPE (*tp) = remap_type (TREE_TYPE (*tp), id);
*************** declare_inline_vars (tree bind_expr, tre
*** 2470,2472 ****
--- 2496,2655 ----
  
    add_var_to_bind_expr (bind_expr, vars);
  }
+ 
+ 
+ /* Return a copy of the function argument tree.  */
+ static tree 
+ copy_arguments_for_versioning (tree orig_parm, inline_data *id)
+ {
+   tree *arg_copy, *parg;
+   
+   arg_copy = &orig_parm;
+   for (parg = arg_copy; *parg; parg = &TREE_CHAIN (*parg))
+     {
+       tree new = remap_decl (*parg, id);
+       lang_hooks.dup_lang_specific_decl (new);
+       TREE_CHAIN (new) = TREE_CHAIN (*parg);
+       *parg = new;
+     }
+   return orig_parm;
+ }
+ 
+ 
+ /* return a copy of the function tree body.  */
+ static tree
+ version_body (tree body, inline_data *id)
+ {
+   tree new_body = body;
+   
+   walk_tree (&new_body, copy_body_r, id, NULL);
+   return new_body;
+ }
+ 
+ 
+ /* Return a copy of the function static chain.  */
+ static tree
+ copy_static_chain (tree static_chain, inline_data *id)
+ {
+   tree *chain_copy, *pvar;
+   
+   chain_copy = &static_chain;
+   for (pvar = chain_copy; *pvar; pvar = &TREE_CHAIN (*pvar))
+     {
+       tree new = remap_decl (*pvar, id);
+       lang_hooks.dup_lang_specific_decl (new);
+       TREE_CHAIN (new) = TREE_CHAIN (*pvar);
+       *pvar = new;
+     }
+   return static_chain;
+ }
+ 
+ bool
+ tree_versionable_function_p (tree fndecl)
+ {
+   return function_versionable_p (fndecl);
+ }
+ 
+ /* Return true if the function can have a new version.
+    This is a guard for the versioning functionality.  */
+ static bool
+ function_versionable_p (tree fndecl)
+ {
+   if (fndecl == NULL_TREE)
+     return false;
+   /* ??? There are cases where a function is
+      uninlinable but can have a new versions.  */
+   if (!tree_inlinable_function_p (fndecl))  
+     return false;
+   /* ??? Remove the following gaurd.  */
+   if (lang_hooks.tree_versioning.cannot_version_tree_fn (&fndecl))
+     return false;
+ 
+   return true;
+ }
+ 
+ 
+ /* Copy the function tree. 
+    OLD_DECL and NEW_DECL are FUNCTION_DECL tree nodes 
+    of the original function and the new copied function
+    respectively.  This cloning supports function 
+    versioning utility.  */
+ void
+ tree_function_versioning (tree old_decl, tree new_decl) 
+ {
+   struct cgraph_node *old_version_node;
+   struct cgraph_node *new_version_node;
+   inline_data id;
+   tree p;
+   
+   if (TREE_CODE (old_decl) != FUNCTION_DECL
+       || TREE_CODE (new_decl) != FUNCTION_DECL)
+     abort ();
+  
+   old_version_node = cgraph_node (old_decl);
+   new_version_node = cgraph_node (new_decl);
+   
+   /* FIXME: For the moment debug information for the new
+      version is not supported.  */
+   DECL_ARTIFICIAL (new_decl) = 1;
+   DECL_IGNORED_P (new_decl) = 1;
+   
+   /* Generate a new name for the new version. */
+   DECL_NAME (new_decl) = 
+     create_function_name (IDENTIFIER_POINTER (DECL_NAME (old_decl)));
+   /* Create a new SYMBOL_REF rtx for the new name. */
+   if (DECL_RTL (old_decl) != NULL)
+     {
+       SET_DECL_RTL (new_decl, copy_rtx (DECL_RTL (old_decl)));
+       XEXP (DECL_RTL (new_decl), 0) =
+         gen_rtx_SYMBOL_REF (GET_MODE (XEXP (DECL_RTL (old_decl), 0)),
+                             IDENTIFIER_POINTER (DECL_NAME (new_decl)));
+     }
+   
+   /* Prepare the data-structures for
+      coping the tree of the new version. */
+   
+   memset (&id, 0, sizeof (id));
+   /* The new version. */
+   id.node = new_version_node;
+   /* The old version. */
+   id.current_node =  cgraph_node (old_decl);
+   id.versioning_p = true;
+   id.decl_map = splay_tree_new (splay_tree_compare_pointers, 
+                                 NULL, NULL);
+   id.tree_pruner = htab_create (37, htab_hash_pointer,
+ 				htab_eq_pointer, NULL);
+   
+   VARRAY_TREE_INIT (id.fns, 1, "fns");
+   VARRAY_PUSH_TREE (id.fns, new_decl);
+   VARRAY_PUSH_TREE (id.fns, old_decl);
+   
+   /* Copy the function's static chain. */
+   p = DECL_STRUCT_FUNCTION (old_decl)->static_chain_decl;
+   if (p)
+     {
+       DECL_STRUCT_FUNCTION (new_decl)->static_chain_decl =
+         copy_static_chain (DECL_STRUCT_FUNCTION (old_decl)->static_chain_decl, &id);
+     }  
+   
+   /* Copy the function's arguments. */
+   if (DECL_ARGUMENTS (old_decl) != NULL_TREE)
+     {
+       DECL_ARGUMENTS (new_decl) = 
+         copy_arguments_for_versioning (DECL_ARGUMENTS (old_decl), &id);
+     }
+   /* Copy the Function's body. */
+   DECL_SAVED_TREE (new_decl) =
+     version_body (DECL_SAVED_TREE (old_decl), &id);
+   if (DECL_RESULT (old_decl) != NULL_TREE)
+     {
+       tree *res_decl = &DECL_RESULT (old_decl);
+       DECL_RESULT (new_decl) = remap_decl (*res_decl, &id);
+       lang_hooks.dup_lang_specific_decl (DECL_RESULT (new_decl));
+     }
+   
+   /* Clean up.  */
+   htab_delete (id.tree_pruner);
+   splay_tree_delete (id.decl_map);
+   return;
+ }
Index: tree-inline.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/tree-inline.h,v
retrieving revision 1.4.2.6.4.1
diff -c -3 -p -r1.4.2.6.4.1 tree-inline.h
*** tree-inline.h	2 Aug 2004 23:12:02 -0000	1.4.2.6.4.1
--- tree-inline.h	5 Oct 2004 08:35:56 -0000
*************** void clone_body (tree, tree, void *);
*** 31,36 ****
--- 31,38 ----
  tree save_body (tree, tree *, tree *);
  void remap_save_expr (tree *, void *, int *);
  int estimate_num_insns (tree expr);
+ bool tree_versionable_function_p (tree);
+ void tree_function_versioning (tree, tree);
  
  /* 0 if we should not perform inlining.
     1 if we should expand functions calls inline at the tree level.
Index: cp/cp-objcp-common.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cp/cp-objcp-common.h,v
retrieving revision 1.2.8.3
diff -c -3 -p -r1.2.8.3 cp-objcp-common.h
*** cp/cp-objcp-common.h	25 Sep 2004 23:18:37 -0000	1.2.8.3
--- cp/cp-objcp-common.h	5 Oct 2004 08:35:57 -0000
*************** extern tree objcp_tsubst_copy_and_build 
*** 138,143 ****
--- 138,148 ----
  #undef LANG_HOOKS_CALLGRAPH_EXPAND_FUNCTION
  #define LANG_HOOKS_CALLGRAPH_EXPAND_FUNCTION expand_body
  
+ #undef LANG_HOOKS_TREE_VERSIONING_CANNOT_VERSION_TREE_FN
+ #define LANG_HOOKS_TREE_VERSIONING_CANNOT_VERSION_TREE_FN \
+    cp_cannot_version_tree_fn
+  
+ 
  #undef LANG_HOOKS_MAKE_TYPE
  #define LANG_HOOKS_MAKE_TYPE cxx_make_type
  #undef LANG_HOOKS_TYPE_FOR_MODE
Index: cp/cp-tree.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cp/cp-tree.h,v
retrieving revision 1.719.2.51.2.12
diff -c -3 -p -r1.719.2.51.2.12 cp-tree.h
*** cp/cp-tree.h	25 Sep 2004 23:18:37 -0000	1.719.2.51.2.12
--- cp/cp-tree.h	5 Oct 2004 08:35:58 -0000
*************** extern linkage_kind decl_linkage        
*** 4203,4208 ****
--- 4203,4209 ----
  extern tree cp_walk_subtrees (tree*, int*, walk_tree_fn,
  				      void*, void*);
  extern int cp_cannot_inline_tree_fn (tree*);
+ extern int cp_cannot_version_tree_fn (tree*);
  extern tree cp_add_pending_fn_decls (void*,tree);
  extern int cp_is_overload_p (tree);
  extern int cp_auto_var_in_fn_p (tree,tree);
Index: cp/tree.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cp/tree.c,v
retrieving revision 1.286.2.43.2.11
diff -c -3 -p -r1.286.2.43.2.11 tree.c
*** cp/tree.c	25 Sep 2004 23:18:41 -0000	1.286.2.43.2.11
--- cp/tree.c	5 Oct 2004 08:35:58 -0000
*************** cp_cannot_inline_tree_fn (tree* fnp)
*** 2057,2062 ****
--- 2057,2088 ----
    return 0;
  }
  
+ /* Decide whether there are language-specific reasons to not version a
+    function as a tree.  */
+ 
+ int 
+ cp_cannot_version_tree_fn (tree *fnp)
+ {
+   tree fn = *fnp;
+  
+   /* ??? Is there a way to version a
+      constructor?  */ 
+   if (DECL_COMPLETE_CONSTRUCTOR_P (fn))
+     return 1;
+   if (DECL_BASE_CONSTRUCTOR_P (fn))
+     return 1;
+   if (DECL_COMPLETE_DESTRUCTOR_P (fn))
+     return 1;
+   if (DECL_BASE_DESTRUCTOR_P (fn))
+     return 1;
+   if (DECL_OVERLOADED_OPERATOR_P (fn))
+     return 1;
+  
+   return 0;
+ }
+  
+ 
+ 
  /* Add any pending functions other than the current function (already
     handled by the caller), that thus cannot be inlined, to FNS_P, then
     return the latest function added to the array, PREV_FN.  */

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION)
  2004-10-10 15:38 [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION) Razya Ladelsky
@ 2004-10-10 17:10 ` Steven Bosscher
  2004-10-10 21:25   ` Jan Hubicka
  2004-10-12 14:18 ` [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION) Jan Hubicka
  1 sibling, 1 reply; 13+ messages in thread
From: Steven Bosscher @ 2004-10-10 17:10 UTC (permalink / raw)
  To: Razya Ladelsky, gcc-patches; +Cc: hubicka, Mircea Namolaru, Ayal Zaks

On Sunday 10 October 2004 17:20, Razya Ladelsky wrote:
> Comments are welcome.

[ The first part of this is turned into a bit of a rant, as I'm feeling
  really bad about where you are implementing these optimizations. IMHO
  the abstraction level is wrong, we are not taking things in the right
  direction...  Skip to /rant for comments on the actual patch ;-)  ]

I don't mean to say anything bad about your work, but why do you not
try to help implementing *proper* IPA, starting with a CFG inliner and
early global (ie. intraprocedural) optimizations *before* IPA, like
apparently most compilers do? 
To quote myself from "http://gcc.gnu.org/projects/tree-profiling.html":

==============================================================
(...) plans exist for reorganisation of the compiler passes as follows:

1. For each SCC in the call graph in reverse topological order
      do simple intraprocedural optimizations in SSA form
      (basically, run anything that is cheap and can not
      create overlapping live ranges, but will improve the
      results of the next step);
      collect intraprocedural information necessary for
      interprocedural data flow analyses;
      store the function's intermediate representation somewhere (*)
 
2. Perform interprocedural analyses and optimizations (at least 
   constant propagation, side-effects analysis, and function cloning
   and specialization).

3. For each SCC in the call graph in reverse topological order 
      do remaining intraprocedural optimizations;
      hand over the optimized function to the RTL expanders
      for finalization and code generation.
==============================================================

(*) Either in memory or in a data base.  And yes I *am* willing to
    open that discussion again if it turns out that we can do great
    things with IPA but storing in memory is too expennsive (d'oh!).

This means that when we start working on this (most likely before the
end of this year), we (at SUSE, IBM, or whereever) will have to rewrite
for the new IPA framework *everything* that you're now sort-of hackishly
implementing in the existing framework that clearly is unsuitable for
proper IPA.
The reason *why* almost everything would have to be redone is because
your implementation does not work on lowered GIMPLE, and also because
we'll need to clone function CFGs (ie. not function body trees) or lose
the CFG for cloned/inlined functions, and reconstruct them later on,
somehow (but that is not really an option IMO).

This is wasteful.  I would *much* rather have everyone agree on clean
IPA framework design, and work together on implementing that.

Don't get me wrong, it is really really cool that y'all want to have this
implemented in GCC.  So do I.  It is also great that you're experimenting
with these algorithms, demonstrating that it can be done for GCC.

But what is happening right now is making things only more difficult in
the long run.  We already have this lame excuse for IPA with the "Inter
Module" C front end hack that will have to be redone if GCC is ever going
to support real IPA. 
And IMHO we do not need more obfuscated pseudo-IPA/IPO implementations
like that; not now that we have the opportunity to do it properly with
the cgraph module and low-GIMPLE which is relatively easy to serialize
and dump to a data base.

If only these reasons mattered, I'd really rather not have this patch
in CVS.  Perhaps other people have a different opinion, of course ;-)

</rant>


Now, about the patch.
First off, you should probably just create a new file cgraph-clone.c or
something to put most of the new functions.  cgraph*.[ch] is already a
bit messy and we plan to clean it up.

Second, I'm still very curious about numbers.  What does this do to the
compile time?  Have you counted propagated constants and/or cloned
functions on SPEC or GCC itself (or some other benchmark)?

A few comments on the code itself:

> +       /* If the parametr's address is taken,
> +          it could be modified.  */
>         if( TREE_CODE (TREE_OPERAND (t, 0)) == PARM_DECL )
>           {
>   	  i = ipcp_method_tree_map (mt, TREE_OPERAND (t, 0));
>             ipcp_method_modify_set (mt, i, true);
>           }
>         break;

This is one of the reasons why I believe we should first work on some
framework to do early optimizations.  After the tree optimizers, we'll
have much better information about what parameters have their address
taken.
Also, funny you have a tab before "i =" and spaces almost everywhere
else.  I don't care about spaces vs. tabs much, but I believe the GCC
style requires tabs.  (I'd say, at least don't do both! ;-)


>   /* Perform reachability analysis and reclaim all unreachable nodes.
> !    If BEFORE_INLINING_P is true this function is called before inlining
> !    decisions has been made.  If BEFORE_INLINING_P is false this function also
> !    removes unneeded bodies of extern inline functions.  */
>   static bool
> ! cgraph_remove_unreachable_nodes (bool before_inlining_p)

You should really explain what the difference between BEFORE_INLINING_P
and !BEFORE_INLINING_P is: How does this function behave differently with
this flag set or not set?


> +   for (i=0; i < strlen (tmp_name); i++)
> +   {
> +      if (tmp_name[i] == '.')
> +            tmp_name[i] = '_';
> +   }

So you're going to call strlen for every loop iteration?



> --- integrate.c	5 Oct 2004 08:35:55 -0000
> *************** copy_decl_for_inlining (tree decl, tree
> *** 176,181 ****
> --- 176,248 ----
> 
>     return copy;
>   }
> +
> + /* Copy NODE (which must be a DECL).  The DECL originally was in the FROM_FN,
> +    but now it will be in the TO_FN. As apposed for to copy_decl_for_inlining ()
> +    function; we do not give a special treatment to parm_decl and result_decl.  */
> + tree
> + copy_decl_for_versioning (tree decl, tree from_fn, tree to_fn)

This function has no business in integrate.c, this file used to implement
the RTL inliner which is long dead and gone.  Do not put new code there.
This should go into the new file, or otherwise I guess tree-inline is a
more appropriate place.
Indeed, copy_decl_for_inlining should be moved out of this file also.


> +   if (TREE_CODE (copy) == LABEL_DECL)
> +     {
> +       TREE_ADDRESSABLE (copy) = 0;
> +     }

This is not GNU/GCC code style.


> ! 	    {
> ! 	      cvalue = ipcp_cval_get_cvalue (ipcp_method_cval (node, i));
> !               if (const_param == 0)
> !                 {
> !                   /* Compute how many callers node has.  */
> !                   node_callers = 0;
> !                   for (cs = node->callers; cs != NULL;
> !                        cs = cs->next_caller)
> !                     node_callers++;

The indent of "if" also is not quite GNU/GCC style conforming.
And would it be useful to cache the number of callers/callees in the
cgraph_node?  IIRC I've seen this kind of counting elsewhere, too (but
I'm not sure).


> +   /* ??? Remove the following gaurd.  */
> +   if (lang_hooks.tree_versioning.cannot_version_tree_fn (&fndecl))
> +     return false;

Typo: guard ;-)
And I actually like the idea that the language can say, "do not clone
this", why would you want to remove this?


> +   /* ??? Is there a way to version a
> +      constructor?  */

Good question.  I would hope so, I'd expect (but have no data to support
the suspicion, unfortunately) to see constants in constructors a lot, e.g.
booleans, number of items/slots/elements/etc...
Does anyone know if this has been studied and any data has been published?


> +   /* The new variable/label has no RTL, yet.  */
> +   if (!TREE_STATIC (copy) && !DECL_EXTERNAL (copy))
> +     SET_DECL_RTL (copy, NULL_RTX);

Hmm...  This somehow looks odd, looks like a remnant of the RTL inliner.
My understanding is that since we inline before expand, we should never
have RTL for local labels and variables when we get here.  Hence, this
check might very well be redundant (you could try replacing it with a
"gcc_assert (!DECL_RTL (copy))").


> + /* lang_hooks.tree_versioning.cannot_version_tree_fn is called to
> +    determine whether there are language-specific reasons for not
> +    creating a new version to a given function.  */

"for a given function".


> *************** ipcp_method_compute_modify (ipcp_method
> *** 743,749 ****
> 
>     ipcp_method_modify_init (mt);
>     decl = mt->decl;
> !   if (!tree_inlinable_function_p (decl))
>       {
>         for (j = 0; j < ipcp_method_formal_count (mt); j++)
>   	{
> --- 799,807 ----
> 
>     ipcp_method_modify_init (mt);
>     decl = mt->decl;
> !   /* ??? Handle pending sizes case. Set all parameters
> !      of the method to be modified.  */
> !   if (DECL_UNINLINABLE (decl))
>       {
>         for (j = 0; j < ipcp_method_formal_count (mt); j++)
>   	{

What is the reason for this change?  I cannot tell from the ChangeLog,
because you don't really *have* a ChangeLog entry for ipa_prop.c:

> 	* ipa_prop.c: Update IPA constant propagation.
> 	* ipa_prop.h: Same.

Please add a proper ChangeLog entry for all patches you post.  It is
required for good reasons, patch review without a ChangeLog entry is
very difficult sometimes.

It would generally be helpful if you give a short explanation for the
changes you make to existing files, especially for the places you've 
annotated with a FIXME, ???, XXX, or some other marker indicating that
the code as-is may not be perfect.

Thanks,

Gr.
Steven


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION)
  2004-10-10 17:10 ` Steven Bosscher
@ 2004-10-10 21:25   ` Jan Hubicka
  2004-10-10 23:00     ` IPA (was: Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION)) Steven Bosscher
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Hubicka @ 2004-10-10 21:25 UTC (permalink / raw)
  To: Steven Bosscher
  Cc: Razya Ladelsky, gcc-patches, hubicka, Mircea Namolaru, Ayal Zaks

> On Sunday 10 October 2004 17:20, Razya Ladelsky wrote:
> > Comments are welcome.
> 
> [ The first part of this is turned into a bit of a rant, as I'm feeling
>   really bad about where you are implementing these optimizations. IMHO
>   the abstraction level is wrong, we are not taking things in the right
>   direction...  Skip to /rant for comments on the actual patch ;-)  ]
> 
> I don't mean to say anything bad about your work, but why do you not
> try to help implementing *proper* IPA, starting with a CFG inliner and
> early global (ie. intraprocedural) optimizations *before* IPA, like
> apparently most compilers do? 
> To quote myself from "http://gcc.gnu.org/projects/tree-profiling.html":

Actually I don't think it is bad to have one nontrivial IPA optimization
while working on implementation of the IPA infrastructure (tought I
would not recomment put too much effort into implementing more
optimizations at this shape of infrastructure or pushing the quality
of analysis of high gimple in IPA too far.

If we do it right, most of IPA optimizers won't care much about the
actual underlying function representation.  There are number of issues
that needs to be resolved with the infrastructural changes and sometimes
it is good to give it a try in practice.  Lets try to get the clones
cgraph interface right now and we can clean up the actual function
clonning and gimple analysis once we are ready for this (I hope that
this will be relatively soon too)

> 
> ==============================================================
> (...) plans exist for reorganisation of the compiler passes as follows:
> 
> 1. For each SCC in the call graph in reverse topological order
>       do simple intraprocedural optimizations in SSA form
>       (basically, run anything that is cheap and can not
>       create overlapping live ranges, but will improve the
>       results of the next step);
>       collect intraprocedural information necessary for
>       interprocedural data flow analyses;
>       store the function's intermediate representation somewhere (*)
>  
> 2. Perform interprocedural analyses and optimizations (at least 
>    constant propagation, side-effects analysis, and function cloning
>    and specialization).
> 
> 3. For each SCC in the call graph in reverse topological order 
>       do remaining intraprocedural optimizations;
>       hand over the optimized function to the RTL expanders
>       for finalization and code generation.
> ==============================================================
> 
> (*) Either in memory or in a data base.  And yes I *am* willing to
>     open that discussion again if it turns out that we can do great
>     things with IPA but storing in memory is too expennsive (d'oh!).

Yep, we really need to resolve the issues concerning the link time
optimizations.

While I personally agree with the scheme you outlined (and also
described it in the GCC Summit paper), the requirement of not loading
whole program into memory at once (so relying only on global date while
doing IPA and performing the actual transformations later) actually
brings some stress on the implementation of IPA optimizers (ie
optimizers done after inlining needs to deal with the clones and it is
dificult to iterate optimizations like we commonly do for local
optimizations).  In fact Kenneth already suggested in private mail to
not use this approach and instead load everything in the memory at once
and do kind of iteration on the IPA.  I would like to hear some opinions
here.

It is quite interesting for me to see how the cprop and Kenneth's alias
analysis are impleemnted and see how much dificulty it is to make it fit
this scheme.
> 
> This means that when we start working on this (most likely before the
> end of this year), we (at SUSE, IBM, or whereever) will have to rewrite
> for the new IPA framework *everything* that you're now sort-of hackishly
> implementing in the existing framework that clearly is unsuitable for
> proper IPA.
> The reason *why* almost everything would have to be redone is because
> your implementation does not work on lowered GIMPLE, and also because
> we'll need to clone function CFGs (ie. not function body trees) or lose
> the CFG for cloned/inlined functions, and reconstruct them later on,
> somehow (but that is not really an option IMO).

Hopefully with the new CFG inliner implementation we will get cloning
functions with CFG needed for cprop for free too (in fact following the
approach of implementing IPA optimizations at once after propagation
pass, it might be nice to make the changes while inlining the function)

Honza

^ permalink raw reply	[flat|nested] 13+ messages in thread

* IPA (was: Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION))
  2004-10-10 21:25   ` Jan Hubicka
@ 2004-10-10 23:00     ` Steven Bosscher
  2004-10-10 23:34       ` Jan Hubicka
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Bosscher @ 2004-10-10 23:00 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Razya Ladelsky, gcc-patches, hubicka, Mircea Namolaru, Ayal Zaks,
	Kenneth Zadeck

On Sunday 10 October 2004 22:59, Jan Hubicka wrote:
> While I personally agree with the scheme you outlined (and also
> described it in the GCC Summit paper), the requirement of not loading
> whole program into memory at once (so relying only on global date while
> doing IPA and performing the actual transformations later) actually
> brings some stress on the implementation of IPA optimizers (ie
> optimizers done after inlining needs to deal with the clones and it is
> dificult to iterate optimizations like we commonly do for local
> optimizations).

In my mental model, inlining and cloning would be one of the
last optimizations to do on the call graph.  Wouldn't it be a
simple matter of adding the clones to the call graph and moving
on as if it is a normal compilation?  At least you would not
actually do any IPA on clones and functions after inlining.
That would  require re-analyzing the function to get the global
data right, wouldn't it?



>  In fact Kenneth already suggested in private mail to
> not use this approach and instead load everything in the memory at once
> and do kind of iteration on the IPA.

Can you show some pseudo-algorithm of how this would work?


>  I would like to hear some opinions
> here.

Hmm, obviously it depends on how much we can improve on memory
efficiency, there is obvioulsy a lot of room for improvements
there if we could move to saner data structures.  Everyone knows
we're a *little* memory wasteful right now ;-)

But even then, I wonder if you could take the source of a very
large code base (say, GCC itself, or the linux kernel), and do
link time whole program optimizations on it if you have to keep
everything in memory.

And iterating (or perhaps worklist based?) IPA doesn't sound very
attractive either.  I suppose you could offer it as an option at
some very high optimization level, but does it really pay off in
general to do IPA so aggressively?  Is the benefit large enough to
justify a complicated and probably compile time expensive framework
like that?

In any case, we'll be stuck on doing everything in-memory at first
anyway, so we'll have the opportunity to experiment a bit, and that
will be interesting.  Of course, making pre-inlining optimizations
possible is the most important thing for now...

Gr.
Steven




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: IPA (was: Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION))
  2004-10-10 23:00     ` IPA (was: Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION)) Steven Bosscher
@ 2004-10-10 23:34       ` Jan Hubicka
  2004-10-12 14:22         ` IPA Kenneth Zadeck
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Hubicka @ 2004-10-10 23:34 UTC (permalink / raw)
  To: Steven Bosscher
  Cc: Jan Hubicka, Razya Ladelsky, gcc-patches, hubicka,
	Mircea Namolaru, Ayal Zaks, Kenneth Zadeck

> On Sunday 10 October 2004 22:59, Jan Hubicka wrote:
> > While I personally agree with the scheme you outlined (and also
> > described it in the GCC Summit paper), the requirement of not loading
> > whole program into memory at once (so relying only on global date while
> > doing IPA and performing the actual transformations later) actually
> > brings some stress on the implementation of IPA optimizers (ie
> > optimizers done after inlining needs to deal with the clones and it is
> > dificult to iterate optimizations like we commonly do for local
> > optimizations).
> 
> In my mental model, inlining and cloning would be one of the
> last optimizations to do on the call graph.  Wouldn't it be a
> simple matter of adding the clones to the call graph and moving
> on as if it is a normal compilation?  At least you would not
> actually do any IPA on clones and functions after inlining.
> That would  require re-analyzing the function to get the global
> data right, wouldn't it?

The idea is that inlining (as any code specialization transformation) makes analysis
more precise (ie once you constant propagate operand into function
argument it might become very trivial), so one approach how to cope with
this is to simply re-throw the expanded CFG after inlining to the
analysers again to gain extra precision.

One posibility is to update the local information once you decide to do
the cloning (ie to decide in informed way that function X is good to
clone when it's argument is constant, you need to know anyway that it
will simplify after you do so)
> 
> 
> 
> >  In fact Kenneth already suggested in private mail to
> > not use this approach and instead load everything in the memory at once
> > and do kind of iteration on the IPA.
> 
> Can you show some pseudo-algorithm of how this would work?

I believe Kenny's idea is to make early optimization, IPA,
clonning/inlining, IPA again on already inlined and compiled functions
and finally compile the function.
> 
> 
> >  I would like to hear some opinions
> > here.
> 
> Hmm, obviously it depends on how much we can improve on memory
> efficiency, there is obvioulsy a lot of room for improvements
> there if we could move to saner data structures.  Everyone knows
> we're a *little* memory wasteful right now ;-)

My major concern is that even if we are not little wastefull, we will
have scalability problems for very large compilation units that are
going to be more common..
> 
> But even then, I wonder if you could take the source of a very
> large code base (say, GCC itself, or the linux kernel), and do
> link time whole program optimizations on it if you have to keep
> everything in memory.

GCC or Linux is still little.  Think of Open Office in 10 years ;)
> 
> And iterating (or perhaps worklist based?) IPA doesn't sound very
> attractive either.  I suppose you could offer it as an option at
> some very high optimization level, but does it really pay off in
> general to do IPA so aggressively?  Is the benefit large enough to
> justify a complicated and probably compile time expensive framework
> like that?

worklist is probably not good idea.  I had in mind simply re-running the
passes, iteration was probably not best way to call it ;)

Honza
> 
> In any case, we'll be stuck on doing everything in-memory at first
> anyway, so we'll have the opportunity to experiment a bit, and that
> will be interesting.  Of course, making pre-inlining optimizations
> possible is the most important thing for now...
> 
> Gr.
> Steven
> 
> 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension  (RESUBMISSION)
  2004-10-10 15:38 [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION) Razya Ladelsky
  2004-10-10 17:10 ` Steven Bosscher
@ 2004-10-12 14:18 ` Jan Hubicka
  1 sibling, 0 replies; 13+ messages in thread
From: Jan Hubicka @ 2004-10-12 14:18 UTC (permalink / raw)
  To: Razya Ladelsky; +Cc: gcc-patches, stevenb, hubicka, Mircea Namolaru, Ayal Zaks

> Hello,
> 
> Attached is the patch modified according to the suggestions
> we received.
> 
> This patch implements function cloning.
> It also contains extension to IPCP (Interprocedural constant propagation) 
> optimization http://gcc.gnu.org/ml/gcc-patches/2004-08/msg00998.html for 
> using function cloning.
> 
> Now in order to enable IPCP for a given application, we should use the 
> command
> gcc -fipa-cp -combine file1.c file2.c file3.c 
> where file1.c, file2.c, file3.c are the files of the application.
> 
> Bootstrapped and regression tests successfully passed on POWER4.
> 
> Comments are welcome.

Hi,
The cgraph bits looks almost fine to me now (modulo minor problems),
but I still have few questions ;)

Index: cgraph.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cgraph.c,v
retrieving revision 1.4.4.18.2.8
diff -c -3 -p -r1.4.4.18.2.8 cgraph.c
*** cgraph.c	25 Sep 2004 23:17:32 -0000	1.4.4.18.2.8
--- cgraph.c	5 Oct 2004 08:35:54 -0000
*************** cgraph_unnest_node (struct cgraph_node *
*** 706,709 ****
--- 706,737 ----
    *node2 = node->next_nested;
    node->origin = NULL;
  }
+ 
+ /* Copy collees of node FROM to node TO.  */
+ void
+ cgraph_copy_node_callees (struct cgraph_node *to,
+                           struct cgraph_node *from)
+ {
+   struct cgraph_edge *e;
+   
+   for (e = from->callees;e; e=e->next_callee)
+     cgraph_clone_edge (e, to, e->call_expr);
+   
+ }
+ 
+ 
+ void
+ cgraph_copy_local_info (struct cgraph_node *new_node, 
+ 			struct cgraph_node *orig_node)
+ {
+   new_node->local.self_insns = orig_node->local.self_insns;
+   new_node->local.local = orig_node->local.local;
+   new_node->local.finalized = orig_node->local.finalized;
+   new_node->local.disregard_inline_limits = 
+     orig_node->local.disregard_inline_limits;
+   new_node->local.inlinable = orig_node->local.inlinable;
+   new_node->local.redefined_extern_inline = 
+     orig_node->local.redefined_extern_inline;
+ }

The function lacks comment.
Why new_node->local = old_node->local won't work?  This is how the info
is copied in the cgraph_clone_node.

If you want to have special function for this, it is OK with me, but
please make it used by the clone_node function (same for
cgraph_copy_node_callees)

+ /* Get first callee of node.  */
+ #define FIRST_CALLEE(node)               ((node)->callees)
+ /* Get next callee from edge.  */
+ #define NEXT_CALLEE(edge)                ((edge)->next_callee)
+ /* Returns whether edge is last callee.  */
+ #define CALLEE_NOT_LAST(edge)            ((edge) != NULL)
+ 

Similarly if you want to have the macros, please make them used
consistently.  I tend to preffer node->callees instead of FIRST_CALLEE
tought, but since we have both schemes in GCC already, I am fine.
The CALLEE_NOT_LAST should be probably CALLEE_NOT_LAST_P as it is
predicate and I would preffer it to be CALLEE_LAST_P to avoid double
negatives.

+ struct cgraph_node *
+ cgraph_copy_node_for_versioning (struct cgraph_node *old_version,
+  				 tree new_decl, varray_type redirect_callers)
+ {
+   struct cgraph_node *new_version;
+   struct cgraph_edge *e;
+   struct cgraph_edge *next_callee;
+   unsigned i;
+   
+   if (old_version == NULL)
+     abort ();
+   
+   new_version = cgraph_node (new_decl);
+   
+   new_version->needed = false; 
+   new_version->decl = new_decl;

cgraph_node should make this job for you.  Perhaps adding "new_decl"
argument to cgraph_clone_node would kill most of the redundancy with
this function?  (Ie you will need to call cgraph_clone_node and 
then fixup the recursive edges).  When you do have recursion, why you
want the recursive call to always point to new clone (I would guess that
the function can actually change the value of operand)..
Index: cp/tree.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/cp/tree.c,v
retrieving revision 1.286.2.43.2.11
diff -c -3 -p -r1.286.2.43.2.11 tree.c
*** cp/tree.c	25 Sep 2004 23:18:41 -0000	1.286.2.43.2.11
--- cp/tree.c	5 Oct 2004 08:35:58 -0000
*************** cp_cannot_inline_tree_fn (tree* fnp)
*** 2057,2062 ****
--- 2057,2088 ----
    return 0;
  }
  
+ /* Decide whether there are language-specific reasons to not version a
+    function as a tree.  */
+ 
+ int 
+ cp_cannot_version_tree_fn (tree *fnp)
+ {
+   tree fn = *fnp;
+  
+   /* ??? Is there a way to version a
+      constructor?  */ 
+   if (DECL_COMPLETE_CONSTRUCTOR_P (fn))
+     return 1;
+   if (DECL_BASE_CONSTRUCTOR_P (fn))
+     return 1;
+   if (DECL_COMPLETE_DESTRUCTOR_P (fn))
+     return 1;
+   if (DECL_BASE_DESTRUCTOR_P (fn))
+     return 1;
+   if (DECL_OVERLOADED_OPERATOR_P (fn))
+     return 1;

Why this is needed?
(ie when you have already lowered form of the program you should either
see the low level information why clonning is not possible or you should
be able to clone it).

Otherwise the patch looks almost OK to me (with the incremental plan to
work on CFG based inlining/cloning next :)
Please also incorporate the Steven's comments (especially one about
moving the function out of integrate.c - that file should die soon). I
think it is resonable to place new functions in cgraph/cgraphunit.c as
they are pretty trivial now and the main infrastructure should have
friendly way to clone the function)

Also, please, prepare small testsuite catching the cases you can deal
with now.  Since we lack the IPA directory in testsuite, it would be
nice to add one. See tree-ssa directory on how to parse the dump files,
in fact copying around it's .exp file should (I hope) mostly work for
us.

Like Steven, I would be also interested to hear about the results in
some benchmarks (at least absolute numbers about how much clonning you
do and how much arguemnts you find to be constant).

Thank you,
Honza

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: IPA
  2004-10-10 23:34       ` Jan Hubicka
@ 2004-10-12 14:22         ` Kenneth Zadeck
  2004-10-12 14:41           ` IPA Jan Hubicka
  0 siblings, 1 reply; 13+ messages in thread
From: Kenneth Zadeck @ 2004-10-12 14:22 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Steven Bosscher, Razya Ladelsky, gcc-patches, hubicka,
	Mircea Namolaru, Ayal Zaks, Berlin, Daniel, Novillo, Diego



Jan Hubicka wrote:

>>On Sunday 10 October 2004 22:59, Jan Hubicka wrote:
>>    
>>
>>>While I personally agree with the scheme you outlined (and also
>>>described it in the GCC Summit paper), the requirement of not loading
>>>whole program into memory at once (so relying only on global date while
>>>doing IPA and performing the actual transformations later) actually
>>>brings some stress on the implementation of IPA optimizers (ie
>>>optimizers done after inlining needs to deal with the clones and it is
>>>dificult to iterate optimizations like we commonly do for local
>>>optimizations).
>>>      
>>>
>>In my mental model, inlining and cloning would be one of the
>>last optimizations to do on the call graph.  Wouldn't it be a
>>simple matter of adding the clones to the call graph and moving
>>on as if it is a normal compilation?  At least you would not
>>actually do any IPA on clones and functions after inlining.
>>That would  require re-analyzing the function to get the global
>>data right, wouldn't it?
>>    
>>
>
>The idea is that inlining (as any code specialization transformation) makes analysis
>more precise (ie once you constant propagate operand into function
>argument it might become very trivial), so one approach how to cope with
>this is to simply re-throw the expanded CFG after inlining to the
>analysers again to gain extra precision.
>
>One posibility is to update the local information once you decide to do
>the cloning (ie to decide in informed way that function X is good to
>clone when it's argument is constant, you need to know anyway that it
>will simplify after you do so)
>  
>
>>    
>>
I had a lot of problems getting this to work.  I think that you need to 
either work on the code after inlining or before cloning.  The patch 
that I posted and am still waiting for comments introduces the concept 
of a "master clone", one unique clone among all of the clones for a 
given procedure where you can hang the info off of.  My propagation also 
gloms all of the edges for all of the clones into one bundle so you 
think you are working on the graph before any cloning has occurred.

The problem with the current cgraphunit is that there is no time when 
all of the inlining has been done because once on procedure has 
everything inlined into it, it is immediately compiled.  This problem 
has been solved on the profiling and struct branches and there is a plan 
to get this back into main line after the freeze is over.

>>    
>>
>>> In fact Kenneth already suggested in private mail to
>>>not use this approach and instead load everything in the memory at once
>>>and do kind of iteration on the IPA.
>>>      
>>>
>>Can you show some pseudo-algorithm of how this would work?
>>    
>>
>
>I believe Kenny's idea is to make early optimization, IPA,
>clonning/inlining, IPA again on already inlined and compiled functions
>and finally compile the function.
>  
>
>>    
>>
My plan should really be called the Berlin, Novillo, Zadeck plan.  I 
outlined this to Jan.  It is a good one for single files, it is not the 
right one for "whole programs". 

In this plan, modest "whole compilation unit analysis is first performed 
before inlining is done.  Next, each procedure is inlined and then 
compiled down past the point where ssa form is built to the point where 
some of the reducing optimizations are performed (i.e. constant prop, 
dead code, value numbering).  These optimizations are going extract most 
of the benefit from the inlining and may delete enough code to really 
effect the interprocedural info.  Then we apply our best aliasing 
algorithm and over the entire compilation unit and then restart the 
single procedural compilations with than information. 

This will be too bulky for the whole program plan, but is really doable 
in the framework of todays build environments and it should get us real 
improvements over what we have today.

I am a big advocate of the whole world at link time plan but I 
understand the political problems with this.


>>> I would like to hear some opinions
>>>here.
>>>      
>>>
>>Hmm, obviously it depends on how much we can improve on memory
>>efficiency, there is obvioulsy a lot of room for improvements
>>there if we could move to saner data structures.  Everyone knows
>>we're a *little* memory wasteful right now ;-)
>>    
>>
>
>My major concern is that even if we are not little wastefull, we will
>have scalability problems for very large compilation units that are
>going to be more common..
>  
>
>>But even then, I wonder if you could take the source of a very
>>large code base (say, GCC itself, or the linux kernel), and do
>>link time whole program optimizations on it if you have to keep
>>everything in memory.
>>    
>>
>
>GCC or Linux is still little.  Think of Open Office in 10 years ;)
>  
>
>>And iterating (or perhaps worklist based?) IPA doesn't sound very
>>attractive either.  I suppose you could offer it as an option at
>>some very high optimization level, but does it really pay off in
>>general to do IPA so aggressively?  Is the benefit large enough to
>>justify a complicated and probably compile time expensive framework
>>like that?
>>    
>>
>
>worklist is probably not good idea.  I had in mind simply re-running the
>passes, iteration was probably not best way to call it ;)
>
>Honza
>  
>
>>In any case, we'll be stuck on doing everything in-memory at first
>>anyway, so we'll have the opportunity to experiment a bit, and that
>>will be interesting.  Of course, making pre-inlining optimizations
>>possible is the most important thing for now...
>>
>>Gr.
>>Steven
>>
>>
>>
>>    
>>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: IPA
  2004-10-12 14:22         ` IPA Kenneth Zadeck
@ 2004-10-12 14:41           ` Jan Hubicka
       [not found]             ` <OF392747EC.15519132-ONC2256F2D.00576111-C2256F2D.005769AD@il.ibm.com>
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Hubicka @ 2004-10-12 14:41 UTC (permalink / raw)
  To: Kenneth Zadeck
  Cc: Jan Hubicka, Steven Bosscher, Razya Ladelsky, gcc-patches,
	hubicka, Mircea Namolaru, Ayal Zaks, Berlin, Daniel, Novillo,
	Diego

> 
> 
> Jan Hubicka wrote:
> 
> >>On Sunday 10 October 2004 22:59, Jan Hubicka wrote:
> >>   
> >>
> >>>While I personally agree with the scheme you outlined (and also
> >>>described it in the GCC Summit paper), the requirement of not loading
> >>>whole program into memory at once (so relying only on global date while
> >>>doing IPA and performing the actual transformations later) actually
> >>>brings some stress on the implementation of IPA optimizers (ie
> >>>optimizers done after inlining needs to deal with the clones and it is
> >>>dificult to iterate optimizations like we commonly do for local
> >>>optimizations).
> >>>     
> >>>
> >>In my mental model, inlining and cloning would be one of the
> >>last optimizations to do on the call graph.  Wouldn't it be a
> >>simple matter of adding the clones to the call graph and moving
> >>on as if it is a normal compilation?  At least you would not
> >>actually do any IPA on clones and functions after inlining.
> >>That would  require re-analyzing the function to get the global
> >>data right, wouldn't it?
> >>   
> >>
> >
> >The idea is that inlining (as any code specialization transformation) 
> >makes analysis
> >more precise (ie once you constant propagate operand into function
> >argument it might become very trivial), so one approach how to cope with
> >this is to simply re-throw the expanded CFG after inlining to the
> >analysers again to gain extra precision.
> >
> >One posibility is to update the local information once you decide to do
> >the cloning (ie to decide in informed way that function X is good to
> >clone when it's argument is constant, you need to know anyway that it
> >will simplify after you do so)
> > 
> >
> >>   
> >>
> I had a lot of problems getting this to work.  I think that you need to 
> either work on the code after inlining or before cloning.  The patch 
> that I posted and am still waiting for comments introduces the concept 
> of a "master clone", one unique clone among all of the clones for a 
> given procedure where you can hang the info off of.  My propagation also 
> gloms all of the edges for all of the clones into one bundle so you 
> think you are working on the graph before any cloning has occurred.
> 
> The problem with the current cgraphunit is that there is no time when 
> all of the inlining has been done because once on procedure has 
> everything inlined into it, it is immediately compiled.  This problem 
> has been solved on the profiling and struct branches and there is a plan 
> to get this back into main line after the freeze is over.

I am not quite sure about the struct branch, but with tree profiling,
there is still inlining done just before compiling the function.  The
difference is only in a fact that you will have the CFG built in the
early stages of compilation, so each function body will be already in
low gimple and after early optimizations once the IPA analysis takes
place.  (so for instance the inliner can already use the profile read
back to compiler or built via guessing)

Concerning the IPA analysis, it seems to me that the presence of clones
should not be too major pain.  I have plan to implement pass manager
that will have three hooks for each optimizatoin.  One will be called
for each function body once the function is finalized supposed to
compute local properties, later you will get called to compute the
global info during the cgraph_optimize and then you will get called for
each body to perform the changes)

If you don't get any benefits from inlining, you can register yourself
before the inliner and you won't see the clones in global phasse.  If
you get some more info from expanded callgraph, you register yourself
after the phasse and you should be able to propagate the info across the
cloned nodes as if you were already analysing the function body after
inlining.

Ie in most cases we will have the local info shared across the clonnes
and you we locally analyze before the inlining decisions takes place,
while global info might be different for each clone, but I guess I have
to think more about this.
> 
> >>   
> >>
> >>>In fact Kenneth already suggested in private mail to
> >>>not use this approach and instead load everything in the memory at once
> >>>and do kind of iteration on the IPA.
> >>>     
> >>>
> >>Can you show some pseudo-algorithm of how this would work?
> >>   
> >>
> >
> >I believe Kenny's idea is to make early optimization, IPA,
> >clonning/inlining, IPA again on already inlined and compiled functions
> >and finally compile the function.
> > 
> >
> >>   
> >>
> My plan should really be called the Berlin, Novillo, Zadeck plan.  I 
> outlined this to Jan.  It is a good one for single files, it is not the 
> right one for "whole programs". 
> 
> In this plan, modest "whole compilation unit analysis is first performed 
> before inlining is done.  Next, each procedure is inlined and then 
> compiled down past the point where ssa form is built to the point where 
> some of the reducing optimizations are performed (i.e. constant prop, 
> dead code, value numbering).  These optimizations are going extract most 
> of the benefit from the inlining and may delete enough code to really 
> effect the interprocedural info.  Then we apply our best aliasing 
> algorithm and over the entire compilation unit and then restart the 
> single procedural compilations with than information. 
> 
> This will be too bulky for the whole program plan, but is really doable 
> in the framework of todays build environments and it should get us real 
> improvements over what we have today.
> 
> I am a big advocate of the whole world at link time plan but I 
> understand the political problems with this.

I am trying to keep organization of the current GCC implementation of
IPA optimizers in a shape so they will scale up to the link time
optimization approach and I really hope that sooner or later we will
find way how to avoid the political issues with this scheme. 

Honza

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: IPA
       [not found]             ` <OF392747EC.15519132-ONC2256F2D.00576111-C2256F2D.005769AD@il.ibm.com>
@ 2004-10-14 16:44               ` Jan Hubicka
  2004-10-14 17:22               ` IPA Kenneth Zadeck
       [not found]               ` <416EADEA.2030406@naturalbridge.com>
  2 siblings, 0 replies; 13+ messages in thread
From: Jan Hubicka @ 2004-10-14 16:44 UTC (permalink / raw)
  To: Razya Ladelsky
  Cc: Jan Hubicka, Ayal Zaks, Berlin, Daniel, Novillo, Diego,
	gcc-patches, hubicka, Mircea Namolaru, Steven Bosscher,
	Kenneth Zadeck

> Jan Hubicka <jh@suse.cz> wrote on 12/10/2004 16:31:57:
> 
> 
> > > 
> > > I am a big advocate of the whole world at link time plan but I 
> > > understand the political problems with this.
> > 
> > I am trying to keep organization of the current GCC implementation of
> > IPA optimizers in a shape so they will scale up to the link time
> > optimization approach and I really hope that sooner or later we will
> > find way how to avoid the political issues with this scheme. 
> > 
> > Honza
> 
> IPA provides valuable information to help subsequent optimizations,
> so it must be followed by intraprocedural optimizations. How does this fit 
> with link time ?

By link-time I reffer to scheme where all the source files are compiled
in the time linker is usually invoked.  Either via storing gimple
functions + cgraphs into object files later passed to compiler again via
modified linker or by other tricks.

This is main source of my concerns about organizing IPA a way so all
functions are manipulated at once (and thus must be present in memory at
once) - it would be nice to distribute the work a way so the time
functions are traidtionally compiled we do as much work as possible
(including gathering local info from IPA) and at the time linking is
done do the actual IPA propagation and then effectivly load the function
bodies as needed, optimizing and producing the assembly.

Honza
> 
> Razya

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: IPA
       [not found]             ` <OF392747EC.15519132-ONC2256F2D.00576111-C2256F2D.005769AD@il.ibm.com>
  2004-10-14 16:44               ` IPA Jan Hubicka
@ 2004-10-14 17:22               ` Kenneth Zadeck
       [not found]               ` <416EADEA.2030406@naturalbridge.com>
  2 siblings, 0 replies; 13+ messages in thread
From: Kenneth Zadeck @ 2004-10-14 17:22 UTC (permalink / raw)
  To: Razya Ladelsky
  Cc: Jan Hubicka, Ayal Zaks, Berlin, Daniel, Novillo, Diego,
	gcc-patches, hubicka, Mircea Namolaru, Steven Bosscher

If you look at most other optimizing compilers (i.e. sun, ibm, llvm), 
they have enhanced .o formats that contain the intermediate code that 
can then be further optimized.  This solves a huge number of legistical 
problems, such as getting the user to change their entire build 
procedures and makefiles.  All you have to do is add a few options onto 
the link command.  It also means that if a single file changes, all you 
do is recompile the one file and do a (notably more expensive) link. 

The other benefit is that since a fair amount of the analysis is done 
locally before you do the intermodular stuff, it is generally faster to 
rebuild after small changes to a single file.

However, this violates one of the stallman principals of not wanting to 
have a defined api for the intermediate form of the compiler.  So we are 
forced to do something that is inferior and that, most likely, will not 
be widely used.

Kenny

Razya Ladelsky wrote:

>Jan Hubicka <jh@suse.cz> wrote on 12/10/2004 16:31:57:
>
>
>  
>
>>>I am a big advocate of the whole world at link time plan but I 
>>>understand the political problems with this.
>>>      
>>>
>>I am trying to keep organization of the current GCC implementation of
>>IPA optimizers in a shape so they will scale up to the link time
>>optimization approach and I really hope that sooner or later we will
>>find way how to avoid the political issues with this scheme. 
>>
>>Honza
>>    
>>
>
>IPA provides valuable information to help subsequent optimizations,
>so it must be followed by intraprocedural optimizations. How does this fit 
>with link time ?
>
>Razya
>  
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: IPA
       [not found]               ` <416EADEA.2030406@naturalbridge.com>
@ 2004-10-14 17:24                 ` Steven Bosscher
  2004-10-14 21:25                   ` IPA Mark Mitchell
  2004-10-15  3:39                 ` IPA Daniel Berlin
  1 sibling, 1 reply; 13+ messages in thread
From: Steven Bosscher @ 2004-10-14 17:24 UTC (permalink / raw)
  To: Kenneth Zadeck, Razya Ladelsky
  Cc: Jan Hubicka, Ayal Zaks, Berlin, Daniel, Novillo, Diego,
	gcc-patches, hubicka, Mircea Namolaru

On Thursday 14 October 2004 18:48, Kenneth Zadeck wrote:
>  However, this violates one of the stallman principals of not wanting to
> have a defined api for the intermediate form of the compiler.  So we are
> forced to do something that is inferior and that, most likely, will not be
> widely used.

RMS has been convinced to step off his principles before (see gcc
vs. egcsfor example), so we don't have to declare defeat before
trying.

IMO we should just work as-if we can have a streamable intermediate
form, and show that there is serious benefit for GCC and for Free
Software in general to have this feature.  We should gather and put
up the numbers, and convince RMS that it's a Good Thing.

If that doesn't convince him, we'll have to think about other ways.
But until then, let's be optimistic and not let politics influence
our designs too much.

Gr.
Steven


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: IPA
  2004-10-14 17:24                 ` IPA Steven Bosscher
@ 2004-10-14 21:25                   ` Mark Mitchell
  0 siblings, 0 replies; 13+ messages in thread
From: Mark Mitchell @ 2004-10-14 21:25 UTC (permalink / raw)
  To: Steven Bosscher
  Cc: Kenneth Zadeck, Razya Ladelsky, Jan Hubicka, Ayal Zaks, Berlin,
	Daniel, Novillo, Diego, gcc-patches, hubicka, Mircea Namolaru

Steven Bosscher wrote:

>On Thursday 14 October 2004 18:48, Kenneth Zadeck wrote:
>  
>
>> However, this violates one of the stallman principals of not wanting to
>>have a defined api for the intermediate form of the compiler.  So we are
>>forced to do something that is inferior and that, most likely, will not be
>>widely used.
>>    
>>
>
>RMS has been convinced to step off his principles before (see gcc
>vs. egcsfor example), so we don't have to declare defeat before
>trying.
>
>IMO we should just work as-if we can have a streamable intermediate
>form, and show that there is serious benefit for GCC and for Free
>Software in general to have this feature.  We should gather and put
>up the numbers, and convince RMS that it's a Good Thing.
>  
>
Yes.

I have said publicly for some time now that I think that RMS is mistaken 
in this particular respect, and that a read/write interface to an 
intermediate form is an essential part of GCC's future.  Once we build 
it and demonstrate that it is useful, I am confident that it will become 
part of GCC because I do not think  RMS will not want to see GCC cede 
ground to other compilers.

-- 
Mark Mitchell
CodeSourcery, LLC
(916) 791-8304
mark@codesourcery.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: IPA
       [not found]               ` <416EADEA.2030406@naturalbridge.com>
  2004-10-14 17:24                 ` IPA Steven Bosscher
@ 2004-10-15  3:39                 ` Daniel Berlin
  1 sibling, 0 replies; 13+ messages in thread
From: Daniel Berlin @ 2004-10-15  3:39 UTC (permalink / raw)
  To: Kenneth Zadeck
  Cc: Razya Ladelsky, Jan Hubicka, Ayal Zaks, Novillo, Diego,
	gcc-patches, hubicka, Mircea Namolaru, Steven Bosscher

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1235 bytes --]



On Thu, 14 Oct 2004, Kenneth Zadeck wrote:

> If you look at most other optimizing compilers (i.e. sun, ibm, llvm),
> they have enhanced .o formats that contain the intermediate code that can
> then be further optimized.  This solves a huge number of legistical
> problems, such as getting the user to change their entire build
> procedures and makefiles.  All you have to do is add a few options onto
> the link command.  It also means that if a single file changes, all you
> do is recompile the one file and do a (notably more expensive) link. 
> 
> The other benefit is that since a fair amount of the analysis is done
> locally before you do the intermodular stuff, it is generally faster to
> rebuild after small changes to a single file.
> 
> However, this violates one of the stallman principals of not wanting to
> have a defined api for the intermediate form of the compiler.  So we are
> forced to do something that is inferior and that, most likely, will not
> be widely used.

Except that when it becomse necsesary to do this, we'll simply do it, and 
stallman has said that would be okay.
Or so i'm told.
He just doesn't want us doing it unless it absoultely necessary to keep 
gcc competitive.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2004-10-15  3:07 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-10-10 15:38 [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION) Razya Ladelsky
2004-10-10 17:10 ` Steven Bosscher
2004-10-10 21:25   ` Jan Hubicka
2004-10-10 23:00     ` IPA (was: Re: [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION)) Steven Bosscher
2004-10-10 23:34       ` Jan Hubicka
2004-10-12 14:22         ` IPA Kenneth Zadeck
2004-10-12 14:41           ` IPA Jan Hubicka
     [not found]             ` <OF392747EC.15519132-ONC2256F2D.00576111-C2256F2D.005769AD@il.ibm.com>
2004-10-14 16:44               ` IPA Jan Hubicka
2004-10-14 17:22               ` IPA Kenneth Zadeck
     [not found]               ` <416EADEA.2030406@naturalbridge.com>
2004-10-14 17:24                 ` IPA Steven Bosscher
2004-10-14 21:25                   ` IPA Mark Mitchell
2004-10-15  3:39                 ` IPA Daniel Berlin
2004-10-12 14:18 ` [tree-profiling-branch PATCH] Function cloning + IPCP extension (RESUBMISSION) Jan Hubicka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).