public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [patch] Speed up ivopts
@ 2004-11-06 21:22 Zdenek Dvorak
  2004-11-08 19:20 ` Richard Henderson
  0 siblings, 1 reply; 4+ messages in thread
From: Zdenek Dvorak @ 2004-11-06 21:22 UTC (permalink / raw)
  To: gcc-patches

Hello,

on the testcase for PR 17549, ivopts are taking 25% of compile time.
This patch contains several improvements for this problem; with them, ivopts
take some 5% of compile time (spent mostly in scev analysis).

1) To find next bit in bitmap element we scan the bits sequentially.
   For sparse bitmaps this is quite slow.  The patch uses faster
   mechanism (basically the generic C implementation of
   count_trailing_zeros from longlong.h.  Using directly the
   function from longlong.h or at least the ffs builtin would be better
   since it would make it possible to get advantage from specific
   purpose machine instructions, but it would probably require some playing
   with the configure, which I am not very familiar with).

2) On all places where we use bitmap of candidates related to a use
   (i.e. those derived directly from it and thus the ones likely to
   be useful for optimizing it), we also consider the common "important"
   candidates.  Rather than always doing the union of the bitmaps
   again and again, it is easier and faster to just always include
   important candidates to all related_cands bitmaps (there are usually
   only few important candidates, so the memory consumption is basically
   unaffected).

3) This enables one further optimization -- after we determine the
   costs for uses, we may remove the candidates that cannot be used
   to compute the use from the bitmaps, so that we do not need to scan
   over them any more.

4) In case there are too many candidates for new induction variables we
   take only the importand + related ones into account.  Costs of
   expressing each use by a given candidate are then stored at uses in
   unordered lists, and the costs are looked up by a sequential scan.
   The lists are usually quite short (about 10 elements), so this works
   well.  Still this is easy to improve -- by employing trivial hashing,
   we are able to find the cost directly without need to scan over
   several elements in about 90% of cases.

5) Some parts of the functions to optimize the set of induction variables are
   quite inefficient:
   
   When the set is changed by adding a new induction
   variable, we find the best way for expressing each use by scanning
   over all ivs in the set; this is waste of time, since the only
   possibilities are either using the new iv or using the one that was
   used before.

   Also each time we need to know the cost of using some set of ivs, we just
   stupidly pass over all uses, find best candidates for them, sum all
   the numbers together, etc.  It is of course much easier and faster to
   just keep the cost of the set and update it incrementally as the set
   is changed.

Bootstrapped & regtested on i686, x86_64, ppc and ia64.

Zdenek

	* Makefile.in (tree-ssa-loop-ivopts.o): Add sbitmap.h dependency.
	* bitmap.h (bmp_iter_advance): New.
	(bmp_iter_set_1, bmp_iter_and_1, bmp_iter_and_compl_1): Split
	from bmp_iter_set, bmp_iter_and, bmp_iter_and_compl, respectively.
	(bmp_iter_set, bmp_iter_and, bmp_iter_and_compl): Call _1 versions.
	Use bmp_iter_advance.
	* tree-ssa-loop-ivopts.c (struct iv_use): Change semantics of
	related_cands.
	(struct iv_ca, struct iv_ca_delta): New types.
	(tree_ssa_iv_optimize_init): Allocate important_candidates bitmap.
	(record_important_candidates): New.
	(find_iv_candidates): Call record_important_candidates.
	(alloc_use_cost_map): Derive size only from important candidates.
	(set_use_iv_cost, get_use_iv_cost): Use hash-like mechanism to speed
	up searches.
	(determine_use_iv_cost_generic, determine_use_iv_cost_address,
	determine_use_iv_cost_condition, determine_use_iv_cost_outer,
	determine_use_iv_cost): Return whether the use can be expressed by
	the candidate.
	(determine_use_iv_costs): Prune useless candidates from relate_cands
	bitmaps.
	(find_best_candidate, set_cost_up_to, set_cost): Removed.
	(cheaper_cost_pair, iv_ca_recount_cost, iv_ca_set_no_cp,
	iv_ca_set_cp, iv_ca_add_use, iv_ca_cost, iv_ca_has_deps,
	iv_ca_delta_add, iv_ca_cand_for_use, iv_ca_delta_commit,
	iv_ca_cand_used_p, iv_ca_delta_free, iv_ca_new, iv_ca_free,
	iv_ca_dump, iv_ca_extend, iv_ca_narrow): New functions.
	(try_add_cand_for, get_initial_solution, try_improve_iv_set,
	find_optimal_iv_set, create_new_ivs, tree_ssa_iv_optimize_loop):
	Use new iv set representation.
	(free_loop_data): clear important_candidates bitmap.
	(tree_ssa_iv_optimize_finalize): Free important_candidates bitmap.

Index: Makefile.in
===================================================================
RCS file: /cvs/gcc/gcc/gcc/Makefile.in,v
retrieving revision 1.1418
diff -c -3 -p -r1.1418 Makefile.in
*** Makefile.in	5 Nov 2004 04:49:05 -0000	1.1418
--- Makefile.in	6 Nov 2004 15:06:26 -0000
*************** tree-ssa-loop-ivopts.o : tree-ssa-loop-i
*** 1714,1720 ****
     $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) varray.h $(EXPR_H) \
     output.h diagnostic.h $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
     tree-pass.h $(GGC_H) $(RECOG_H) insn-config.h $(HASHTAB_H) $(SCEV_H) \
!    cfgloop.h $(PARAMS_H)
  tree-ssa-loop-manip.o : tree-ssa-loop-manip.c $(TREE_FLOW_H) $(CONFIG_H) \
     $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) \
     output.h diagnostic.h $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
--- 1714,1720 ----
     $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) varray.h $(EXPR_H) \
     output.h diagnostic.h $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
     tree-pass.h $(GGC_H) $(RECOG_H) insn-config.h $(HASHTAB_H) $(SCEV_H) \
!    cfgloop.h $(PARAMS_H) sbitmap.h
  tree-ssa-loop-manip.o : tree-ssa-loop-manip.c $(TREE_FLOW_H) $(CONFIG_H) \
     $(SYSTEM_H) $(RTL_H) $(TREE_H) $(TM_P_H) $(CFGLOOP_H) \
     output.h diagnostic.h $(TIMEVAR_H) $(TM_H) coretypes.h $(TREE_DUMP_H) \
Index: bitmap.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/bitmap.h,v
retrieving revision 1.47
diff -c -3 -p -r1.47 bitmap.h
*** bitmap.h	5 Nov 2004 11:12:55 -0000	1.47
--- bitmap.h	6 Nov 2004 15:06:26 -0000
*************** bmp_iter_next (bitmap_iterator *bi, unsi
*** 358,382 ****
    *bit_no += 1;
  }
  
! /* Advance to the next nonzero bit of a single bitmap, we will have
!    already advanced past the just iterated bit.  Return true if there
!    is a bit to iterate.  */
  
! static inline bool
! bmp_iter_set (bitmap_iterator *bi, unsigned *bit_no)
  {
!   /* If our current word is nonzero, it contains the bit we want.  */
!   if (bi->bits)
      {
!     next_bit:
!       while (!(bi->bits & 1))
  	{
! 	  bi->bits >>= 1;
! 	  *bit_no += 1;
  	}
-       return true;
      }
  
    /* Round up to the word boundary.  We might have just iterated past
       the end of the last word, hence the -1.  It is not possible for
       bit_no to point at the beginning of the now last word.  */
--- 358,428 ----
    *bit_no += 1;
  }
  
! /* Table of lowest nonzero bit for small values.  */
  
! static const unsigned nzb_tab[] =
  {
!   8, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   6, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   7, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   6, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   5, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0,
!   4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0};
! 
! /* Advance BIT_NO to the next nit in BI.  */
! 
! static inline void
! bmp_iter_advance (bitmap_iterator *bi, unsigned *bit_no)
! {
!   unsigned nzb;
! 
! #if nBITMAP_WORD_BITS != 32 && nBITMAP_WORD_BITS != 64
! #error "Weird size of unsigned long."
! #endif
! 
!   if (!(bi->bits & 0xff))
      {
! #if nBITMAP_WORD_BITS == 64
!       if (!(bi->bits & 0xffffffff))
  	{
! 	  bi->bits >>= 32;
! 	  *bit_no += 32;
! 	}
! #endif
!       if (!(bi->bits & 0xffff))
! 	{
! 	  bi->bits >>= 16;
! 	  *bit_no += 16;
! 	}
!       if (!(bi->bits & 0xff))
! 	{
! 	  bi->bits >>= 8;
! 	  *bit_no += 8;
  	}
      }
+   nzb = nzb_tab[bi->bits & 0xff];
+ 
+   *bit_no += nzb;
+   bi->bits >>= nzb;
+ }
  
+ /* Advance to the next nonzero bit of a single bitmap, we will have
+    already advanced past the just iterated bit.  Return true if there
+    is a bit to iterate.  */
+ 
+ static inline bool
+ bmp_iter_set_1 (bitmap_iterator *bi, unsigned *bit_no)
+ {
    /* Round up to the word boundary.  We might have just iterated past
       the end of the last word, hence the -1.  It is not possible for
       bit_no to point at the beginning of the now last word.  */
*************** bmp_iter_set (bitmap_iterator *bi, unsig
*** 391,397 ****
  	{
  	  bi->bits = bi->elt1->bits[bi->word_no];
  	  if (bi->bits)
! 	    goto next_bit;
  	  *bit_no += BITMAP_WORD_BITS;
  	  bi->word_no++;
  	}
--- 437,446 ----
  	{
  	  bi->bits = bi->elt1->bits[bi->word_no];
  	  if (bi->bits)
! 	    {
! 	      bmp_iter_advance (bi, bit_no);
! 	      return true;
! 	    }
  	  *bit_no += BITMAP_WORD_BITS;
  	  bi->word_no++;
  	}
*************** bmp_iter_set (bitmap_iterator *bi, unsig
*** 405,429 ****
      }
  }
  
! /* Advance to the next nonzero bit of an intersecting pair of
!    bitmaps.  We will have already advanced past the just iterated bit.
!    Return true if there is a bit to iterate.  */
  
  static inline bool
! bmp_iter_and (bitmap_iterator *bi, unsigned *bit_no)
  {
-   /* If our current word is nonzero, it contains the bit we want.  */
    if (bi->bits)
      {
!     next_bit:
!       while (!(bi->bits & 1))
! 	{
! 	  bi->bits >>= 1;
! 	  *bit_no += 1;
! 	}
        return true;
      }
  
    /* Round up to the word boundary.  We might have just iterated past
       the end of the last word, hence the -1.  It is not possible for
       bit_no to point at the beginning of the now last word.  */
--- 454,481 ----
      }
  }
  
! /* Advance to the next nonzero bit of a single bitmap, we will have
!    already advanced past the just iterated bit.  Return true if there
!    is a bit to iterate.  */
  
  static inline bool
! bmp_iter_set (bitmap_iterator *bi, unsigned *bit_no)
  {
    if (bi->bits)
      {
!       bmp_iter_advance (bi, bit_no);
        return true;
      }
+   return bmp_iter_set_1 (bi, bit_no);
+ }
  
+ /* Advance to the next nonzero bit of an intersecting pair of
+    bitmaps.  We will have already advanced past the just iterated bit.
+    Return true if there is a bit to iterate.  */
+ 
+ static inline bool
+ bmp_iter_and_1 (bitmap_iterator *bi, unsigned *bit_no)
+ {
    /* Round up to the word boundary.  We might have just iterated past
       the end of the last word, hence the -1.  It is not possible for
       bit_no to point at the beginning of the now last word.  */
*************** bmp_iter_and (bitmap_iterator *bi, unsig
*** 438,444 ****
  	{
  	  bi->bits = bi->elt1->bits[bi->word_no] & bi->elt2->bits[bi->word_no];
  	  if (bi->bits)
! 	    goto next_bit;
  	  *bit_no += BITMAP_WORD_BITS;
  	  bi->word_no++;
  	}
--- 490,499 ----
  	{
  	  bi->bits = bi->elt1->bits[bi->word_no] & bi->elt2->bits[bi->word_no];
  	  if (bi->bits)
! 	    {
! 	      bmp_iter_advance (bi, bit_no);
! 	      return true;
! 	    }
  	  *bit_no += BITMAP_WORD_BITS;
  	  bi->word_no++;
  	}
*************** bmp_iter_and (bitmap_iterator *bi, unsig
*** 472,496 ****
      }
  }
  
! /* Advance to the next nonzero bit in the intersection of
!    complemented bitmaps.  We will have already advanced past the just
!    iterated bit.  */
  
  static inline bool
! bmp_iter_and_compl (bitmap_iterator *bi, unsigned *bit_no)
  {
-   /* If our current word is nonzero, it contains the bit we want.  */
    if (bi->bits)
      {
!     next_bit:
!       while (!(bi->bits & 1))
! 	{
! 	  bi->bits >>= 1;
! 	  *bit_no += 1;
! 	}
        return true;
      }
  
    /* Round up to the word boundary.  We might have just iterated past
       the end of the last word, hence the -1.  It is not possible for
       bit_no to point at the beginning of the now last word.  */
--- 527,554 ----
      }
  }
  
! /* Advance to the next nonzero bit of an intersecting pair of
!    bitmaps.  We will have already advanced past the just iterated bit.
!    Return true if there is a bit to iterate.  */
  
  static inline bool
! bmp_iter_and (bitmap_iterator *bi, unsigned *bit_no)
  {
    if (bi->bits)
      {
!       bmp_iter_advance (bi, bit_no);
        return true;
      }
+   return bmp_iter_and_1 (bi, bit_no);
+ }
+ 
+ /* Advance to the next nonzero bit in the intersection of
+    complemented bitmaps.  We will have already advanced past the just
+    iterated bit.  */
  
+ static inline bool
+ bmp_iter_and_compl_1 (bitmap_iterator *bi, unsigned *bit_no)
+ {
    /* Round up to the word boundary.  We might have just iterated past
       the end of the last word, hence the -1.  It is not possible for
       bit_no to point at the beginning of the now last word.  */
*************** bmp_iter_and_compl (bitmap_iterator *bi,
*** 507,513 ****
  	  if (bi->elt2 && bi->elt2->indx == bi->elt1->indx)
  	    bi->bits &= ~bi->elt2->bits[bi->word_no];
  	  if (bi->bits)
! 	    goto next_bit;
  	  *bit_no += BITMAP_WORD_BITS;
  	  bi->word_no++;
  	}
--- 565,574 ----
  	  if (bi->elt2 && bi->elt2->indx == bi->elt1->indx)
  	    bi->bits &= ~bi->elt2->bits[bi->word_no];
  	  if (bi->bits)
! 	    {
! 	      bmp_iter_advance (bi, bit_no);
! 	      return true;
! 	    }
  	  *bit_no += BITMAP_WORD_BITS;
  	  bi->word_no++;
  	}
*************** bmp_iter_and_compl (bitmap_iterator *bi,
*** 526,531 ****
--- 587,607 ----
      }
  }
  
+ /* Advance to the next nonzero bit in the intersection of
+    complemented bitmaps.  We will have already advanced past the just
+    iterated bit.  */
+ 
+ static inline bool
+ bmp_iter_and_compl (bitmap_iterator *bi, unsigned *bit_no)
+ {
+   if (bi->bits)
+     {
+       bmp_iter_advance (bi, bit_no);
+       return true;
+     }
+   return bmp_iter_and_compl_1 (bi, bit_no);
+ }
+ 
  /* Loop over all bits set in BITMAP, starting with MIN and setting
     BITNUM to the bit number.  ITER is a bitmap iterator.  BITNUM
     should be treated as a read-only variable as it contains loop
Index: tree-ssa-loop-ivopts.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivopts.c,v
retrieving revision 2.24
diff -c -3 -p -r2.24 tree-ssa-loop-ivopts.c
*** tree-ssa-loop-ivopts.c	4 Nov 2004 08:57:55 -0000	2.24
--- tree-ssa-loop-ivopts.c	6 Nov 2004 15:06:26 -0000
*************** struct iv_use
*** 158,164 ****
    struct iv *iv;	/* The induction variable it is based on.  */
    tree stmt;		/* Statement in that it occurs.  */
    tree *op_p;		/* The place where it occurs.  */
!   bitmap related_cands;	/* The set of "related" iv candidates.  */
  
    unsigned n_map_members; /* Number of candidates in the cost_map list.  */
    struct cost_pair *cost_map;
--- 158,165 ----
    struct iv *iv;	/* The induction variable it is based on.  */
    tree stmt;		/* Statement in that it occurs.  */
    tree *op_p;		/* The place where it occurs.  */
!   bitmap related_cands;	/* The set of "related" iv candidates, plus the common
! 			   important ones.  */
  
    unsigned n_map_members; /* Number of candidates in the cost_map list.  */
    struct cost_pair *cost_map;
*************** struct ivopts_data
*** 227,232 ****
--- 228,285 ----
    bool consider_all_candidates;
  };
  
+ /* An assignment of iv candidates to uses.  */
+ 
+ struct iv_ca
+ {
+   /* The number of uses covered by the assignment.  */
+   unsigned upto;
+ 
+   /* Number of uses that cannot be expressed by the candidates in the set.  */
+   unsigned bad_uses;
+ 
+   /* Candidate assigned to a use, together with the related costs.  */
+   struct cost_pair **cand_for_use;
+ 
+   /* Number of times each candidate is used.  */
+   unsigned *n_cand_uses;
+ 
+   /* The candidates used.  */
+   bitmap cands;
+ 
+   /* Total number of registers needed.  */
+   unsigned n_regs;
+ 
+   /* Total cost of expressing uses.  */
+   unsigned cand_use_cost;
+ 
+   /* Total cost of candidates.  */
+   unsigned cand_cost;
+ 
+   /* Number of times each invariant is used.  */
+   unsigned *n_invariant_uses;
+ 
+   /* Total cost of the assignment.  */
+   unsigned cost;
+ };
+ 
+ /* Difference of two iv candidate assignments.  */
+ 
+ struct iv_ca_delta
+ {
+   /* Changed use.  */
+   struct iv_use *use;
+ 
+   /* An old assignment (for rollback purposes).  */
+   struct cost_pair *old_cp;
+ 
+   /* A new assignment.  */
+   struct cost_pair *new_cp;
+ 
+   /* Next change in the list.  */
+   struct iv_ca_delta *next_change;
+ };
+ 
  /* Bound on number of candidates below that all candidates are considered.  */
  
  #define CONSIDER_ALL_CANDIDATES_BOUND \
*************** tree_ssa_iv_optimize_init (struct loops 
*** 589,594 ****
--- 642,648 ----
    data->version_info = xcalloc (data->version_info_size,
  				sizeof (struct version_info));
    data->relevant = BITMAP_XMALLOC ();
+   data->important_candidates = BITMAP_XMALLOC ();
    data->max_inv_id = 0;
  
    for (i = 1; i < loops->num; i++)
*************** add_derived_ivs_candidates (struct ivopt
*** 1889,1894 ****
--- 1943,1987 ----
      }
  }
  
+ /* Record important candidates and add them to related_cands bitmaps
+    if needed.  */
+ 
+ static void
+ record_important_candidates (struct ivopts_data *data)
+ {
+   unsigned i;
+   struct iv_use *use;
+ 
+   for (i = 0; i < n_iv_cands (data); i++)
+     {
+       struct iv_cand *cand = iv_cand (data, i);
+ 
+       if (cand->important)
+ 	bitmap_set_bit (data->important_candidates, i);
+     }
+ 
+   data->consider_all_candidates = (n_iv_cands (data)
+ 				   <= CONSIDER_ALL_CANDIDATES_BOUND);
+ 
+   if (data->consider_all_candidates)
+     {
+       /* We will not need "related_cands" bitmaps in this case,
+ 	 so release them to decrease peak memory consumption.  */
+       for (i = 0; i < n_iv_uses (data); i++)
+ 	{
+ 	  use = iv_use (data, i);
+ 	  BITMAP_XFREE (use->related_cands);
+ 	}
+     }
+   else
+     {
+       /* Add important candidates to the related_cands bitmaps.  */
+       for (i = 0; i < n_iv_uses (data); i++)
+ 	bitmap_ior_into (iv_use (data, i)->related_cands,
+ 			 data->important_candidates);
+     }
+ }
+ 
  /* Finds the candidates for the induction variables.  */
  
  static void
*************** find_iv_candidates (struct ivopts_data *
*** 1902,1907 ****
--- 1995,2003 ----
  
    /* Add induction variables derived from uses.  */
    add_derived_ivs_candidates (data);
+ 
+   /* Record the important candidates.  */
+   record_important_candidates (data);
  }
  
  /* Allocates the data structure mapping the (use, candidate) pairs to costs.
*************** find_iv_candidates (struct ivopts_data *
*** 1911,1927 ****
  static void
  alloc_use_cost_map (struct ivopts_data *data)
  {
!   unsigned i, n_imp = 0, size, j;
! 
!   if (!data->consider_all_candidates)
!     {
!       for (i = 0; i < n_iv_cands (data); i++)
! 	{
! 	  struct iv_cand *cand = iv_cand (data, i);
! 	  if (cand->important)
! 	    n_imp++;
! 	}
!     }
  
    for (i = 0; i < n_iv_uses (data); i++)
      {
--- 2007,2013 ----
  static void
  alloc_use_cost_map (struct ivopts_data *data)
  {
!   unsigned i, size, s, j;
  
    for (i = 0; i < n_iv_uses (data); i++)
      {
*************** alloc_use_cost_map (struct ivopts_data *
*** 1929,1948 ****
        bitmap_iterator bi;
  
        if (data->consider_all_candidates)
! 	{
! 	  size = n_iv_cands (data);
! 	  use->n_map_members = size;
! 	}
        else
  	{
! 	  size = n_imp;
  	  EXECUTE_IF_SET_IN_BITMAP (use->related_cands, 0, j, bi)
  	    {
! 	      size++;
  	    }
! 	  use->n_map_members = 0;
  	}
  
        use->cost_map = xcalloc (size, sizeof (struct cost_pair));
      }
  }
--- 2015,2035 ----
        bitmap_iterator bi;
  
        if (data->consider_all_candidates)
! 	size = n_iv_cands (data);
        else
  	{
! 	  s = 0;
  	  EXECUTE_IF_SET_IN_BITMAP (use->related_cands, 0, j, bi)
  	    {
! 	      s++;
  	    }
! 
! 	  /* Round up to the power of two, so that moduling by it is fast.  */
! 	  for (size = 1; size < s; size <<= 1)
! 	    continue;
  	}
  
+       use->n_map_members = size;
        use->cost_map = xcalloc (size, sizeof (struct cost_pair));
      }
  }
*************** set_use_iv_cost (struct ivopts_data *dat
*** 1955,1965 ****
  		 struct iv_use *use, struct iv_cand *cand, unsigned cost,
  		 bitmap depends_on)
  {
!   if (cost == INFTY
!       && depends_on)
      {
!       BITMAP_XFREE (depends_on);
!       depends_on = NULL;
      }
  
    if (data->consider_all_candidates)
--- 2042,2054 ----
  		 struct iv_use *use, struct iv_cand *cand, unsigned cost,
  		 bitmap depends_on)
  {
!   unsigned i, s;
! 
!   if (cost == INFTY)
      {
!       if (depends_on)
! 	BITMAP_XFREE (depends_on);
!       return;
      }
  
    if (data->consider_all_candidates)
*************** set_use_iv_cost (struct ivopts_data *dat
*** 1970,2011 ****
        return;
      }
  
!   if (cost == INFTY)
!     return;
! 
!   use->cost_map[use->n_map_members].cand = cand;
!   use->cost_map[use->n_map_members].cost = cost;
!   use->cost_map[use->n_map_members].depends_on = depends_on;
!   use->n_map_members++;
! }
! 
! /* Gets cost of (USE, CANDIDATE) pair.  Stores the bitmap of dependencies to
!    DEPENDS_ON.  */
! 
! static unsigned
! get_use_iv_cost (struct ivopts_data *data,
! 		 struct iv_use *use, struct iv_cand *cand, bitmap *depends_on)
  {
!   unsigned i;
  
    if (!cand)
!     return INFTY;
  
    if (data->consider_all_candidates)
-     i = cand->id;
-   else
      {
!       for (i = 0; i < use->n_map_members; i++)
! 	if (use->cost_map[i].cand == cand)
! 	  break;
  
!       if (i == use->n_map_members)
! 	return INFTY;
      }
  
!   if (depends_on)
!     *depends_on = use->cost_map[i].depends_on;
!   return use->cost_map[i].cost;
  }
  
  /* Returns estimate on cost of computing SEQ.  */
--- 2059,2113 ----
        return;
      }
  
!   /* n_map_members is a power of two, so this computes modulo.  */
!   s = cand->id & (use->n_map_members - 1);
!   for (i = s; i < use->n_map_members; i++)
!     if (!use->cost_map[i].cand)
!       goto found;
!   for (i = 0; i < s; i++)
!     if (!use->cost_map[i].cand)
!       goto found;
! 
!   gcc_unreachable ();
! 
! found:
!   use->cost_map[i].cand = cand;
!   use->cost_map[i].cost = cost;
!   use->cost_map[i].depends_on = depends_on;
! }
! 
! /* Gets cost of (USE, CANDIDATE) pair.  */
! 
! static struct cost_pair *
! get_use_iv_cost (struct ivopts_data *data, struct iv_use *use,
! 		 struct iv_cand *cand)
  {
!   unsigned i, s;
!   struct cost_pair *ret;
  
    if (!cand)
!     return NULL;
  
    if (data->consider_all_candidates)
      {
!       ret = use->cost_map + cand->id;
!       if (!ret->cand)
! 	return NULL;
  
!       return ret;
      }
+       
+   /* n_map_members is a power of two, so this computes modulo.  */
+   s = cand->id & (use->n_map_members - 1);
+   for (i = s; i < use->n_map_members; i++)
+     if (use->cost_map[i].cand == cand)
+       return use->cost_map + i;
+ 
+   for (i = 0; i < s; i++)
+     if (use->cost_map[i].cand == cand)
+       return use->cost_map + i;
  
!   return NULL;
  }
  
  /* Returns estimate on cost of computing SEQ.  */
*************** get_computation_cost (struct ivopts_data
*** 2948,2954 ****
  /* Determines cost of basing replacement of USE on CAND in a generic
     expression.  */
  
! static void
  determine_use_iv_cost_generic (struct ivopts_data *data,
  			       struct iv_use *use, struct iv_cand *cand)
  {
--- 3050,3056 ----
  /* Determines cost of basing replacement of USE on CAND in a generic
     expression.  */
  
! static bool
  determine_use_iv_cost_generic (struct ivopts_data *data,
  			       struct iv_use *use, struct iv_cand *cand)
  {
*************** determine_use_iv_cost_generic (struct iv
*** 2956,2966 ****
    unsigned cost = get_computation_cost (data, use, cand, false, &depends_on);
  
    set_use_iv_cost (data, use, cand, cost, depends_on);
  }
  
  /* Determines cost of basing replacement of USE on CAND in an address.  */
  
! static void
  determine_use_iv_cost_address (struct ivopts_data *data,
  			       struct iv_use *use, struct iv_cand *cand)
  {
--- 3058,3070 ----
    unsigned cost = get_computation_cost (data, use, cand, false, &depends_on);
  
    set_use_iv_cost (data, use, cand, cost, depends_on);
+ 
+   return cost != INFTY;
  }
  
  /* Determines cost of basing replacement of USE on CAND in an address.  */
  
! static bool
  determine_use_iv_cost_address (struct ivopts_data *data,
  			       struct iv_use *use, struct iv_cand *cand)
  {
*************** determine_use_iv_cost_address (struct iv
*** 2968,2973 ****
--- 3072,3079 ----
    unsigned cost = get_computation_cost (data, use, cand, true, &depends_on);
  
    set_use_iv_cost (data, use, cand, cost, depends_on);
+ 
+   return cost != INFTY;
  }
  
  /* Computes value of induction variable IV in iteration NITER.  */
*************** may_eliminate_iv (struct loop *loop,
*** 3069,3075 ****
  
  /* Determines cost of basing replacement of USE on CAND in a condition.  */
  
! static void
  determine_use_iv_cost_condition (struct ivopts_data *data,
  				 struct iv_use *use, struct iv_cand *cand)
  {
--- 3175,3181 ----
  
  /* Determines cost of basing replacement of USE on CAND in a condition.  */
  
! static bool
  determine_use_iv_cost_condition (struct ivopts_data *data,
  				 struct iv_use *use, struct iv_cand *cand)
  {
*************** determine_use_iv_cost_condition (struct 
*** 3080,3086 ****
    if (!cand->iv)
      {
        set_use_iv_cost (data, use, cand, INFTY, NULL);
!       return;
      }
  
    if (may_eliminate_iv (data->current_loop, use, cand, &compare, &bound))
--- 3186,3192 ----
    if (!cand->iv)
      {
        set_use_iv_cost (data, use, cand, INFTY, NULL);
!       return false;
      }
  
    if (may_eliminate_iv (data->current_loop, use, cand, &compare, &bound))
*************** determine_use_iv_cost_condition (struct 
*** 3089,3095 ****
        unsigned cost = force_var_cost (data, bound, &depends_on);
  
        set_use_iv_cost (data, use, cand, cost, depends_on);
!       return;
      }
  
    /* The induction variable elimination failed; just express the original
--- 3195,3201 ----
        unsigned cost = force_var_cost (data, bound, &depends_on);
  
        set_use_iv_cost (data, use, cand, cost, depends_on);
!       return cost != INFTY;
      }
  
    /* The induction variable elimination failed; just express the original
*************** determine_use_iv_cost_condition (struct 
*** 3103,3109 ****
        record_invariant (data, TREE_OPERAND (*use->op_p, 1), true);
      }
  
!   determine_use_iv_cost_generic (data, use, cand);
  }
  
  /* Checks whether it is possible to replace the final value of USE by
--- 3209,3215 ----
        record_invariant (data, TREE_OPERAND (*use->op_p, 1), true);
      }
  
!   return determine_use_iv_cost_generic (data, use, cand);
  }
  
  /* Checks whether it is possible to replace the final value of USE by
*************** may_replace_final_value (struct loop *lo
*** 3135,3141 ****
  
  /* Determines cost of replacing final value of USE using CAND.  */
  
! static void
  determine_use_iv_cost_outer (struct ivopts_data *data,
  			     struct iv_use *use, struct iv_cand *cand)
  {
--- 3241,3247 ----
  
  /* Determines cost of replacing final value of USE using CAND.  */
  
! static bool
  determine_use_iv_cost_outer (struct ivopts_data *data,
  			     struct iv_use *use, struct iv_cand *cand)
  {
*************** determine_use_iv_cost_outer (struct ivop
*** 3150,3156 ****
        if (!may_replace_final_value (loop, use, &value))
  	{
  	  set_use_iv_cost (data, use, cand, INFTY, NULL);
! 	  return;
  	}
  
        depends_on = NULL;
--- 3256,3262 ----
        if (!may_replace_final_value (loop, use, &value))
  	{
  	  set_use_iv_cost (data, use, cand, INFTY, NULL);
! 	  return false;
  	}
  
        depends_on = NULL;
*************** determine_use_iv_cost_outer (struct ivop
*** 3159,3165 ****
        cost /= AVG_LOOP_NITER (loop);
  
        set_use_iv_cost (data, use, cand, cost, depends_on);
!       return;
      }
  
    exit = single_dom_exit (loop);
--- 3265,3271 ----
        cost /= AVG_LOOP_NITER (loop);
  
        set_use_iv_cost (data, use, cand, cost, depends_on);
!       return cost != INFTY;
      }
  
    exit = single_dom_exit (loop);
*************** determine_use_iv_cost_outer (struct ivop
*** 3179,3209 ****
      }
  				   
    set_use_iv_cost (data, use, cand, cost, depends_on);
  }
  
! /* Determines cost of basing replacement of USE on CAND.  */
  
! static void
  determine_use_iv_cost (struct ivopts_data *data,
  		       struct iv_use *use, struct iv_cand *cand)
  {
    switch (use->type)
      {
      case USE_NONLINEAR_EXPR:
!       determine_use_iv_cost_generic (data, use, cand);
!       break;
  
      case USE_OUTER:
!       determine_use_iv_cost_outer (data, use, cand);
!       break;
  
      case USE_ADDRESS:
!       determine_use_iv_cost_address (data, use, cand);
!       break;
  
      case USE_COMPARE:
!       determine_use_iv_cost_condition (data, use, cand);
!       break;
  
      default:
        gcc_unreachable ();
--- 3285,3314 ----
      }
  				   
    set_use_iv_cost (data, use, cand, cost, depends_on);
+ 
+   return cost != INFTY;
  }
  
! /* Determines cost of basing replacement of USE on CAND.  Returns false
!    if USE cannot be based on CAND.  */
  
! static bool
  determine_use_iv_cost (struct ivopts_data *data,
  		       struct iv_use *use, struct iv_cand *cand)
  {
    switch (use->type)
      {
      case USE_NONLINEAR_EXPR:
!       return determine_use_iv_cost_generic (data, use, cand);
  
      case USE_OUTER:
!       return determine_use_iv_cost_outer (data, use, cand);
  
      case USE_ADDRESS:
!       return determine_use_iv_cost_address (data, use, cand);
  
      case USE_COMPARE:
!       return determine_use_iv_cost_condition (data, use, cand);
  
      default:
        gcc_unreachable ();
*************** determine_use_iv_costs (struct ivopts_da
*** 3218,3245 ****
    unsigned i, j;
    struct iv_use *use;
    struct iv_cand *cand;
! 
!   data->consider_all_candidates = (n_iv_cands (data)
! 				   <= CONSIDER_ALL_CANDIDATES_BOUND);
  
    alloc_use_cost_map (data);
  
-   if (!data->consider_all_candidates)
-     {
-       /* Add the important candidate entries.  */
-       for (j = 0; j < n_iv_cands (data); j++)
- 	{
- 	  cand = iv_cand (data, j);
- 	  if (!cand->important)
- 	    continue;
- 	  for (i = 0; i < n_iv_uses (data); i++)
- 	    {
- 	      use = iv_use (data, i);
- 	      determine_use_iv_cost (data, use, cand);
- 	    }
- 	}
-     }
- 
    for (i = 0; i < n_iv_uses (data); i++)
      {
        use = iv_use (data, i);
--- 3323,3332 ----
    unsigned i, j;
    struct iv_use *use;
    struct iv_cand *cand;
!   bitmap to_clear = BITMAP_XMALLOC ();
  
    alloc_use_cost_map (data);
  
    for (i = 0; i < n_iv_uses (data); i++)
      {
        use = iv_use (data, i);
*************** determine_use_iv_costs (struct ivopts_da
*** 3259,3270 ****
  	  EXECUTE_IF_SET_IN_BITMAP (use->related_cands, 0, j, bi)
  	    {
  	      cand = iv_cand (data, j);
! 	      if (!cand->important)
! 	        determine_use_iv_cost (data, use, cand);
  	    }
  	}
      }
  
    if (dump_file && (dump_flags & TDF_DETAILS))
      {
        fprintf (dump_file, "Use-candidate costs:\n");
--- 3346,3364 ----
  	  EXECUTE_IF_SET_IN_BITMAP (use->related_cands, 0, j, bi)
  	    {
  	      cand = iv_cand (data, j);
! 	      if (!determine_use_iv_cost (data, use, cand))
! 		bitmap_set_bit (to_clear, j);
  	    }
+ 
+ 	  /* Remove the candidates for that the cost is infinite from
+ 	     the list of related candidates.  */
+ 	  bitmap_and_compl_into (use->related_cands, to_clear);
+ 	  bitmap_clear (to_clear);
  	}
      }
  
+   BITMAP_XFREE (to_clear);
+ 
    if (dump_file && (dump_flags & TDF_DETAILS))
      {
        fprintf (dump_file, "Use-candidate costs:\n");
*************** determine_set_costs (struct ivopts_data 
*** 3451,3609 ****
      }
  }
  
! /* Finds a best candidate for USE and stores it to CAND.  The candidates are
!    taken from the set SOL and they may depend on invariants in the set INV.
!    The really used candidate and invariants are noted to USED_IVS and
!    USED_INV.  */
  
! static unsigned
! find_best_candidate (struct ivopts_data *data,
! 		     struct iv_use *use, bitmap sol, bitmap inv,
! 		     bitmap used_ivs, bitmap used_inv, struct iv_cand **cand)
! {
!   unsigned c, d;
!   unsigned best_cost = INFTY, cost;
!   struct iv_cand *cnd = NULL, *acnd;
!   bitmap depends_on = NULL, asol;
!   bitmap_iterator bi, bi1;
  
!   if (data->consider_all_candidates)
!     asol = sol;
!   else
      {
!       asol = BITMAP_XMALLOC ();
  
!       bitmap_ior (asol, data->important_candidates, use->related_cands);
!       bitmap_and_into (asol, sol);
      }
  
!   EXECUTE_IF_SET_IN_BITMAP (asol, 0, c, bi)
      {
!       acnd = iv_cand (data, c);
!       cost = get_use_iv_cost (data, use, acnd, &depends_on);
  
!       if (cost == INFTY)
! 	continue;
!       if (cost > best_cost)
! 	continue;
!       if (cost == best_cost)
  	{
! 	  /* Prefer the cheaper iv.  */
! 	  if (acnd->cost >= cnd->cost)
! 	    continue;
  	}
  
!       if (depends_on)
  	{
! 	  EXECUTE_IF_AND_COMPL_IN_BITMAP (depends_on, inv, 0, d, bi1)
  	    {
! 	      goto next_cand;
  	    }
- 	  if (used_inv)
- 	    bitmap_ior_into (used_inv, depends_on);
  	}
  
!       cnd = acnd;
!       best_cost = cost;
! 
! next_cand: ;
      }
  
!   if (cnd && used_ivs)
!     bitmap_set_bit (used_ivs, cnd->id);
  
!   if (cand)
!     *cand = cnd;
  
!   if (!data->consider_all_candidates)
!     BITMAP_XFREE (asol);
  
!   return best_cost;
  }
  
! /* Returns cost of set of ivs SOL + invariants INV.  Removes unnecessary
!    induction variable candidates and invariants from the sets.  Only
!    uses 0 .. MAX_USE - 1 are taken into account.  */
  
  static unsigned
! set_cost_up_to (struct ivopts_data *data, bitmap sol, bitmap inv,
! 		unsigned max_use)
  {
    unsigned i;
-   unsigned cost = 0, size = 0, acost;
-   struct iv_use *use;
-   struct iv_cand *cand;
-   bitmap used_ivs = BITMAP_XMALLOC (), used_inv = BITMAP_XMALLOC ();
    bitmap_iterator bi;
  
!   for (i = 0; i < max_use; i++)
      {
!       use = iv_use (data, i);
!       acost = find_best_candidate (data, use, sol, inv,
! 				   used_ivs, used_inv, NULL);
!       if (acost == INFTY)
  	{
! 	  BITMAP_XFREE (used_ivs);
! 	  BITMAP_XFREE (used_inv);
! 	  return INFTY;
  	}
!       cost += acost;
      }
  
!   EXECUTE_IF_SET_IN_BITMAP (used_ivs, 0, i, bi)
!     {
!       cand = iv_cand (data, i);
  
!       /* Do not count the pseudocandidates.  */
!       if (cand->iv)
! 	size++;
  
!       cost += cand->cost;
!     }
!   EXECUTE_IF_SET_IN_BITMAP (used_inv, 0, i, bi)
      {
!       size++;
      }
-   cost += ivopts_global_cost_for_size (data, size);
  
!   bitmap_copy (sol, used_ivs);
!   bitmap_copy (inv, used_inv);
  
!   BITMAP_XFREE (used_ivs);
!   BITMAP_XFREE (used_inv);
  
    return cost;
  }
  
! /* Computes cost of set of ivs SOL + invariants INV.  Removes unnecessary
!    induction variable candidates and invariants from the sets.  */
  
  static unsigned
! set_cost (struct ivopts_data *data, bitmap sol, bitmap inv)
  {
!   return set_cost_up_to (data, sol, inv, n_iv_uses (data));
  }
  
! /* Tries to extend the sets IVS and INV in the best possible way in order
     to express the USE.  */
  
  static bool
! try_add_cand_for (struct ivopts_data *data, bitmap ivs, bitmap inv,
  		  struct iv_use *use)
  {
!   unsigned best_cost = set_cost_up_to (data, ivs, inv, use->id + 1), act_cost;
!   bitmap best_ivs = BITMAP_XMALLOC ();
!   bitmap best_inv = BITMAP_XMALLOC ();
!   bitmap act_ivs = BITMAP_XMALLOC ();
!   bitmap act_inv = BITMAP_XMALLOC ();
    unsigned i;
-   struct cost_pair *cp;
    bitmap_iterator bi;
    struct iv_cand *cand;
!   bitmap depends_on;
  
!   bitmap_copy (best_ivs, ivs);
!   bitmap_copy (best_inv, inv);
  
    /* First try important candidates.  Only if it fails, try the specific ones.
       Rationale -- in loops with many variables the best choice often is to use
--- 3545,4020 ----
      }
  }
  
! /* Returns true if A is a cheaper cost pair than B.  */
  
! static bool
! cheaper_cost_pair (struct cost_pair *a, struct cost_pair *b)
! {
!   if (!a)
!     return false;
  
!   if (!b)
!     return true;
! 
!   if (a->cost < b->cost)
!     return true;
! 
!   if (a->cost > b->cost)
!     return false;
! 
!   /* In case the costs are the same, prefer the cheaper candidate.  */
!   if (a->cand->cost < b->cand->cost)
!     return true;
! 
!   return false;
! }
! 
! /* Computes the cost field of IVS structure.  */
! 
! static void
! iv_ca_recount_cost (struct ivopts_data *data, struct iv_ca *ivs)
! {
!   unsigned cost = 0;
! 
!   cost += ivs->cand_use_cost;
!   cost += ivs->cand_cost;
!   cost += ivopts_global_cost_for_size (data, ivs->n_regs);
! 
!   ivs->cost = cost;
! }
! 
! /* Set USE not to be expressed by any candidate in IVS.  */
! 
! static void
! iv_ca_set_no_cp (struct ivopts_data *data, struct iv_ca *ivs,
! 		 struct iv_use *use)
! {
!   unsigned uid = use->id, cid, iid;
!   bitmap deps;
!   struct cost_pair *cp;
!   bitmap_iterator bi;
! 
!   cp = ivs->cand_for_use[uid];
!   if (!cp)
!     return;
!   cid = cp->cand->id;
! 
!   ivs->bad_uses++;
!   ivs->cand_for_use[uid] = NULL;
!   ivs->n_cand_uses[cid]--;
! 
!   if (ivs->n_cand_uses[cid] == 0)
      {
!       bitmap_clear_bit (ivs->cands, cid);
!       /* Do not count the pseudocandidates.  */
!       if (cp->cand->iv)
! 	ivs->n_regs--;
!       ivs->cand_cost -= cp->cand->cost;
!     }
! 
!   ivs->cand_use_cost -= cp->cost;
  
!   deps = cp->depends_on;
! 
!   if (deps)
!     {
!       EXECUTE_IF_SET_IN_BITMAP (deps, 0, iid, bi)
! 	{
! 	  ivs->n_invariant_uses[iid]--;
! 	  if (ivs->n_invariant_uses[iid] == 0)
! 	    ivs->n_regs--;
! 	}
      }
  
!   iv_ca_recount_cost (data, ivs);
! }
! 
! /* Set cost pair for USE in set IVS to CP.  */
! 
! static void
! iv_ca_set_cp (struct ivopts_data *data, struct iv_ca *ivs,
! 	      struct iv_use *use, struct cost_pair *cp)
! {
!   unsigned uid = use->id, cid, iid;
!   bitmap deps;
!   bitmap_iterator bi;
! 
!   if (ivs->cand_for_use[uid] == cp)
!     return;
! 
!   if (ivs->cand_for_use[uid])
!     iv_ca_set_no_cp (data, ivs, use);
! 
!   if (cp)
      {
!       cid = cp->cand->id;
  
!       ivs->bad_uses--;
!       ivs->cand_for_use[uid] = cp;
!       ivs->n_cand_uses[cid]++;
!       if (ivs->n_cand_uses[cid] == 1)
  	{
! 	  bitmap_set_bit (ivs->cands, cid);
! 	  /* Do not count the pseudocandidates.  */
! 	  if (cp->cand->iv)
! 	    ivs->n_regs++;
! 	  ivs->cand_cost += cp->cand->cost;
  	}
  
!       ivs->cand_use_cost += cp->cost;
! 
!       deps = cp->depends_on;
! 
!       if (deps)
  	{
! 	  EXECUTE_IF_SET_IN_BITMAP (deps, 0, iid, bi)
  	    {
! 	      ivs->n_invariant_uses[iid]++;
! 	      if (ivs->n_invariant_uses[iid] == 1)
! 		ivs->n_regs++;
  	    }
  	}
  
!       iv_ca_recount_cost (data, ivs);
      }
+ }
+ 
+ /* Extend set IVS by expressing USE by some of the candidates in it
+    if possible.  */
+ 
+ static void
+ iv_ca_add_use (struct ivopts_data *data, struct iv_ca *ivs,
+ 	       struct iv_use *use)
+ {
+   struct cost_pair *best_cp = NULL, *cp;
+   bitmap_iterator bi;
+   unsigned i;
+ 
+   gcc_assert (ivs->upto >= use->id);
  
!   if (ivs->upto == use->id)
!     {
!       ivs->upto++;
!       ivs->bad_uses++;
!     }
  
!   EXECUTE_IF_SET_IN_BITMAP (ivs->cands, 0, i, bi)
!     {
!       cp = get_use_iv_cost (data, use, iv_cand (data, i));
  
!       if (cheaper_cost_pair (cp, best_cp))
! 	best_cp = cp;
!     }
  
!   iv_ca_set_cp (data, ivs, use, best_cp);
  }
  
! /* Get cost for assignment IVS.  */
  
  static unsigned
! iv_ca_cost (struct iv_ca *ivs)
! {
!   return (ivs->bad_uses ? INFTY : ivs->cost);
! }
! 
! /* Returns true if all dependences of CP are among invariants in IVS.  */
! 
! static bool
! iv_ca_has_deps (struct iv_ca *ivs, struct cost_pair *cp)
  {
    unsigned i;
    bitmap_iterator bi;
  
!   if (!cp->depends_on)
!     return true;
! 
!   EXECUTE_IF_SET_IN_BITMAP (cp->depends_on, 0, i, bi)
      {
!       if (ivs->n_invariant_uses[i] == 0)
! 	return false;
!     }
! 
!   return true;
! }
! 
! /* Creates change of expressing USE by NEW_CP instead of OLD_CP and chains
!    it before NEXT_CHANGE.  */
! 
! static struct iv_ca_delta *
! iv_ca_delta_add (struct iv_use *use, struct cost_pair *old_cp,
! 		 struct cost_pair *new_cp, struct iv_ca_delta *next_change)
! {
!   struct iv_ca_delta *change = xmalloc (sizeof (struct iv_ca_delta));
! 
!   change->use = use;
!   change->old_cp = old_cp;
!   change->new_cp = new_cp;
!   change->next_change = next_change;
! 
!   return change;
! }
! 
! /* Returns candidate by that USE is expressed in IVS.  */
! 
! static struct cost_pair *
! iv_ca_cand_for_use (struct iv_ca *ivs, struct iv_use *use)
! {
!   return ivs->cand_for_use[use->id];
! }
! 
! /* Commit changes in DELTA to IVS.  If FORWARD is false, the changes are
!    reverted instead.  */
! 
! static void
! iv_ca_delta_commit (struct ivopts_data *data, struct iv_ca *ivs,
! 		    struct iv_ca_delta *delta, bool forward)
! {
!   struct cost_pair *from, *to;
! 
!   for (; delta; delta = delta->next_change)
!     {
!       if (forward)
  	{
! 	  from = delta->old_cp;
! 	  to = delta->new_cp;
! 	}
!       else
! 	{
! 	  from = delta->new_cp;
! 	  to = delta->old_cp;
  	}
! 
!       gcc_assert (iv_ca_cand_for_use (ivs, delta->use) == from);
!       iv_ca_set_cp (data, ivs, delta->use, to);
      }
+ }
  
! /* Returns true if CAND is used in IVS.  */
  
! static bool
! iv_ca_cand_used_p (struct iv_ca *ivs, struct iv_cand *cand)
! {
!   return ivs->n_cand_uses[cand->id] > 0;
! }
  
! /* Free the list of changes DELTA.  */
! 
! static void
! iv_ca_delta_free (struct iv_ca_delta **delta)
! {
!   struct iv_ca_delta *act, *next;
! 
!   for (act = *delta; act; act = next)
      {
!       next = act->next_change;
!       free (act);
      }
  
!   *delta = NULL;
! }
! 
! /* Allocates new iv candidates assignment.  */
! 
! static struct iv_ca *
! iv_ca_new (struct ivopts_data *data)
! {
!   struct iv_ca *nw = xmalloc (sizeof (struct iv_ca));
! 
!   nw->upto = 0;
!   nw->bad_uses = 0;
!   nw->cand_for_use = xcalloc (n_iv_uses (data), sizeof (struct cost_pair *));
!   nw->n_cand_uses = xcalloc (n_iv_cands (data), sizeof (unsigned));
!   nw->cands = BITMAP_XMALLOC ();
!   nw->n_regs = 0;
!   nw->cand_use_cost = 0;
!   nw->cand_cost = 0;
!   nw->n_invariant_uses = xcalloc (data->max_inv_id + 1, sizeof (unsigned));
!   nw->cost = 0;
! 
!   return nw;
! }
  
! /* Free memory occupied by the set IVS.  */
! 
! static void
! iv_ca_free (struct iv_ca **ivs)
! {
!   free ((*ivs)->cand_for_use);
!   free ((*ivs)->n_cand_uses);
!   BITMAP_XFREE ((*ivs)->cands);
!   free ((*ivs)->n_invariant_uses);
!   free (*ivs);
!   *ivs = NULL;
! }
! 
! /* Dumps IVS to FILE.  */
! 
! static void
! iv_ca_dump (struct ivopts_data *data, FILE *file, struct iv_ca *ivs)
! {
!   const char *pref = "  invariants ";
!   unsigned i;
! 
!   fprintf (file, "  cost %d\n", iv_ca_cost (ivs));
!   bitmap_print (file, ivs->cands, "  candidates ","\n");
! 
!   for (i = 1; i <= data->max_inv_id; i++)
!     if (ivs->n_invariant_uses[i])
!       {
! 	fprintf (file, "%s%d", pref, i);
! 	pref = ", ";
!       }
!   fprintf (file, "\n");
! }
! 
! /* Try changing candidate in IVS to CAND for each use.  Return cost of the
!    new set, and store differences in DELTA.  */
! 
! static unsigned
! iv_ca_extend (struct ivopts_data *data, struct iv_ca *ivs,
! 	      struct iv_cand *cand, struct iv_ca_delta **delta)
! {
!   unsigned i, cost;
!   struct iv_use *use;
!   struct cost_pair *old_cp, *new_cp;
! 
!   *delta = NULL;
!   for (i = 0; i < ivs->upto; i++)
!     {
!       use = iv_use (data, i);
!       old_cp = iv_ca_cand_for_use (ivs, use);
! 
!       if (old_cp
! 	  && old_cp->cand == cand)
! 	continue;
! 
!       new_cp = get_use_iv_cost (data, use, cand);
!       if (!new_cp)
! 	continue;
! 
!       if (!iv_ca_has_deps (ivs, new_cp))
! 	continue;
!       
!       if (!cheaper_cost_pair (new_cp, old_cp))
! 	continue;
! 
!       *delta = iv_ca_delta_add (use, old_cp, new_cp, *delta);
!     }
! 
!   iv_ca_delta_commit (data, ivs, *delta, true);
!   cost = iv_ca_cost (ivs);
!   iv_ca_delta_commit (data, ivs, *delta, false);
  
    return cost;
  }
  
! /* Try narowing set IVS by removing CAND.  Return the cost of
!    the new set and store the differences in DELTA.  */
  
  static unsigned
! iv_ca_narrow (struct ivopts_data *data, struct iv_ca *ivs,
! 	      struct iv_cand *cand, struct iv_ca_delta **delta)
  {
!   unsigned i, ci;
!   struct iv_use *use;
!   struct cost_pair *old_cp, *new_cp, *cp;
!   bitmap_iterator bi;
!   struct iv_cand *cnd;
!   unsigned cost;
! 
!   *delta = NULL;
!   for (i = 0; i < n_iv_uses (data); i++)
!     {
!       use = iv_use (data, i);
! 
!       old_cp = iv_ca_cand_for_use (ivs, use);
!       if (old_cp->cand != cand)
! 	continue;
! 
!       new_cp = NULL;
! 
!       if (data->consider_all_candidates)
! 	{
! 	  EXECUTE_IF_SET_IN_BITMAP (ivs->cands, 0, ci, bi)
! 	    {
! 	      if (ci == cand->id)
! 		continue;
! 
! 	      cnd = iv_cand (data, ci);
! 
! 	      cp = get_use_iv_cost (data, use, cnd);
! 	      if (!cp)
! 		continue;
! 	      if (!iv_ca_has_deps (ivs, cp))
! 		continue;
!       
! 	      if (!cheaper_cost_pair (cp, new_cp))
! 		continue;
! 
! 	      new_cp = cp;
! 	    }
! 	}
!       else
! 	{
! 	  EXECUTE_IF_AND_IN_BITMAP (use->related_cands, ivs->cands, 0, ci, bi)
! 	    {
! 	      if (ci == cand->id)
! 		continue;
! 
! 	      cnd = iv_cand (data, ci);
! 
! 	      cp = get_use_iv_cost (data, use, cnd);
! 	      if (!cp)
! 		continue;
! 	      if (!iv_ca_has_deps (ivs, cp))
! 		continue;
!       
! 	      if (!cheaper_cost_pair (cp, new_cp))
! 		continue;
! 
! 	      new_cp = cp;
! 	    }
! 	}
! 
!       if (!new_cp)
! 	{
! 	  iv_ca_delta_free (delta);
! 	  return INFTY;
! 	}
! 
!       *delta = iv_ca_delta_add (use, old_cp, new_cp, *delta);
!     }
! 
!   iv_ca_delta_commit (data, ivs, *delta, true);
!   cost = iv_ca_cost (ivs);
!   iv_ca_delta_commit (data, ivs, *delta, false);
! 
!   return cost;
  }
  
! /* Tries to extend the sets IVS in the best possible way in order
     to express the USE.  */
  
  static bool
! try_add_cand_for (struct ivopts_data *data, struct iv_ca *ivs,
  		  struct iv_use *use)
  {
!   unsigned best_cost, act_cost;
    unsigned i;
    bitmap_iterator bi;
    struct iv_cand *cand;
!   struct iv_ca_delta *best_delta = NULL, *act_delta;
!   struct cost_pair *cp;
! 
!   iv_ca_add_use (data, ivs, use);
!   best_cost = iv_ca_cost (ivs);
  
!   cp = iv_ca_cand_for_use (ivs, use);
!   if (cp)
!     {
!       best_delta = iv_ca_delta_add (use, NULL, cp, NULL);
!       iv_ca_set_no_cp (data, ivs, use);
!     }
  
    /* First try important candidates.  Only if it fails, try the specific ones.
       Rationale -- in loops with many variables the best choice often is to use
*************** try_add_cand_for (struct ivopts_data *da
*** 3617,3639 ****
      {
        cand = iv_cand (data, i);
  
!       if (get_use_iv_cost (data, use, cand, &depends_on) == INFTY)
  	continue;
  
!       bitmap_copy (act_ivs, ivs);
!       bitmap_set_bit (act_ivs, cand->id);
!       if (depends_on)
! 	bitmap_ior (act_inv, inv, depends_on);
!       else
! 	bitmap_copy (act_inv, inv);
!       act_cost = set_cost_up_to (data, act_ivs, act_inv, use->id + 1);
  
        if (act_cost < best_cost)
  	{
  	  best_cost = act_cost;
! 	  bitmap_copy (best_ivs, act_ivs);
! 	  bitmap_copy (best_inv, act_inv);
  	}
      }
  
    if (best_cost == INFTY)
--- 4028,4054 ----
      {
        cand = iv_cand (data, i);
  
!       if (iv_ca_cand_used_p (ivs, cand))
  	continue;
  
!       cp = get_use_iv_cost (data, use, cand);
!       if (!cp)
! 	continue;
! 
!       iv_ca_set_cp (data, ivs, use, cp);
!       act_cost = iv_ca_extend (data, ivs, cand, &act_delta);
!       iv_ca_set_no_cp (data, ivs, use);
!       act_delta = iv_ca_delta_add (use, NULL, cp, act_delta);
  
        if (act_cost < best_cost)
  	{
  	  best_cost = act_cost;
! 
! 	  iv_ca_delta_free (&best_delta);
! 	  best_delta = act_delta;
  	}
+       else
+ 	iv_ca_delta_free (&act_delta);
      }
  
    if (best_cost == INFTY)
*************** try_add_cand_for (struct ivopts_data *da
*** 3641,3772 ****
        for (i = 0; i < use->n_map_members; i++)
  	{
  	  cp = use->cost_map + i;
! 	  if (cp->cost == INFTY)
  	    continue;
  
  	  /* Already tried this.  */
! 	  if (cp->cand->important)
  	    continue;
  
! 	  bitmap_copy (act_ivs, ivs);
! 	  bitmap_set_bit (act_ivs, cp->cand->id);
! 	  if (cp->depends_on)
! 	    bitmap_ior (act_inv, inv, cp->depends_on);
! 	  else
! 	    bitmap_copy (act_inv, inv);
! 	  act_cost = set_cost_up_to (data, act_ivs, act_inv, use->id + 1);
  
  	  if (act_cost < best_cost)
  	    {
  	      best_cost = act_cost;
! 	      bitmap_copy (best_ivs, act_ivs);
! 	      bitmap_copy (best_inv, act_inv);
  	    }
  	}
      }
  
!   bitmap_copy (ivs, best_ivs);
!   bitmap_copy (inv, best_inv);
! 
!   BITMAP_XFREE (best_ivs);
!   BITMAP_XFREE (best_inv);
!   BITMAP_XFREE (act_ivs);
!   BITMAP_XFREE (act_inv);
  
    return (best_cost != INFTY);
  }
  
! /* Finds an initial set of IVS and invariants INV.  We do this by simply
!    choosing the best candidate for each use.  */
  
! static unsigned
! get_initial_solution (struct ivopts_data *data, bitmap ivs, bitmap inv)
  {
    unsigned i;
  
    for (i = 0; i < n_iv_uses (data); i++)
!     if (!try_add_cand_for (data, ivs, inv, iv_use (data, i)))
!       return INFTY;
  
!   return set_cost (data, ivs, inv);
  }
  
! /* Tries to improve set of induction variables IVS and invariants INV to get
!    it better than COST.  */
  
  static bool
! try_improve_iv_set (struct ivopts_data *data,
! 		    bitmap ivs, bitmap inv, unsigned *cost)
  {
!   unsigned i, acost;
!   bitmap new_ivs = BITMAP_XMALLOC (), new_inv = BITMAP_XMALLOC ();
!   bitmap best_new_ivs = NULL, best_new_inv = NULL;
  
    /* Try altering the set of induction variables by one.  */
    for (i = 0; i < n_iv_cands (data); i++)
      {
!       bitmap_copy (new_ivs, ivs);
!       bitmap_copy (new_inv, inv);
! 
!       if (bitmap_bit_p (ivs, i))
! 	bitmap_clear_bit (new_ivs, i);
        else
! 	bitmap_set_bit (new_ivs, i);
! 
!       acost = set_cost (data, new_ivs, new_inv);
!       if (acost >= *cost)
! 	continue;
  
!       if (!best_new_ivs)
  	{
! 	  best_new_ivs = BITMAP_XMALLOC ();
! 	  best_new_inv = BITMAP_XMALLOC ();
  	}
- 
-       *cost = acost;
-       bitmap_copy (best_new_ivs, new_ivs);
-       bitmap_copy (best_new_inv, new_inv);
-     }
- 
-   /* Ditto for invariants.  */
-   for (i = 1; i <= data->max_inv_id; i++)
-     {
-       if (ver_info (data, i)->has_nonlin_use)
- 	continue;
- 
-       bitmap_copy (new_ivs, ivs);
-       bitmap_copy (new_inv, inv);
- 
-       if (bitmap_bit_p (inv, i))
- 	bitmap_clear_bit (new_inv, i);
        else
! 	bitmap_set_bit (new_inv, i);
! 
!       acost = set_cost (data, new_ivs, new_inv);
!       if (acost >= *cost)
! 	continue;
! 
!       if (!best_new_ivs)
! 	{
! 	  best_new_ivs = BITMAP_XMALLOC ();
! 	  best_new_inv = BITMAP_XMALLOC ();
! 	}
! 
!       *cost = acost;
!       bitmap_copy (best_new_ivs, new_ivs);
!       bitmap_copy (best_new_inv, new_inv);
      }
  
!   BITMAP_XFREE (new_ivs);
!   BITMAP_XFREE (new_inv);
! 
!   if (!best_new_ivs)
      return false;
  
!   bitmap_copy (ivs, best_new_ivs);
!   bitmap_copy (inv, best_new_inv);
!   BITMAP_XFREE (best_new_ivs);
!   BITMAP_XFREE (best_new_inv);
    return true;
  }
  
--- 4056,4151 ----
        for (i = 0; i < use->n_map_members; i++)
  	{
  	  cp = use->cost_map + i;
! 	  cand = cp->cand;
! 	  if (!cand)
  	    continue;
  
  	  /* Already tried this.  */
! 	  if (cand->important)
! 	    continue;
!       
! 	  if (iv_ca_cand_used_p (ivs, cand))
  	    continue;
  
! 	  act_delta = NULL;
! 	  iv_ca_set_cp (data, ivs, use, cp);
! 	  act_cost = iv_ca_extend (data, ivs, cand, &act_delta);
! 	  iv_ca_set_no_cp (data, ivs, use);
! 	  act_delta = iv_ca_delta_add (use, iv_ca_cand_for_use (ivs, use),
! 				       cp, act_delta);
  
  	  if (act_cost < best_cost)
  	    {
  	      best_cost = act_cost;
! 
! 	      if (best_delta)
! 		iv_ca_delta_free (&best_delta);
! 	      best_delta = act_delta;
  	    }
+ 	  else
+ 	    iv_ca_delta_free (&act_delta);
  	}
      }
  
!   iv_ca_delta_commit (data, ivs, best_delta, true);
!   iv_ca_delta_free (&best_delta);
  
    return (best_cost != INFTY);
  }
  
! /* Finds an initial assignment of candidates to uses.  */
  
! static struct iv_ca *
! get_initial_solution (struct ivopts_data *data)
  {
+   struct iv_ca *ivs = iv_ca_new (data);
    unsigned i;
  
    for (i = 0; i < n_iv_uses (data); i++)
!     if (!try_add_cand_for (data, ivs, iv_use (data, i)))
!       {
! 	iv_ca_free (&ivs);
! 	return NULL;
!       }
  
!   return ivs;
  }
  
! /* Tries to improve set of induction variables IVS.  */
  
  static bool
! try_improve_iv_set (struct ivopts_data *data, struct iv_ca *ivs)
  {
!   unsigned i, acost, best_cost = iv_ca_cost (ivs);
!   struct iv_ca_delta *best_delta = NULL, *act_delta;
!   struct iv_cand *cand;
  
    /* Try altering the set of induction variables by one.  */
    for (i = 0; i < n_iv_cands (data); i++)
      {
!       cand = iv_cand (data, i);
!       
!       if (iv_ca_cand_used_p (ivs, cand))
! 	acost = iv_ca_narrow (data, ivs, cand, &act_delta);
        else
! 	acost = iv_ca_extend (data, ivs, cand, &act_delta);
  
!       if (acost < best_cost)
  	{
! 	  best_cost = acost;
! 	  if (best_delta)
! 	    iv_ca_delta_free (&best_delta);
! 	  best_delta = act_delta;
  	}
        else
! 	iv_ca_delta_free (&act_delta);
      }
  
!   if (!best_delta)
      return false;
  
!   iv_ca_delta_commit (data, ivs, best_delta, true);
!   iv_ca_delta_free (&best_delta);
    return true;
  }
  
*************** try_improve_iv_set (struct ivopts_data *
*** 3774,3840 ****
     greedy heuristic -- we try to replace at most one candidate in the selected
     solution and remove the unused ivs while this improves the cost.  */
  
! static bitmap
  find_optimal_iv_set (struct ivopts_data *data)
  {
!   unsigned cost, i;
!   bitmap set = BITMAP_XMALLOC ();
!   bitmap inv = BITMAP_XMALLOC ();
    struct iv_use *use;
  
!   data->important_candidates = BITMAP_XMALLOC ();
!   for (i = 0; i < n_iv_cands (data); i++)
!     {
!       struct iv_cand *cand = iv_cand (data, i);
! 
!       if (cand->important)
! 	bitmap_set_bit (data->important_candidates, i);
!     }
! 
!   /* Set the upper bound.  */
!   cost = get_initial_solution (data, set, inv);
!   if (cost == INFTY)
      {
        if (dump_file && (dump_flags & TDF_DETAILS))
  	fprintf (dump_file, "Unable to substitute for ivs, failed.\n");
-       BITMAP_XFREE (inv);
-       BITMAP_XFREE (set);
        return NULL;
      }
  
    if (dump_file && (dump_flags & TDF_DETAILS))
      {
!       fprintf (dump_file, "Initial set of candidates (cost %d): ", cost);
!       bitmap_print (dump_file, set, "", "");
!       fprintf (dump_file, " invariants ");
!       bitmap_print (dump_file, inv, "", "");
!       fprintf (dump_file, "\n");
      }
  
!   while (try_improve_iv_set (data, set, inv, &cost))
      {
        if (dump_file && (dump_flags & TDF_DETAILS))
  	{
! 	  fprintf (dump_file, "Improved to (cost %d): ", cost);
! 	  bitmap_print (dump_file, set, "", "");
! 	  fprintf (dump_file, " invariants ");
! 	  bitmap_print (dump_file, inv, "", "");
! 	  fprintf (dump_file, "\n");
  	}
      }
  
    if (dump_file && (dump_flags & TDF_DETAILS))
!     fprintf (dump_file, "Final cost %d\n\n", cost);
  
    for (i = 0; i < n_iv_uses (data); i++)
      {
        use = iv_use (data, i);
!       find_best_candidate (data, use, set, inv, NULL, NULL, &use->selected);
      }
  
-   BITMAP_XFREE (inv);
-   BITMAP_XFREE (data->important_candidates);
- 
    return set;
  }
  
--- 4153,4198 ----
     greedy heuristic -- we try to replace at most one candidate in the selected
     solution and remove the unused ivs while this improves the cost.  */
  
! static struct iv_ca *
  find_optimal_iv_set (struct ivopts_data *data)
  {
!   unsigned i;
!   struct iv_ca *set;
    struct iv_use *use;
  
!   /* Get the initial solution.  */
!   set = get_initial_solution (data);
!   if (!set)
      {
        if (dump_file && (dump_flags & TDF_DETAILS))
  	fprintf (dump_file, "Unable to substitute for ivs, failed.\n");
        return NULL;
      }
  
    if (dump_file && (dump_flags & TDF_DETAILS))
      {
!       fprintf (dump_file, "Initial set of candidates:\n");
!       iv_ca_dump (data, dump_file, set);
      }
  
!   while (try_improve_iv_set (data, set))
      {
        if (dump_file && (dump_flags & TDF_DETAILS))
  	{
! 	  fprintf (dump_file, "Improved to:\n");
! 	  iv_ca_dump (data, dump_file, set);
  	}
      }
  
    if (dump_file && (dump_flags & TDF_DETAILS))
!     fprintf (dump_file, "Final cost %d\n\n", iv_ca_cost (set));
  
    for (i = 0; i < n_iv_uses (data); i++)
      {
        use = iv_use (data, i);
!       use->selected = iv_ca_cand_for_use (set, use)->cand;
      }
  
    return set;
  }
  
*************** create_new_iv (struct ivopts_data *data,
*** 3884,3896 ****
  /* Creates new induction variables described in SET.  */
  
  static void
! create_new_ivs (struct ivopts_data *data, bitmap set)
  {
    unsigned i;
    struct iv_cand *cand;
    bitmap_iterator bi;
  
!   EXECUTE_IF_SET_IN_BITMAP (set, 0, i, bi)
      {
        cand = iv_cand (data, i);
        create_new_iv (data, cand);
--- 4242,4254 ----
  /* Creates new induction variables described in SET.  */
  
  static void
! create_new_ivs (struct ivopts_data *data, struct iv_ca *set)
  {
    unsigned i;
    struct iv_cand *cand;
    bitmap_iterator bi;
  
!   EXECUTE_IF_SET_IN_BITMAP (set->cands, 0, i, bi)
      {
        cand = iv_cand (data, i);
        create_new_iv (data, cand);
*************** free_loop_data (struct ivopts_data *data
*** 4422,4427 ****
--- 4780,4786 ----
        info->inv_id = 0;
      }
    bitmap_clear (data->relevant);
+   bitmap_clear (data->important_candidates);
  
    for (i = 0; i < n_iv_uses (data); i++)
      {
*************** tree_ssa_iv_optimize_finalize (struct lo
*** 4484,4489 ****
--- 4843,4849 ----
    free_loop_data (data);
    free (data->version_info);
    BITMAP_XFREE (data->relevant);
+   BITMAP_XFREE (data->important_candidates);
  
    VARRAY_FREE (decl_rtl_to_reset);
    VARRAY_FREE (data->iv_uses);
*************** static bool
*** 4496,4502 ****
  tree_ssa_iv_optimize_loop (struct ivopts_data *data, struct loop *loop)
  {
    bool changed = false;
!   bitmap iv_set;
    edge exit;
  
    data->current_loop = loop;
--- 4856,4862 ----
  tree_ssa_iv_optimize_loop (struct ivopts_data *data, struct loop *loop)
  {
    bool changed = false;
!   struct iv_ca *iv_ca;
    edge exit;
  
    data->current_loop = loop;
*************** tree_ssa_iv_optimize_loop (struct ivopts
*** 4536,4548 ****
    determine_set_costs (data);
  
    /* Find the optimal set of induction variables (item 3, part 2).  */
!   iv_set = find_optimal_iv_set (data);
!   if (!iv_set)
      goto finish;
    changed = true;
  
    /* Create the new induction variables (item 4, part 1).  */
!   create_new_ivs (data, iv_set);
    
    /* Rewrite the uses (item 4, part 2).  */
    rewrite_uses (data);
--- 4896,4909 ----
    determine_set_costs (data);
  
    /* Find the optimal set of induction variables (item 3, part 2).  */
!   iv_ca = find_optimal_iv_set (data);
!   if (!iv_ca)
      goto finish;
    changed = true;
  
    /* Create the new induction variables (item 4, part 1).  */
!   create_new_ivs (data, iv_ca);
!   iv_ca_free (&iv_ca);
    
    /* Rewrite the uses (item 4, part 2).  */
    rewrite_uses (data);
*************** tree_ssa_iv_optimize_loop (struct ivopts
*** 4552,4559 ****
  
    loop_commit_inserts ();
  
-   BITMAP_XFREE (iv_set);
- 
    /* We have changed the structure of induction variables; it might happen
       that definitions in the scev database refer to some of them that were
       eliminated.  */
--- 4913,4918 ----

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [patch] Speed up ivopts
  2004-11-06 21:22 [patch] Speed up ivopts Zdenek Dvorak
@ 2004-11-08 19:20 ` Richard Henderson
  2004-11-08 20:08   ` Zdenek Dvorak
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Henderson @ 2004-11-08 19:20 UTC (permalink / raw)
  To: Zdenek Dvorak; +Cc: gcc-patches

On Sat, Nov 06, 2004 at 09:53:53PM +0100, Zdenek Dvorak wrote:
> ! static const unsigned nzb_tab[] =

This is not ok.  At minimum you can put this out of line.  But as for
how to use __builtin_ctz, see the use of __builtin_clz in toplev.h.

> + /* Advance to the next nonzero bit in the intersection of
> +    complemented bitmaps.  We will have already advanced past the just
> +    iterated bit.  */
> + 
> + static inline bool
> + bmp_iter_and_compl (bitmap_iterator *bi, unsigned *bit_no)
> + {
> +   if (bi->bits)
> +     {
> +       bmp_iter_advance (bi, bit_no);
> +       return true;
> +     }
> +   return bmp_iter_and_compl_1 (bi, bit_no);

With the bits-in-word part factored out and inlined, does it make
sense to move bmp_iter_and_compl_1 (etc) out of line?

The actual iv parts look ok.


r~

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [patch] Speed up ivopts
  2004-11-08 19:20 ` Richard Henderson
@ 2004-11-08 20:08   ` Zdenek Dvorak
  2004-11-08 20:13     ` Richard Henderson
  0 siblings, 1 reply; 4+ messages in thread
From: Zdenek Dvorak @ 2004-11-08 20:08 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

Hello,

> On Sat, Nov 06, 2004 at 09:53:53PM +0100, Zdenek Dvorak wrote:
> > ! static const unsigned nzb_tab[] =
> 
> This is not ok.  At minimum you can put this out of line.  But as for
> how to use __builtin_ctz, see the use of __builtin_clz in toplev.h.
>
> > + /* Advance to the next nonzero bit in the intersection of
> > +    complemented bitmaps.  We will have already advanced past the just
> > +    iterated bit.  */
> > + 
> > + static inline bool
> > + bmp_iter_and_compl (bitmap_iterator *bi, unsigned *bit_no)
> > + {
> > +   if (bi->bits)
> > +     {
> > +       bmp_iter_advance (bi, bit_no);
> > +       return true;
> > +     }
> > +   return bmp_iter_and_compl_1 (bi, bit_no);
> 
> With the bits-in-word part factored out and inlined, does it make
> sense to move bmp_iter_and_compl_1 (etc) out of line?

I do not think so.  Unless these functions are fully inlined, sra won't
decompose the iterators and we are going to lose a lot on compile time
performace (or do you mean something else by "putting out of line"?

Zdenek

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [patch] Speed up ivopts
  2004-11-08 20:08   ` Zdenek Dvorak
@ 2004-11-08 20:13     ` Richard Henderson
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Henderson @ 2004-11-08 20:13 UTC (permalink / raw)
  To: Zdenek Dvorak; +Cc: gcc-patches

On Mon, Nov 08, 2004 at 09:04:50PM +0100, Zdenek Dvorak wrote:
> I do not think so.  Unless these functions are fully inlined, sra won't
> decompose the iterators and we are going to lose a lot on compile time
> performace (or do you mean something else by "putting out of line"?

Hum..  I guess.


r~

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2004-11-08 20:07 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-11-06 21:22 [patch] Speed up ivopts Zdenek Dvorak
2004-11-08 19:20 ` Richard Henderson
2004-11-08 20:08   ` Zdenek Dvorak
2004-11-08 20:13     ` Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).