From: "William J. Schmidt"
To: gcc-patches@gcc.gnu.org
Cc: bergner@vnet.ibm.com, rguenther@suse.de
Subject: Ping: [PATCH] Fix PR46556 (straight-line strength reduction, part 2)
Date: Sun, 22 Jul 2012 15:20:00 -0000
Message-ID: <1342970377.3470.75.camel@gnopaine>
In-Reply-To: <1340919932.2861.15.camel@gnopaine>
References: <1340919932.2861.15.camel@gnopaine>

Ping...

On Thu, 2012-06-28 at 16:45 -0500, William J. Schmidt wrote:
> Here's a relatively small piece of strength reduction that solves that
> pesky addressing bug that got me looking at this in the first place...
>
> The main part of the code is the stuff that was reviewed last year, but
> which needed to find a good home. So hopefully that's in pretty good
> shape. I recast base_cand_map as an htab again since I now need to look
> up trees other than SSA names. I plan to put together a follow-up patch
> to change code and commentary references so that "base_name" becomes
> "base_expr". Doing that now would clutter up the patch too much.
>
> Bootstrapped and tested on powerpc64-linux-gnu with no new regressions.
> Ok for trunk?
>
> Thanks,
> Bill
>
>
> gcc:
>
>         PR tree-optimization/46556
>         * gimple-ssa-strength-reduction.c (enum cand_kind): Add CAND_REF.
>         (base_cand_map): Change to hash table.
>         (base_cand_hash): New function.
>         (base_cand_free): Likewise.
>         (base_cand_eq): Likewise.
>         (lookup_cand): Change base_cand_map to hash table.
>         (find_basis_for_candidate): Likewise.
>         (base_cand_from_table): Exclude CAND_REF.
>         (restructure_reference): New function.
>         (slsr_process_ref): Likewise.
>         (find_candidates_in_block): Call slsr_process_ref.
>         (dump_candidate): Handle CAND_REF.
>         (base_cand_dump_callback): New function.
>         (dump_cand_chains): Change base_cand_map to hash table.
>         (replace_ref): New function.
>         (replace_refs): Likewise.
>         (analyze_candidates_and_replace): Call replace_refs.
>         (execute_strength_reduction): Change base_cand_map to hash table.
>
> gcc/testsuite:
>
>         PR tree-optimization/46556
>         * testsuite/gcc.dg/tree-ssa/slsr-27.c: New.
>         * testsuite/gcc.dg/tree-ssa/slsr-28.c: New.
>         * testsuite/gcc.dg/tree-ssa/slsr-29.c: New.
>
>
> Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-27.c (revision 0)
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-dom2" } */
> +
> +struct x
> +{
> +  int a[16];
> +  int b[16];
> +  int c[16];
> +};
> +
> +extern void foo (int, int, int);
> +
> +void
> +f (struct x *p, unsigned int n)
> +{
> +  foo (p->a[n], p->c[n], p->b[n]);
> +}
> +
> +/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */
> +/* { dg-final { scan-tree-dump-times "p_\\d\+\\(D\\) \\+ D" 1 "dom2" } } */
> +/* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)D" 3 "dom2" } } */
> +/* { dg-final { cleanup-tree-dump "dom2" } } */
> Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-28.c (revision 0)
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-dom2" } */
> +
> +struct x
> +{
> +  int a[16];
> +  int b[16];
> +  int c[16];
> +};
> +
> +extern void foo (int, int, int);
> +
> +void
> +f (struct x *p, unsigned int n)
> +{
> +  foo (p->a[n], p->c[n], p->b[n]);
> +  if (n > 12)
> +    foo (p->a[n], p->c[n], p->b[n]);
> +  else if (n > 3)
> +    foo (p->b[n], p->a[n], p->c[n]);
> +}
> +
> +/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */
> +/* { dg-final { scan-tree-dump-times "p_\\d\+\\(D\\) \\+ D" 1 "dom2" } } */
> +/* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)D" 9 "dom2" } } */
> +/* { dg-final { cleanup-tree-dump "dom2" } } */
> Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c (revision 0)
> +++ gcc/testsuite/gcc.dg/tree-ssa/slsr-29.c (revision 0)
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-dom2" } */
> +
> +struct x
> +{
> +  int a[16];
> +  int b[16];
> +  int c[16];
> +};
> +
> +extern void foo (int, int, int);
> +
> +void
> +f (struct x *p, unsigned int n)
> +{
> +  foo (p->a[n], p->c[n], p->b[n]);
> +  if (n > 3)
> +    {
> +      foo (p->a[n], p->c[n], p->b[n]);
> +      if (n > 12)
> +        foo (p->b[n], p->a[n], p->c[n]);
> +    }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "\\* 4;" 1 "dom2" } } */
> +/* { dg-final { scan-tree-dump-times "p_\\d\+\\(D\\) \\+ D" 1 "dom2" } } */
> +/* { dg-final { scan-tree-dump-times "MEM\\\[\\(struct x \\*\\)D" 9 "dom2" } } */
> +/* { dg-final { cleanup-tree-dump "dom2" } } */
> Index: gcc/gimple-ssa-strength-reduction.c
> ===================================================================
> --- gcc/gimple-ssa-strength-reduction.c (revision 189025)
> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> @@ -32,7 +32,7 @@ along with GCC; see the file COPYING3. If not see
>      2) Explicit multiplies, unknown constant multipliers,
>         no conditional increments. (data gathering complete,
>         replacements pending)
> -    3) Implicit multiplies in addressing expressions. (pending)
> +    3) Implicit multiplies in addressing expressions. (complete)
>      4) Explicit multiplies, conditional increments. (pending)
>
>     It would also be possible to apply strength reduction to divisions
> @@ -107,9 +107,49 @@ along with GCC; see the file COPYING3. If not see
>
>     as a strength reduction opportunity, even though this S1 would
>     also be replaceable by the S1' above. This can be added if it
> -   comes up in practice. */
> +   comes up in practice.
>
> +   Strength reduction in addressing
> +   --------------------------------
> +   There is another kind of candidate known as CAND_REF. A CAND_REF
> +   describes a statement containing a memory reference having
> +   complex addressing that might benefit from strength reduction.
> +   Specifically, we are interested in references for which
> +   get_inner_reference returns a base address, offset, and bitpos as
> +   follows:
>
> +     base:   MEM_REF (T1, C1)
> +     offset: MULT_EXPR (PLUS_EXPR (T2, C2), C3)
> +     bitpos: C4 * BITS_PER_UNIT
> +
> +   Here T1 and T2 are arbitrary trees, and C1, C2, C3, C4 are
> +   arbitrary integer constants. Note that C2 may be zero, in which
> +   case the offset will be MULT_EXPR (T2, C3).
> +
> +   When this pattern is recognized, the original memory reference
> +   can be replaced with:
> +
> +     MEM_REF (POINTER_PLUS_EXPR (T1, MULT_EXPR (T2, C3)),
> +              C1 + (C2 * C3) + C4)
> +
> +   which distributes the multiply to allow constant folding. When
> +   two or more addressing expressions can be represented by MEM_REFs
> +   of this form, differing only in the constants C1, C2, and C4,
> +   making this substitution produces more efficient addressing during
> +   the RTL phases. When there are not at least two expressions with
> +   the same values of T1, T2, and C3, there is nothing to be gained
> +   by the replacement.
> +
> +   Strength reduction of CAND_REFs uses the same infrastructure as
> +   that used by CAND_MULTs and CAND_ADDs. We record T1 in the base (B)
> +   field, MULT_EXPR (T2, C3) in the stride (S) field, and
> +   C1 + (C2 * C3) + C4 in the index (i) field. A basis for a CAND_REF
> +   is thus another CAND_REF with the same B and S values. When at
> +   least two CAND_REFs are chained together using the basis relation,
> +   each of them is replaced as above, resulting in improved code
> +   generation for addressing. */
> +
> +
>  /* Index into the candidate vector, offset by 1. VECs are zero-based,
>     while cand_idx's are one-based, with zero indicating null. */
>  typedef unsigned cand_idx;
> @@ -118,7 +158,8 @@ typedef unsigned cand_idx;
>  enum cand_kind
>  {
>    CAND_MULT,
> -  CAND_ADD
> +  CAND_ADD,
> +  CAND_REF
>  };
>
>  struct slsr_cand_d
> @@ -137,7 +178,9 @@ struct slsr_cand_d
>
>    /* The type of the candidate. This is normally the type of base_name,
>       but casts may have occurred when combining feeding instructions.
> -     A candidate can only be a basis for candidates of the same final type. */
> +     A candidate can only be a basis for candidates of the same final type.
> +     (For CAND_REFs, this is the type to be used for operand 1 of the
> +     replacement MEM_REF.) */
>    tree cand_type;
>
>    /* The kind of candidate (CAND_MULT, etc.). */
> @@ -211,8 +254,8 @@ static struct pointer_map_t *stmt_cand_map;
>  /* Obstack for candidates. */
>  static struct obstack cand_obstack;
>
> -/* Array mapping from base SSA names to chains of candidates. */
> -static cand_chain_t *base_cand_map;
> +/* Hash table embodying a mapping from base names to chains of candidates. */
> +static htab_t base_cand_map;
>
>  /* Obstack for candidate chains. */
>  static struct obstack chain_obstack;
> @@ -225,6 +268,33 @@ lookup_cand (cand_idx idx)
>    return VEC_index (slsr_cand_t, cand_vec, idx - 1);
>  }
>
> +/* Callback to produce a hash value for a candidate chain header. */
> +
> +static hashval_t
> +base_cand_hash (const void *p)
> +{
> +  tree base_expr = ((const_cand_chain_t) p)->base_name;
> +  return iterative_hash_expr (base_expr, 0);
> +}
> +
> +/* Callback when an element is removed from the hash table.
> +   We never remove entries until the entire table is released. */
> +
> +static void
> +base_cand_free (void *p ATTRIBUTE_UNUSED)
> +{
> +}
> +
> +/* Callback to return true if two candidate chain headers are equal. */
> +
> +static int
> +base_cand_eq (const void *p1, const void *p2)
> +{
> +  const_cand_chain_t const chain1 = (const_cand_chain_t) p1;
> +  const_cand_chain_t const chain2 = (const_cand_chain_t) p2;
> +  return operand_equal_p (chain1->base_name, chain2->base_name, 0);
> +}
> +
>  /* Use the base name from candidate C to look for possible candidates
>     that can serve as a basis for C. Each potential basis must also
>     appear in a block that dominates the candidate statement and have
> @@ -235,11 +305,12 @@ lookup_cand (cand_idx idx)
>  static int
>  find_basis_for_candidate (slsr_cand_t c)
>  {
> +  cand_chain mapping_key;
>    cand_chain_t chain;
>    slsr_cand_t basis = NULL;
>
> -  gcc_assert (TREE_CODE (c->base_name) == SSA_NAME);
> -  chain = base_cand_map[SSA_NAME_VERSION (c->base_name)];
> +  mapping_key.base_name = c->base_name;
> +  chain = (cand_chain_t) htab_find (base_cand_map, &mapping_key);
>
>    for (; chain; chain = chain->next)
>      {
> @@ -273,23 +344,23 @@ find_basis_for_candidate (slsr_cand_t c)
>  static void
>  record_potential_basis (slsr_cand_t c)
>  {
> -  cand_chain_t node, head;
> -  int index;
> +  cand_chain_t node;
> +  void **slot;
>
>    node = (cand_chain_t) obstack_alloc (&chain_obstack, sizeof (cand_chain));
>    node->base_name = c->base_name;
>    node->cand = c;
>    node->next = NULL;
> -  index = SSA_NAME_VERSION (c->base_name);
> -  head = base_cand_map[index];
> +  slot = htab_find_slot (base_cand_map, node, INSERT);
>
> -  if (head)
> +  if (*slot)
>      {
> +      cand_chain_t head = (cand_chain_t) (*slot);
>        node->next = head->next;
>        head->next = node;
>      }
>    else
> -    base_cand_map[index] = node;
> +    *slot = node;
>  }
>
>  /* Allocate storage for a new candidate and initialize its fields.
> @@ -390,10 +461,11 @@ base_cand_from_table (tree base_in)
>      return (slsr_cand_t) NULL;
>
>    result = (slsr_cand_t *) pointer_map_contains (stmt_cand_map, def);
> -  if (!result)
> -    return (slsr_cand_t) NULL;
> +
> +  if (result && (*result)->kind != CAND_REF)
> +    return *result;
>
> -  return *result;
> +  return (slsr_cand_t) NULL;
>  }
>
>  /* Add an entry to the statement-to-candidate mapping. */
> @@ -406,6 +478,127 @@ add_cand_for_stmt (gimple gs, slsr_cand_t c)
>    *slot = c;
>  }
>
> +/* Look for the following pattern:
> +
> +     *PBASE:   MEM_REF (T1, C1)
> +
> +     *POFFSET: MULT_EXPR (T2, C3)        [C2 is zero]
> +                  or
> +               MULT_EXPR (PLUS_EXPR (T2, C2), C3)
> +                  or
> +               MULT_EXPR (MINUS_EXPR (T2, -C2), C3)
> +
> +     *PINDEX:  C4 * BITS_PER_UNIT
> +
> +   If not present, leave the input values unchanged and return FALSE.
> +   Otherwise, modify the input values as follows and return TRUE:
> +
> +     *PBASE:   T1
> +     *POFFSET: MULT_EXPR (T2, C3)
> +     *PINDEX:  C1 + (C2 * C3) + C4 */
> +
> +static bool
> +restructure_reference (tree *pbase, tree *poffset, double_int *pindex,
> +                       tree *ptype)
> +{
> +  tree base = *pbase, offset = *poffset;
> +  double_int index = *pindex;
> +  double_int bpu = uhwi_to_double_int (BITS_PER_UNIT);
> +  tree mult_op0, mult_op1, t1, t2, type;
> +  double_int c1, c2, c3, c4;
> +
> +  if (!base
> +      || !offset
> +      || TREE_CODE (base) != MEM_REF
> +      || TREE_CODE (offset) != MULT_EXPR
> +      || TREE_CODE (TREE_OPERAND (offset, 1)) != INTEGER_CST
> +      || !double_int_zero_p (double_int_umod (index, bpu, FLOOR_MOD_EXPR)))
> +    return false;
> +
> +  t1 = TREE_OPERAND (base, 0);
> +  c1 = mem_ref_offset (base);
> +  type = TREE_TYPE (TREE_OPERAND (base, 1));
> +
> +  mult_op0 = TREE_OPERAND (offset, 0);
> +  mult_op1 = TREE_OPERAND (offset, 1);
> +
> +  c3 = tree_to_double_int (mult_op1);
> +
> +  if (TREE_CODE (mult_op0) == PLUS_EXPR)
> +
> +    if (TREE_CODE (TREE_OPERAND (mult_op0, 1)) == INTEGER_CST)
> +      {
> +        t2 = TREE_OPERAND (mult_op0, 0);
> +        c2 = tree_to_double_int (TREE_OPERAND (mult_op0, 1));
> +      }
> +    else
> +      return false;
> +
> +  else if (TREE_CODE (mult_op0) == MINUS_EXPR)
> +
> +    if (TREE_CODE (TREE_OPERAND (mult_op0, 1)) == INTEGER_CST)
> +      {
> +        t2 = TREE_OPERAND (mult_op0, 0);
> +        c2 = double_int_neg (tree_to_double_int (TREE_OPERAND (mult_op0, 1)));
> +      }
> +    else
> +      return false;
> +
> +  else
> +    {
> +      t2 = mult_op0;
> +      c2 = double_int_zero;
> +    }
> +
> +  c4 = double_int_udiv (index, bpu, FLOOR_DIV_EXPR);
> +
> +  *pbase = t1;
> +  *poffset = fold_build2 (MULT_EXPR, sizetype, t2,
> +                          double_int_to_tree (sizetype, c3));
> +  *pindex = double_int_add (double_int_add (c1, double_int_mul (c2, c3)), c4);
> +  *ptype = type;
> +
> +  return true;
> +}
> +
> +/* Given GS which contains a data reference, create a CAND_REF entry in
> +   the candidate table and attempt to find a basis. */
> +
> +static void
> +slsr_process_ref (gimple gs)
> +{
> +  tree ref_expr, base, offset, type;
> +  HOST_WIDE_INT bitsize, bitpos;
> +  enum machine_mode mode;
> +  int unsignedp, volatilep;
> +  double_int index;
> +  slsr_cand_t c;
> +
> +  if (gimple_vdef (gs))
> +    ref_expr = gimple_assign_lhs (gs);
> +  else
> +    ref_expr = gimple_assign_rhs1 (gs);
> +
> +  if (!handled_component_p (ref_expr)
> +      || TREE_CODE (ref_expr) == BIT_FIELD_REF
> +      || (TREE_CODE (ref_expr) == COMPONENT_REF
> +          && DECL_BIT_FIELD (TREE_OPERAND (ref_expr, 1))))
> +    return;
> +
> +  base = get_inner_reference (ref_expr, &bitsize, &bitpos, &offset, &mode,
> +                              &unsignedp, &volatilep, false);
> +  index = uhwi_to_double_int (bitpos);
> +
> +  if (!restructure_reference (&base, &offset, &index, &type))
> +    return;
> +
> +  c = alloc_cand_and_find_basis (CAND_REF, gs, base, index, offset,
> +                                 type, 0);
> +
> +  /* Add the candidate to the statement-candidate mapping. */
> +  add_cand_for_stmt (gs, c);
> +}
> +
>  /* Create a candidate entry for a statement GS, where GS multiplies
>     two SSA names BASE_IN and STRIDE_IN. Propagate any known information
>     about the two SSA names into the new candidate. Return the new
> @@ -1056,8 +1249,12 @@ find_candidates_in_block (struct dom_walk_data *wa
>      {
>        gimple gs = gsi_stmt (gsi);
>
> -      if (is_gimple_assign (gs)
> -          && SCALAR_INT_MODE_P (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)))))
> +      if (gimple_vuse (gs) && gimple_assign_single_p (gs))
> +        slsr_process_ref (gs);
> +
> +      else if (is_gimple_assign (gs)
> +               && SCALAR_INT_MODE_P
> +                    (TYPE_MODE (TREE_TYPE (gimple_assign_lhs (gs)))))
>          {
>            tree rhs1 = NULL_TREE, rhs2 = NULL_TREE;
>
> @@ -1151,6 +1348,15 @@ dump_candidate (slsr_cand_t c)
>        print_generic_expr (dump_file, c->stride, 0);
>        fputs (") : ", dump_file);
>        break;
> +    case CAND_REF:
> +      fputs (" REF : ", dump_file);
> +      print_generic_expr (dump_file, c->base_name, 0);
> +      fputs (" + (", dump_file);
> +      print_generic_expr (dump_file, c->stride, 0);
> +      fputs (") + ", dump_file);
> +      dump_double_int (dump_file, c->index, false);
> +      fputs (" : ", dump_file);
> +      break;
>      default:
>        gcc_unreachable ();
>      }
> @@ -1181,36 +1387,33 @@ dump_cand_vec (void)
>      dump_candidate (c);
>  }
>
> -/* Dump the candidate chains. */
> +/* Callback used to dump the candidate chains hash table. */
>
> -static void
> -dump_cand_chains (void)
> +static int
> +base_cand_dump_callback (void **slot, void *ignored ATTRIBUTE_UNUSED)
>  {
> -  unsigned i;
> +  const_cand_chain_t chain = *((const_cand_chain_t *) slot);
> +  cand_chain_t p;
>
> -  fprintf (dump_file, "\nStrength reduction candidate chains:\n\n");
> +  print_generic_expr (dump_file, chain->base_name, 0);
> +  fprintf (dump_file, " -> %d", chain->cand->cand_num);
>
> -  for (i = 0; i < num_ssa_names; i++)
> -    {
> -      const_cand_chain_t chain = base_cand_map[i];
> +  for (p = chain->next; p; p = p->next)
> +    fprintf (dump_file, " -> %d", p->cand->cand_num);
>
> -      if (chain)
> -        {
> -          cand_chain_t p;
> +  fputs ("\n", dump_file);
> +  return 1;
> +}
>
> -          print_generic_expr (dump_file, chain->base_name, 0);
> -          fprintf (dump_file, " -> %d", chain->cand->cand_num);
> +/* Dump the candidate chains. */
>
> -          for (p = chain->next; p; p = p->next)
> -            fprintf (dump_file, " -> %d", p->cand->cand_num);
> -
> -          fputs ("\n", dump_file);
> -        }
> -    }
> -
> +static void
> +dump_cand_chains (void)
> +{
> +  fprintf (dump_file, "\nStrength reduction candidate chains:\n\n");
> +  htab_traverse_noresize (base_cand_map, base_cand_dump_callback, NULL);
>    fputs ("\n", dump_file);
>  }
> -
>
>  /* Recursive helper for unconditional_cands_with_known_stride_p.
>     Returns TRUE iff C, its siblings, and its dependents are all
> @@ -1246,6 +1449,53 @@ unconditional_cands_with_known_stride_p (slsr_cand
>    return unconditional_cands (lookup_cand (root->dependent));
>  }
>
> +/* Replace *EXPR in candidate C with an equivalent strength-reduced
> +   data reference. */
> +
> +static void
> +replace_ref (tree *expr, slsr_cand_t c)
> +{
> +  tree add_expr = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (c->base_name),
> +                               c->base_name, c->stride);
> +  tree mem_ref = fold_build2 (MEM_REF, TREE_TYPE (*expr), add_expr,
> +                              double_int_to_tree (c->cand_type, c->index));
> +
> +  /* Gimplify the base addressing expression for the new MEM_REF tree. */
> +  gimple_stmt_iterator gsi = gsi_for_stmt (c->cand_stmt);
> +  TREE_OPERAND (mem_ref, 0)
> +    = force_gimple_operand_gsi (&gsi, TREE_OPERAND (mem_ref, 0),
> +                                /*simple_p=*/true, NULL,
> +                                /*before=*/true, GSI_SAME_STMT);
> +  copy_ref_info (mem_ref, *expr);
> +  *expr = mem_ref;
> +  update_stmt (c->cand_stmt);
> +}
> +
> +/* Replace CAND_REF candidate C, each sibling of candidate C, and each
> +   dependent of candidate C with an equivalent strength-reduced data
> +   reference. */
> +
> +static void
> +replace_refs (slsr_cand_t c)
> +{
> +  if (gimple_vdef (c->cand_stmt))
> +    {
> +      tree *lhs = gimple_assign_lhs_ptr (c->cand_stmt);
> +      replace_ref (lhs, c);
> +    }
> +  else
> +    {
> +      tree *rhs = gimple_assign_rhs1_ptr (c->cand_stmt);
> +      replace_ref (rhs, c);
> +    }
> +
> +  if (c->sibling)
> +    replace_refs (lookup_cand (c->sibling));
> +
> +  if (c->dependent)
> +    replace_refs (lookup_cand (c->dependent));
> +}
> +
>  /* Calculate the increment required for candidate C relative to
>     its basis. */
>
> @@ -1413,13 +1663,18 @@ analyze_candidates_and_replace (void)
>
>        first_dep = lookup_cand (c->dependent);
>
> +      /* If this is a chain of CAND_REFs, unconditionally replace
> +         each of them with a strength-reduced data reference. */
> +      if (c->kind == CAND_REF)
> +        replace_refs (c);
> +
>        /* If the common stride of all related candidates is a
>           known constant, and none of these has a phi-dependence,
>           then all replacements are considered profitable.
>           Each replaces a multiply by a single add, with the
>           possibility that a feeding add also goes dead as a
>           result. */
> -      if (unconditional_cands_with_known_stride_p (c))
> +      else if (unconditional_cands_with_known_stride_p (c))
>          replace_dependents (first_dep);
>
>        /* TODO: When the stride is an SSA name, it may still be
> @@ -1428,9 +1683,6 @@
>           can be reused, or are less expensive to calculate than
>           the replaced statements. */
>
> -      /* TODO: Strength-reduce data references with implicit
> -         multiplication in their addressing expressions. */
> -
>        /* TODO: When conditional increments occur so that a
>           candidate is dependent upon a phi-basis, the cost of
>           introducing a temporary must be accounted for. */
> @@ -1455,8 +1707,8 @@ execute_strength_reduction (void)
>    gcc_obstack_init (&chain_obstack);
>
>    /* Allocate the mapping from base names to candidate chains. */
> -  base_cand_map = XNEWVEC (cand_chain_t, num_ssa_names);
> -  memset (base_cand_map, 0, num_ssa_names * sizeof (cand_chain_t));
> +  base_cand_map = htab_create (500, base_cand_hash,
> +                               base_cand_eq, base_cand_free);
>
>    /* Initialize the loop optimizer. We need to detect flow across
>       back edges, and this gives us dominator information as well. */
> @@ -1490,7 +1742,7 @@ execute_strength_reduction (void)
>    /* Free resources. */
>    fini_walk_dominator_tree (&walk_data);
>    loop_optimizer_finalize ();
> -  free (base_cand_map);
> +  htab_delete (base_cand_map);
>    obstack_free (&chain_obstack, NULL);
>    pointer_map_destroy (stmt_cand_map);
>    VEC_free (slsr_cand_t, heap, cand_vec);
>
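
As an illustrative note on the restructuring the patch describes (not part of the
original message): consider p->b[n] from the new slsr-27.c test, assuming a 4-byte
int so that b[] begins 64 bytes into struct x; the temporary names below are made
up for illustration, not taken from an actual dump. get_inner_reference decomposes
the access roughly as

    base:    MEM_REF (p, 0)          /* T1 = p, C1 = 0 */
    offset:  (sizetype) n * 4        /* T2 = (sizetype) n, C2 = 0, C3 = 4 */
    bitpos:  64 * BITS_PER_UNIT      /* C4 = 64 */

and after restructure_reference/replace_ref the access becomes the equivalent of

    _1 = (sizetype) n * 4;
    _2 = p + _1;                     /* POINTER_PLUS_EXPR */
    ... = MEM[(struct x *)_2 + 64];  /* index = C1 + C2*C3 + C4 = 64 */

The sibling references p->a[n] and p->c[n] can then share the same multiply and
pointer addition, differing only in the constant index (0 and 128), which is what
the "\\* 4;" and "MEM\\[\\(struct x \\*\\)" scans in the new tests check for.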