From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 6655 invoked by alias); 16 Jan 2013 10:29:19 -0000 Received: (qmail 6645 invoked by uid 22791); 16 Jan 2013 10:29:19 -0000 X-SWARE-Spam-Status: No, hits=-5.7 required=5.0 tests=AWL,BAYES_00,KHOP_RCVD_UNTRUST,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD,SPF_HELO_PASS,T_TVD_MIME_NO_HEADERS X-Spam-Check-By: sourceware.org Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 16 Jan 2013 10:29:12 +0000 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r0GATAjb021607 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Wed, 16 Jan 2013 05:29:11 -0500 Received: from freie.oliva.athome.lsd.ic.unicamp.br (ovpn01.gateway.prod.ext.phx2.redhat.com [10.5.9.1]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id r0GAT7sk018953 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 16 Jan 2013 05:29:10 -0500 Received: from livre.localdomain (livre-to-gw.oliva.athome.lsd.ic.unicamp.br [172.31.160.19]) by freie.oliva.athome.lsd.ic.unicamp.br (8.14.5/8.14.5) with ESMTP id r0GAT1n7022360 for ; Wed, 16 Jan 2013 08:29:02 -0200 Received: from livre.localdomain (aoliva@localhost.localdomain [127.0.0.1]) by livre.localdomain (8.14.3/8.14.3/Debian-5+lenny1) with ESMTP id r0GAT0NP008592; Wed, 16 Jan 2013 08:29:00 -0200 Received: (from aoliva@localhost) by livre.localdomain (8.14.3/8.14.3/Submit) id r0GASxsK008590; Wed, 16 Jan 2013 08:28:59 -0200 From: Alexandre Oliva To: gcc-patches@gcc.gnu.org Subject: fix for PR49888 var-tracking compile-time regression Date: Wed, 16 Jan 2013 10:29:00 -0000 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2013-01/txt/msg00836.txt.bz2 --=-=-= Content-length: 702 PR49888 introduced clobber_overlapping_mems to detach VALUEs (and variables bound to them) from MEMs as the MEMs are modified. This turned out to be quite expensive, particularly the computation of canonical addresses passed to alias dependency. This patch introduces caching of the canonical addresses to alleviate this expensive operation. We cache mappings that apply to the entire function, from equivalences recorded in the global cselib table, and mappings that apply only to the local basic block. This cut down 2% of a full regstrap cycle, and about 1% of the time it took to build target libraries for C, C++ and Java. Regstrapped on x86_64-linux-gnu and i686-linux-gnu. Ok to install? --=-=-= Content-Type: text/x-diff Content-Disposition: inline; filename=vta-get-addr-caching-pr54402-pr54114.patch Content-length: 10610 Cache canonical addresses within VTA From: Alexandre Oliva for gcc/ChangeLog PR debug/54114 PR debug/54402 PR debug/49888 * var-tracking.c (negative_power_of_two_p): New. (global_get_addr_cache, local_get_addr_cache): New. (get_addr_from_global_cache, get_addr_from_local_cache): New. (vt_canonicalize_addr): Rewrite using the above. (local_get_addr_clear_given_value): New. (val_reset): Clear local cached entries. (compute_bb_dataflow): Create and release the local cache. (vt_emit_notes): Likewise. (vt_initialize, vt_finalize): Create and release the global cache, respectively. --- gcc/var-tracking.c | 243 +++++++++++++++++++++++++++++++++++++++++++--------- 1 files changed, 200 insertions(+), 43 deletions(-) diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c index 8c76fe0..befde0a 100644 --- a/gcc/var-tracking.c +++ b/gcc/var-tracking.c @@ -1948,6 +1948,14 @@ var_regno_delete (dataflow_set *set, int regno) *reg = NULL; } +/* Return true if I is the negated value of a power of two. */ +static bool +negative_power_of_two_p (HOST_WIDE_INT i) +{ + unsigned HOST_WIDE_INT x = -(unsigned HOST_WIDE_INT)i; + return x == (x & -x); +} + /* Strip constant offsets and alignments off of LOC. Return the base expression. */ @@ -1958,12 +1966,123 @@ vt_get_canonicalize_base (rtx loc) || GET_CODE (loc) == AND) && GET_CODE (XEXP (loc, 1)) == CONST_INT && (GET_CODE (loc) != AND - || INTVAL (XEXP (loc, 1)) < 0)) + || negative_power_of_two_p (INTVAL (XEXP (loc, 1))))) loc = XEXP (loc, 0); return loc; } +/* This caches canonicalized addresses for VALUEs, computed using + information in the global cselib table. */ +static struct pointer_map_t *global_get_addr_cache; + +/* This caches canonicalized addresses for VALUEs, computed using + information from the global cache and information pertaining to a + basic block being analyzed. */ +static struct pointer_map_t *local_get_addr_cache; + +static rtx vt_canonicalize_addr (dataflow_set *, rtx); + +/* Return the canonical address for LOC, that must be a VALUE, using a + cached global equivalence or computing it and storing it in the + global cache. */ + +static rtx +get_addr_from_global_cache (rtx const loc) +{ + rtx x; + void **slot; + + gcc_checking_assert (GET_CODE (loc) == VALUE); + + slot = pointer_map_insert (global_get_addr_cache, loc); + if (*slot) + return (rtx)*slot; + + x = canon_rtx (get_addr (loc)); + + /* Tentative, avoiding infinite recursion. */ + *slot = x; + + if (x != loc) + { + rtx nx = vt_canonicalize_addr (NULL, x); + if (nx != x) + { + /* The table may have moved during recursion, recompute + SLOT. */ + slot = pointer_map_contains (global_get_addr_cache, loc); + *slot = x = nx; + } + } + + return x; +} + +/* Return the canonical address for LOC, that must be a VALUE, using a + cached local equivalence or computing it and storing it in the + local cache. */ + +static rtx +get_addr_from_local_cache (dataflow_set *set, rtx const loc) +{ + rtx x; + void **slot; + decl_or_value dv; + variable var; + location_chain l; + + gcc_checking_assert (GET_CODE (loc) == VALUE); + + slot = pointer_map_insert (local_get_addr_cache, loc); + if (*slot) + return (rtx)*slot; + + x = get_addr_from_global_cache (loc); + + /* Tentative, avoiding infinite recursion. */ + *slot = x; + + /* Recurse to cache local expansion of X, or if we need to search + for a VALUE in the expansion. */ + if (x != loc) + { + rtx nx = vt_canonicalize_addr (set, x); + if (nx != x) + { + slot = pointer_map_contains (local_get_addr_cache, loc); + *slot = x = nx; + } + return x; + } + + dv = dv_from_rtx (x); + var = (variable) htab_find_with_hash (shared_hash_htab (set->vars), + dv, dv_htab_hash (dv)); + if (!var) + return x; + + /* Look for an improved equivalent expression. */ + for (l = var->var_part[0].loc_chain; l; l = l->next) + { + rtx base = vt_get_canonicalize_base (l->loc); + if (GET_CODE (base) == REG + || (GET_CODE (base) == VALUE + && canon_value_cmp (base, loc))) + { + rtx nx = vt_canonicalize_addr (set, l->loc); + if (x != nx) + { + slot = pointer_map_contains (local_get_addr_cache, loc); + *slot = x = nx; + } + break; + } + } + + return x; +} + /* Canonicalize LOC using equivalences from SET in addition to those in the cselib static table. */ @@ -1972,69 +2091,56 @@ vt_canonicalize_addr (dataflow_set *set, rtx oloc) { HOST_WIDE_INT ofst = 0; enum machine_mode mode = GET_MODE (oloc); - rtx loc = canon_rtx (get_addr (oloc)); + rtx loc = oloc; + rtx x; + bool retry = true; - /* Try to substitute a base VALUE for equivalent expressions as much - as possible. The goal here is to expand stack-related addresses - to one of the stack base registers, so that we can compare - addresses for overlaps. */ - while (GET_CODE (vt_get_canonicalize_base (loc)) == VALUE) + while (retry) { - rtx x; - decl_or_value dv; - variable var; - location_chain l; - - while (GET_CODE (loc) == PLUS) + while (GET_CODE (loc) == PLUS + && GET_CODE (XEXP (loc, 1)) == CONST_INT) { ofst += INTVAL (XEXP (loc, 1)); loc = XEXP (loc, 0); - continue; } /* Alignment operations can't normally be combined, so just canonicalize the base and we're done. We'll normally have only one stack alignment anyway. */ - if (GET_CODE (loc) == AND) + if (GET_CODE (loc) == AND + && GET_CODE (XEXP (loc, 1)) == CONST_INT + && negative_power_of_two_p (INTVAL (XEXP (loc, 1)))) { x = vt_canonicalize_addr (set, XEXP (loc, 0)); if (x != XEXP (loc, 0)) loc = gen_rtx_AND (mode, x, XEXP (loc, 1)); - loc = canon_rtx (get_addr (loc)); - break; + retry = false; } - x = canon_rtx (get_addr (loc)); - - /* We've made progress! Start over. */ - if (x != loc || GET_CODE (x) != VALUE) + if (GET_CODE (loc) == VALUE) { - loc = x; - continue; - } - - dv = dv_from_rtx (x); - var = (variable) htab_find_with_hash (shared_hash_htab (set->vars), - dv, dv_htab_hash (dv)); - if (!var) - break; + if (set) + loc = get_addr_from_local_cache (set, loc); + else + loc = get_addr_from_global_cache (loc); - /* Look for an improved equivalent expression. */ - for (l = var->var_part[0].loc_chain; l; l = l->next) - { - rtx base = vt_get_canonicalize_base (l->loc); - if (GET_CODE (base) == REG - || (GET_CODE (base) == VALUE - && canon_value_cmp (base, loc))) + /* Consolidate plus_constants. */ + while (ofst && GET_CODE (loc) == PLUS + && GET_CODE (XEXP (loc, 1)) == CONST_INT) { - loc = l->loc; - break; + ofst += INTVAL (XEXP (loc, 1)); + loc = XEXP (loc, 0); } - } - /* No luck with the dataflow set, so we're done. */ - if (!l) - break; + retry = false; + } + else + { + x = canon_rtx (loc); + if (retry) + retry = (x != loc); + loc = x; + } } /* Add OFST back in. */ @@ -2358,6 +2464,17 @@ val_store (dataflow_set *set, rtx val, rtx loc, rtx insn, bool modified) val_bind (set, val, loc, modified); } +/* Clear (canonical address) slots that reference X. */ + +static bool +local_get_addr_clear_given_value (const void *v ATTRIBUTE_UNUSED, + void **slot, void *x) +{ + if (vt_get_canonicalize_base ((rtx)*slot) == x) + *slot = NULL; + return true; +} + /* Reset this node, detaching all its equivalences. Return the slot in the variable hash table that holds dv, if there is one. */ @@ -2373,6 +2490,28 @@ val_reset (dataflow_set *set, decl_or_value dv) gcc_assert (var->n_var_parts == 1); + if (var->onepart == ONEPART_VALUE) + { + rtx x = dv_as_value (dv); + void **slot; + + /* Relationships in the global cache don't change, so reset the + local cache entry only. */ + slot = pointer_map_contains (local_get_addr_cache, x); + if (slot) + { + /* If the value resolved back to itself, odds are that other + values may have cached it too. These entries now refer + to the old X, so detach them too. Entries that used the + old X but resolved to something else remain ok as long as + that something else isn't also reset. */ + if (*slot == x) + pointer_map_traverse (local_get_addr_cache, + local_get_addr_clear_given_value, x); + *slot = NULL; + } + } + cval = NULL; for (node = var->var_part[0].loc_chain; node; node = node->next) if (GET_CODE (node->loc) == VALUE @@ -6420,6 +6559,9 @@ compute_bb_dataflow (basic_block bb) dataflow_set_copy (&old_out, out); dataflow_set_copy (out, in); + if (MAY_HAVE_DEBUG_INSNS) + local_get_addr_cache = pointer_map_create (); + FOR_EACH_VEC_ELT (VTI (bb)->mos, i, mo) { rtx insn = mo->insn; @@ -6696,6 +6838,9 @@ compute_bb_dataflow (basic_block bb) if (MAY_HAVE_DEBUG_INSNS) { + pointer_map_destroy (local_get_addr_cache); + local_get_addr_cache = NULL; + dataflow_set_equiv_regs (out); htab_traverse (shared_hash_htab (out->vars), canonicalize_values_mark, out); @@ -9236,9 +9381,16 @@ vt_emit_notes (void) subsequent basic blocks. */ emit_notes_for_differences (BB_HEAD (bb), &cur, &VTI (bb)->in); + if (MAY_HAVE_DEBUG_INSNS) + local_get_addr_cache = pointer_map_create (); + /* Emit the notes for the changes in the basic block itself. */ emit_notes_in_bb (bb, &cur); + if (MAY_HAVE_DEBUG_INSNS) + pointer_map_destroy (local_get_addr_cache); + local_get_addr_cache = NULL; + /* Free memory occupied by the in hash table, we won't need it again. */ dataflow_set_clear (&VTI (bb)->in); @@ -9607,11 +9759,13 @@ vt_initialize (void) valvar_pool = create_alloc_pool ("small variable_def pool", sizeof (struct variable_def), 256); preserved_values.create (256); + global_get_addr_cache = pointer_map_create (); } else { scratch_regs = NULL; valvar_pool = NULL; + global_get_addr_cache = NULL; } if (MAY_HAVE_DEBUG_INSNS) @@ -9948,6 +10102,9 @@ vt_finalize (void) if (MAY_HAVE_DEBUG_INSNS) { + if (global_get_addr_cache) + pointer_map_destroy (global_get_addr_cache); + global_get_addr_cache = NULL; if (loc_exp_dep_pool) free_alloc_pool (loc_exp_dep_pool); loc_exp_dep_pool = NULL; --=-=-= Content-length: 258 -- Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist Red Hat Brazil Compiler Engineer --=-=-=--