* [RFC, WIP] tree-ssa-strlen optimization pass @ 2011-09-02 16:51 Jakub Jelinek 2011-09-02 17:15 ` Paolo Carlini 2011-09-05 8:55 ` Richard Guenther 0 siblings, 2 replies; 8+ messages in thread From: Jakub Jelinek @ 2011-09-02 16:51 UTC (permalink / raw) To: gcc-patches Hi! The following patch contains a WIP implementation of a new pass, which attempts to track C string lengths and perform various optimizations using that information. Optimizations it currently performs: 1) optimizing away strlen calls, if the string length is known at that point already (either constant or some expression) 2) replacement of strcpy calls with memcpy if the string length of the source is known 3) replacement of strcat with either memcpy (if source length is known) or strcpy (if not), if the destination length before the call is known - over the years I've seen way too much spaghetti code doing many strcat calls to the same string one after another During bootstrap/regtest (excluding the newly added testcases) 1) hits 184 times on x86_64-linux, 182 tmes on i686-linux, 2) hits 158 times on x86_64 and 159 times on i686, 3) into memcpy hits 33 times and 3) into strcpy 2 times. Example from gcc sources that is optimized: filename = (char *) alloca (strlen (module_name) + strlen (MODULE_EXTENSION) + 1); strcpy (filename, module_name); strcat (filename, MODULE_EXTENSION); which can be optimized into filename = (char *) alloca ((tmp1 = strlen (module_name)) + (tmp2 = strlen (MODULE_EXTENSION)) + 1); memcpy (filename, module_name, tmp1 + 1); memcpy (filename + tmp1, MODULE_EXTENSION, tmp2 + 1); Some further optimizations I'm currently considering for the pass: - handle *p = 0; stores like memcpy (p, "", 1) - lame coders often do *p = 0; strcat (p, str1); strcat (p, str2); - if a memcpy call (either original or strcpy/strcat transformed into it) copies known src string length + 1 and the immediately following .MEM use (or possibly non-immediately if there are only non-aliasing ones?) is a strcpy/(or to be optimized strcat or non-zero length memcpy) call which overwrites the final '\0', decrease the memcpy size by one - if source length for strcpy isn't known and the destination length is needed for optimizations, for -fhosted, glibc and with stpcpy compatible prototype in headers consider transforming that strcpy into stpcpy and use result - dst as string length (and see whether following optimizations are able to optimize series of unknown source length strcpy+strcat into a chain of stpcpy calls) - similarly if string length of strcat destination isn't known but is helpful for optimization consider optimizing strcat into strlen+stpcpy Any comments related to the implementation, or examples of real-world lame C string length code sequences that would be nice to optimize will be greatly appreciated. The patch has been bootstrapped/regtested on x86_64-linux and i686-linux, no regressions, but I'm not proposing it for trunk yet (would like to implement at least a few of the above mentioned optimizations), just am posting it early as a pass preview. 2011-09-02 Jakub Jelinek <jakub@redhat.com> * common.opt: Add -ftree-strlen option. * Makefile.in (OBJS): Add tree-ssa-strlen.o. (tree-sssa-strlen.o): Add dependencies. * opts.c (default_options_table): Enable -ftree-strlen by default at -O2 if not -Os. * passes.c (init_optimization_passes): Add pass_strlen after pass_object_sizes. * timevar.def (TV_TREE_STRLEN): New timevar. * tree-pass.h (pass_strlen): Declare. * tree-ssa-strlen.c: New file. * gcc.dg/strlenopt-1.c: New test. * gcc.dg/strlenopt-2.c: New test. * gcc.dg/strlenopt-3.c: New test. * gcc.dg/strlenopt.h: New file. --- gcc/common.opt.jj 2011-08-26 18:41:44.000000000 +0200 +++ gcc/common.opt 2011-08-30 10:57:36.000000000 +0200 @@ -1953,6 +1953,10 @@ ftree-fre Common Report Var(flag_tree_fre) Optimization Enable Full Redundancy Elimination (FRE) on trees +ftree-strlen +Common Report Var(flag_tree_strlen) Optimization +Enable string length optimizations on trees + ftree-loop-distribution Common Report Var(flag_tree_loop_distribution) Optimization Enable loop distribution on trees --- gcc/Makefile.in.jj 2011-08-26 18:41:44.000000000 +0200 +++ gcc/Makefile.in 2011-09-02 15:43:02.000000000 +0200 @@ -1472,6 +1472,7 @@ OBJS = \ tree-ssa-reassoc.o \ tree-ssa-sccvn.o \ tree-ssa-sink.o \ + tree-ssa-strlen.o \ tree-ssa-structalias.o \ tree-ssa-ter.o \ tree-ssa-threadedge.o \ @@ -3157,6 +3158,9 @@ tree-ssa-ccp.o : tree-ssa-ccp.c $(TREE_F $(TREE_DUMP_H) $(BASIC_BLOCK_H) $(TREE_PASS_H) langhooks.h \ tree-ssa-propagate.h value-prof.h $(FLAGS_H) $(TARGET_H) $(DIAGNOSTIC_CORE_H) \ $(DBGCNT_H) tree-pretty-print.h gimple-pretty-print.h gimple-fold.h +tree-ssa-strlen.o : tree-ssa-strlen.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ + $(TREE_FLOW_H) $(TREE_PASS_H) domwalk.h alloc-pool.h tree-ssa-propagate.h \ + gimple-pretty-print.h tree-sra.o : tree-sra.c $(CONFIG_H) $(SYSTEM_H) coretypes.h alloc-pool.h \ $(TM_H) $(TREE_H) $(GIMPLE_H) $(CGRAPH_H) $(TREE_FLOW_H) \ $(IPA_PROP_H) $(DIAGNOSTIC_H) statistics.h $(TREE_DUMP_H) $(TIMEVAR_H) \ --- gcc/opts.c.jj 2011-06-30 17:58:03.000000000 +0200 +++ gcc/opts.c 2011-09-02 15:53:06.000000000 +0200 @@ -484,6 +484,7 @@ static const struct default_options defa { OPT_LEVELS_2_PLUS, OPT_falign_jumps, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_falign_labels, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_falign_functions, NULL, 1 }, + { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_ftree_strlen, NULL, 1 }, /* -O3 optimizations. */ { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 }, --- gcc/passes.c.jj 2011-07-12 07:58:48.000000000 +0200 +++ gcc/passes.c 2011-08-31 12:03:29.000000000 +0200 @@ -1,7 +1,7 @@ /* Top level of GCC compilers (cc1, cc1plus, etc.) Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998, - 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 - Free Software Foundation, Inc. + 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, + 2011 Free Software Foundation, Inc. This file is part of GCC. @@ -1321,6 +1321,7 @@ init_optimization_passes (void) NEXT_PASS (pass_forwprop); NEXT_PASS (pass_phiopt); NEXT_PASS (pass_object_sizes); + NEXT_PASS (pass_strlen); NEXT_PASS (pass_ccp); NEXT_PASS (pass_copy_prop); NEXT_PASS (pass_cse_sincos); --- gcc/timevar.def.jj 2011-05-04 10:14:08.000000000 +0200 +++ gcc/timevar.def 2011-08-30 10:54:32.000000000 +0200 @@ -1,7 +1,7 @@ /* This file contains the definitions for timing variables used to measure run-time performance of the compiler. Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, - 2009, 2010 + 2009, 2010, 2011 Free Software Foundation, Inc. Contributed by Alex Samuel <samuel@codesourcery.com> @@ -183,6 +183,7 @@ DEFTIMEVAR (TV_TREE_COPY_RENAME , " DEFTIMEVAR (TV_TREE_SSA_VERIFY , "tree SSA verifier") DEFTIMEVAR (TV_TREE_STMT_VERIFY , "tree STMT verifier") DEFTIMEVAR (TV_TREE_SWITCH_CONVERSION, "tree switch initialization conversion") +DEFTIMEVAR (TV_TREE_STRLEN , "tree strlen optimization") DEFTIMEVAR (TV_CGRAPH_VERIFY , "callgraph verifier") DEFTIMEVAR (TV_DOM_FRONTIERS , "dominance frontiers") DEFTIMEVAR (TV_DOMINANCE , "dominance computation") --- gcc/tree-pass.h.jj 2011-07-08 15:09:38.000000000 +0200 +++ gcc/tree-pass.h 2011-08-30 10:55:27.000000000 +0200 @@ -412,6 +412,7 @@ extern struct gimple_opt_pass pass_diagn extern struct gimple_opt_pass pass_expand_omp; extern struct gimple_opt_pass pass_expand_omp_ssa; extern struct gimple_opt_pass pass_object_sizes; +extern struct gimple_opt_pass pass_strlen; extern struct gimple_opt_pass pass_fold_builtins; extern struct gimple_opt_pass pass_stdarg; extern struct gimple_opt_pass pass_early_warn_uninitialized; --- gcc/tree-ssa-strlen.c.jj 2011-08-30 10:56:51.000000000 +0200 +++ gcc/tree-ssa-strlen.c 2011-09-02 15:50:55.000000000 +0200 @@ -0,0 +1,1288 @@ +/* String length optimization + Copyright (C) 2011 Free Software Foundation, Inc. + Contributed by Jakub Jelinek <jakub@redhat.com> + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tree-flow.h" +#include "tree-pass.h" +#include "domwalk.h" +#include "alloc-pool.h" +#include "tree-ssa-propagate.h" +#include "gimple-pretty-print.h" + +/* Array indexed by SSA_NAME_VERSION. 0 means unknown, positive value + is an index into strinfo vector, negative value stands for + string length of a string literal (~strlen). */ +static int *ssa_ver_to_stridx; + +/* Number of currently active string indexes plus one. */ +static int max_stridx; + +/* String information record. */ +typedef struct strinfo_struct +{ + /* String length of this string. */ + tree length; + /* Any of the corresponding pointers for querying alias oracle. */ + tree ptr; + /* Reference count. Any changes to strinfo entry possibly shared + with dominating basic blocks need unshare_strinfo first, except + for dont_invalidate which affects only the immediately next + maybe_invalidate. */ + int refcount; + /* Copy of index. get_strinfo (si->idx) should return si; */ + int idx; + /* These 3 fields are for chaining related string pointers together. + E.g. for + bl = strlen (b); dl = strlen (d); strcpy (a, b); c = a + bl; + strcpy (c, d); e = c + dl; + strinfo(a) -> strinfo(c) -> strinfo(e) + All have ->first field equal to strinfo(a)->idx and are doubly + chained through prev/next fields. The later strinfos are required + to point into the same string with zero or more bytes after + the previous pointer and all bytes in between the two pointers + must be non-zero. Functions like strcpy or memcpy are supposed + to adjust all previous strinfo lengths, but not following strinfo + lengths (those are uncertain, usually invalidated during + maybe_invalidate, except when the alias oracle knows better). + Functions like strcat on the other side adjust the whole + related strinfo chain. + They are updated lazily, so to use the chain the same first fields + and si->prev->next == si->idx needs to be verified. */ + int first; + int next; + int prev; + /* A flag for the next maybe_invalidate that this strinfo shouldn't + be invalidated. Always cleared by maybe_invalidate. */ + bool dont_invalidate; +} *strinfo; +DEF_VEC_P(strinfo); +DEF_VEC_ALLOC_P(strinfo,heap); + +/* Pool for allocating strinfo_struct entries. */ +static alloc_pool strinfo_pool; + +/* Vector mapping positive string indexes to strinfo, for the + current basic block. The first pointer in the vector is special, + it is either NULL, meaning the vector isn't shared, or it is + a basic block pointer to the owner basic_block if shared. + If some other bb wants to modify the vector, the vector needs + to be unshared first, and only the owner bb is supposed to free it. */ +static VEC(strinfo, heap) *stridx_to_strinfo; + +struct stridxlist +{ + struct stridxlist *next; + HOST_WIDE_INT offset; + int idx; +}; + +struct decl_stridxlist_map +{ + struct tree_map_base base; + struct stridxlist list; +}; + +/* Hash table for mapping decls to a chained list of offset -> idx + mappings. */ +static htab_t decl_to_stridxlist_htab; + +/* Hash a from tree in a decl_stridxlist_map. */ + +static unsigned int +decl_to_stridxlist_hash (const void *item) +{ + return DECL_UID (((const struct decl_stridxlist_map *) item)->base.from); +} + +/* Free a decl_stridxlist_map. Callback for htab_delete. */ + +static void +decl_to_stridxlist_free (void *item) +{ + struct stridxlist *next; + struct stridxlist *list = ((struct decl_stridxlist_map *) item)->list.next; + + while (list) + { + next = list->next; + XDELETE (list); + list = next; + } + XDELETE (item); +} + +/* Return string index for EXP. */ + +static int +get_stridx (tree exp) +{ + tree l; + + if (TREE_CODE (exp) == SSA_NAME) + return ssa_ver_to_stridx[SSA_NAME_VERSION (exp)]; + + if (TREE_CODE (exp) == ADDR_EXPR && decl_to_stridxlist_htab) + { + HOST_WIDE_INT off; + tree base = get_addr_base_and_unit_offset (TREE_OPERAND (exp, 0), + &off); + if (base && DECL_P (base)) + { + struct decl_stridxlist_map ent, *e; + ent.base.from = base; + e = (struct decl_stridxlist_map *) + htab_find_with_hash (decl_to_stridxlist_htab, &ent, + DECL_UID (base)); + if (e) + { + struct stridxlist *list = &e->list; + do + { + if (list->offset == off) + return list->idx; + list = list->next; + } + while (list); + } + } + } + + l = c_strlen (exp, 0); + if (l != NULL_TREE + && host_integerp (l, 1)) + { + unsigned HOST_WIDE_INT len = tree_low_cst (l, 1); + if (len == (unsigned int) len + && (int) len >= 0) + return ~(int) len; + } + return 0; +} + +/* Return true if strinfo vector is shared with the immediate dominator. */ + +static inline bool +strinfo_shared (void) +{ + return VEC_length (strinfo, stridx_to_strinfo) + && VEC_index (strinfo, stridx_to_strinfo, 0) != NULL; +} + +/* Unshare strinfo vector that is shared with the immediate dominator. */ + +static void +unshare_strinfo_vec (void) +{ + strinfo si; + unsigned int i = 0; + + gcc_assert (strinfo_shared ()); + stridx_to_strinfo = VEC_copy (strinfo, heap, stridx_to_strinfo); + for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i) + if (si != NULL) + si->refcount++; + VEC_replace (strinfo, stridx_to_strinfo, 0, NULL); +} + +/* Attempt to create a string index for ADDR_EXPR exp. + Return a pointer to the location where the string index can + be stored (if 0) or is stored, or NULL if this can't be tracked. */ + +static int * +addr_stridxptr (tree exp) +{ + void **slot; + struct decl_stridxlist_map ent; + struct stridxlist *list; + HOST_WIDE_INT off; + + tree base = get_addr_base_and_unit_offset (TREE_OPERAND (exp, 0), &off); + if (base == NULL_TREE || !DECL_P (base)) + return NULL; + + if (decl_to_stridxlist_htab == NULL) + decl_to_stridxlist_htab + = htab_create (64, decl_to_stridxlist_hash, tree_map_base_eq, + decl_to_stridxlist_free); + ent.base.from = base; + slot = htab_find_slot_with_hash (decl_to_stridxlist_htab, &ent, + DECL_UID (base), INSERT); + if (*slot) + { + int i; + list = &((struct decl_stridxlist_map *)*slot)->list; + for (i = 0; i < 16; i++) + { + if (list->offset == off) + return &list->idx; + if (list->next == NULL) + break; + } + if (i == 16) + return NULL; + list->next = XNEW (struct stridxlist); + list = list->next; + } + else + { + struct decl_stridxlist_map *e = XNEW (struct decl_stridxlist_map); + e->base.from = base; + *slot = (void *) e; + list = &e->list; + } + list->next = NULL; + list->offset = off; + list->idx = 0; + return &list->idx; +} + +/* Create a new string index, or return 0 if reached limit. */ + +static int +new_stridx (tree exp) +{ + int idx; + if (max_stridx == 1000) + return 0; + if (TREE_CODE (exp) == SSA_NAME) + { + idx = max_stridx++; + ssa_ver_to_stridx[SSA_NAME_VERSION (exp)] = idx; + return idx; + } + if (TREE_CODE (exp) == ADDR_EXPR) + { + int *pidx = addr_stridxptr (exp); + if (pidx != NULL) + { + gcc_assert (*pidx == 0); + *pidx = max_stridx++; + return *pidx; + } + } + return 0; +} + +/* Create a new strinfo. */ + +static strinfo +new_strinfo (tree ptr, int idx, tree length) +{ + strinfo si = (strinfo) pool_alloc (strinfo_pool); + si->length = length; + si->ptr = ptr; + si->refcount = 1; + si->idx = idx; + si->first = 0; + si->prev = 0; + si->next = 0; + si->dont_invalidate = false; + return si; +} + +/* Decrease strinfo refcount and free it if not referenced anymore. */ + +static inline void +free_strinfo (strinfo si) +{ + if (si && --si->refcount == 0) + pool_free (strinfo_pool, si); +} + +/* Return strinfo vector entry IDX. */ + +static inline strinfo +get_strinfo (int idx) +{ + if (VEC_length (strinfo, stridx_to_strinfo) <= (unsigned int) idx) + return NULL; + return VEC_index (strinfo, stridx_to_strinfo, idx); +} + +/* Set strinfo in the vector entry IDX to SI. */ + +static inline void +set_strinfo (int idx, strinfo si) +{ + if (VEC_length (strinfo, stridx_to_strinfo) && VEC_index (strinfo, stridx_to_strinfo, 0)) + unshare_strinfo_vec (); + if (VEC_length (strinfo, stridx_to_strinfo) <= (unsigned int) idx) + VEC_safe_grow_cleared (strinfo, heap, stridx_to_strinfo, idx + 1); + VEC_replace (strinfo, stridx_to_strinfo, idx, si); +} + +/* Invalidate string length information for strings whose length + might change due to stores in stmt. */ + +static bool +maybe_invalidate (gimple stmt) +{ + strinfo si; + unsigned int i; + bool nonempty = false; + + for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i) + if (si != NULL) + { + if (!si->dont_invalidate) + { + ao_ref r; + ao_ref_init_from_ptr_and_size (&r, si->ptr, NULL_TREE); + if (stmt_may_clobber_ref_p_1 (stmt, &r)) + { + set_strinfo (i, NULL); + free_strinfo (si); + continue; + } + } + si->dont_invalidate = false; + nonempty = true; + } + return nonempty; +} + +/* Unshare strinfo record SI, if it has recount > 1 or + if stridx_to_strinfo vector is shared with some other + bbs. */ + +static strinfo +unshare_strinfo (strinfo si) +{ + strinfo nsi; + + if (si->refcount == 1 && !strinfo_shared ()) + return si; + + nsi = new_strinfo (si->ptr, si->idx, si->length); + nsi->first = si->first; + nsi->prev = si->prev; + nsi->next = si->next; + set_strinfo (si->idx, nsi); + free_strinfo (si); + return nsi; +} + +/* Return first strinfo in the related strinfo chain + if all strinfos in between belong to the chain, otherwise + NULL. */ + +static strinfo +verify_related_strinfos (strinfo origsi) +{ + strinfo si = origsi, psi; + + if (origsi->first == 0) + return NULL; + for (; si->prev; si = psi) + { + if (si->first != origsi->first) + return NULL; + psi = get_strinfo (si->prev); + if (psi == NULL) + return NULL; + if (psi->next != si->idx) + return NULL; + } + if (si->idx != si->first) + return NULL; + return si; +} + +/* Note that PTR, a pointer SSA_NAME initialized in the current stmt, points + to a zero-length string and if possible chain it to a related strinfo + chain whose part is or might be CHAINSI. */ + +static strinfo +zero_length_string (tree ptr, strinfo chainsi) +{ + strinfo si; + int idx; + gcc_checking_assert (TREE_CODE (ptr) == SSA_NAME + && get_stridx (ptr) == 0); + + if (chainsi != NULL) + { + if (verify_related_strinfos (chainsi)) + { + for (; chainsi->next; chainsi = si) + { + si = get_strinfo (chainsi->next); + if (si == NULL + || si->first != chainsi->first + || si->prev != chainsi->idx) + break; + } + if (integer_zerop (chainsi->length)) + { + if (chainsi->next) + { + chainsi = unshare_strinfo (chainsi); + chainsi->next = 0; + } + ssa_ver_to_stridx [SSA_NAME_VERSION (ptr)] = chainsi->idx; + return chainsi; + } + } + else if (chainsi->first || chainsi->prev || chainsi->next) + { + chainsi = unshare_strinfo (chainsi); + chainsi->first = 0; + chainsi->prev = 0; + chainsi->next = 0; + } + } + idx = new_stridx (ptr); + if (idx == 0) + return NULL; + si = new_strinfo (ptr, idx, build_int_cst (size_type_node, 0)); + set_strinfo (idx, si); + if (chainsi != NULL) + { + chainsi = unshare_strinfo (chainsi); + if (chainsi->first == 0) + chainsi->first = chainsi->idx; + chainsi->next = idx; + si->prev = chainsi->idx; + si->first = chainsi->first; + } + return si; +} + +/* For strinfo ORIGSI whose length has been just updated + update also related strinfo lengths (add ADJ to each, + but don't adjust ORIGSI). */ + +static void +adjust_related_strinfos (location_t loc, strinfo origsi, tree adj) +{ + strinfo si = verify_related_strinfos (origsi); + + if (si == NULL) + return; + + while (1) + { + strinfo nsi; + + if (si != origsi) + { + tree tem; + + si = unshare_strinfo (si); + tem = fold_convert_loc (loc, TREE_TYPE (si->length), adj); + si->length = fold_build2_loc (loc, PLUS_EXPR, + TREE_TYPE (si->length), si->length, + tem); + si->dont_invalidate = true; + } + if (si->next == 0) + return; + nsi = get_strinfo (si->next); + if (nsi == NULL + || nsi->first != si->first + || nsi->prev != si->idx) + return; + si = nsi; + } +} + +/* Find if there are other SSA_NAME pointers equal to PTR + for which we don't track their string lengths yet. If so, use + IDX for them. */ + +static void +find_equal_ptrs (tree ptr, int idx) +{ + if (TREE_CODE (ptr) != SSA_NAME) + return; + while (1) + { + gimple stmt = SSA_NAME_DEF_STMT (ptr); + if (!gimple_assign_single_p (stmt) + && (!gimple_assign_cast_p (stmt) + || !POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))) + return; + ptr = gimple_assign_rhs1 (stmt); + if (TREE_CODE (ptr) != SSA_NAME) + { + if (TREE_CODE (ptr) == ADDR_EXPR) + { + int *pidx = addr_stridxptr (ptr); + if (pidx != NULL && *pidx == 0) + *pidx = idx; + return; + } + return; + } + if (ssa_ver_to_stridx[SSA_NAME_VERSION (ptr)] != 0) + return; + ssa_ver_to_stridx[SSA_NAME_VERSION (ptr)] = idx; + } +} + +/* Handle a strlen call. If strlen of the argument is known, replace + the strlen call with the known value, otherwise remember that strlen + of the argument is stored in the lhs SSA_NAME. */ + +static void +handle_builtin_strlen (gimple_stmt_iterator *gsi) +{ + int idx; + tree src; + gimple stmt = gsi_stmt (*gsi); + tree lhs = gimple_call_lhs (stmt); + + if (lhs == NULL_TREE) + return; + + src = gimple_call_arg (stmt, 0); + idx = get_stridx (src); + if (idx) + { + if (idx < 0) + { + tree rhs = build_int_cst (TREE_TYPE (lhs), ~idx); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "Optimizing: "); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } + if (update_call_from_tree (gsi, rhs)) + { + update_stmt (gsi_stmt (*gsi)); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "into: "); + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM); + } + } + else if (dump_file && (dump_flags & TDF_DETAILS) != 0) + fprintf (dump_file, "not possible.\n"); + } + else + { + strinfo si = get_strinfo (idx); + if (si != NULL) + { + tree rhs = si->length; + if (!useless_type_conversion_p (TREE_TYPE (lhs), + TREE_TYPE (rhs))) + rhs = fold_convert_loc (gimple_location (stmt), + TREE_TYPE (lhs), si->length); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "Optimizing: "); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } + if (!update_call_from_tree (gsi, rhs)) + { + rhs = force_gimple_operand_gsi (gsi, rhs, true, NULL_TREE, + true, GSI_SAME_STMT); + if (!update_call_from_tree (gsi, rhs)) + { + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + fprintf (dump_file, "not possible.\n"); + return; + } + } + update_stmt (gsi_stmt (*gsi)); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "into: "); + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM); + } + if (TREE_CODE (si->length) != SSA_NAME + && TREE_CODE (si->length) != INTEGER_CST + && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) + { + si = unshare_strinfo (si); + si->length = lhs; + } + } + } + } + else if (!SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) + { + idx = new_stridx (src); + if (idx) + { + strinfo si = new_strinfo (src, idx, lhs); + set_strinfo (idx, si); + find_equal_ptrs (src, idx); + } + } +} + +/* Handle a strcpy-like ({st{r,p}cpy,__st{r,p}cpy_chk}) call. + If strlen of the second argument is known, strlen of the first argument + is the same after this call. Furthermore, attempt to convert it to + memcpy. */ + +static void +handle_builtin_strcpy (enum built_in_function bcode, gimple_stmt_iterator *gsi) +{ + int idx, didx; + tree src, dst, len, lhs, rhs, args, type, fn, oldlen; + gimple stmt = gsi_stmt (*gsi); + strinfo si, dsi, olddsi, zsi; + location_t loc; + + src = gimple_call_arg (stmt, 1); + dst = gimple_call_arg (stmt, 0); + idx = get_stridx (src); + if (idx <= 0) + return; + + si = get_strinfo (idx); + if (si == NULL) + return; + + didx = get_stridx (dst); + olddsi = NULL; + oldlen = NULL_TREE; + if (didx > 0) + olddsi = get_strinfo (didx); + else if (didx < 0) + return; + else + { + didx = new_stridx (dst); + if (didx == 0) + return; + } + if (olddsi != NULL) + { + oldlen = olddsi->length; + dsi = unshare_strinfo (olddsi); + dsi->length = si->length; + /* Break the chain, so adjust_related_strinfo on later pointers in + the chain won't adjust this one anymore. */ + dsi->next = 0; + } + else + { + dsi = new_strinfo (dst, didx, si->length); + set_strinfo (didx, dsi); + find_equal_ptrs (dst, didx); + } + dsi->dont_invalidate = true; + loc = gimple_location (stmt); + if (olddsi != NULL) + { + tree adj = NULL_TREE; + if (integer_zerop (oldlen)) + adj = si->length; + else if (TREE_CODE (oldlen) == INTEGER_CST + || TREE_CODE (si->length) == INTEGER_CST) + adj = fold_build2_loc (loc, MINUS_EXPR, + TREE_TYPE (si->length), si->length, + fold_convert_loc (loc, TREE_TYPE (si->length), + oldlen)); + if (adj != NULL_TREE) + adjust_related_strinfos (loc, dsi, adj); + } + /* strcpy src may not overlap dst, so src doesn't need to be + invalidated either. */ + si->dont_invalidate = true; + + lhs = gimple_call_lhs (stmt); + fn = NULL_TREE; + zsi = NULL; + switch (bcode) + { + case BUILT_IN_STRCPY: + fn = implicit_built_in_decls[BUILT_IN_MEMCPY]; + if (lhs) + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = didx; + break; + case BUILT_IN_STRCPY_CHK: + fn = built_in_decls[BUILT_IN_MEMCPY_CHK]; + if (lhs) + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = didx; + break; + case BUILT_IN_STPCPY: + /* This would need adjustment of the lhs (subtract one), + or detection that the trailing '\0' doesn't need to be + written, if it will be immediately overwritten. + fn = built_in_decls[BUILT_IN_MEMPCPY]; */ + if (lhs) + zsi = zero_length_string (lhs, dsi); + break; + case BUILT_IN_STPCPY_CHK: + /* This would need adjustment of the lhs (subtract one), + or detection that the trailing '\0' doesn't need to be + written, if it will be immediately overwritten. + fn = built_in_decls[BUILT_IN_MEMPCPY_CHK]; */ + if (lhs) + zsi = zero_length_string (lhs, dsi); + break; + default: + gcc_unreachable (); + } + if (zsi != NULL) + zsi->dont_invalidate = true; + + if (fn == NULL_TREE) + return; + + args = TYPE_ARG_TYPES (TREE_TYPE (fn)); + type = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (args))); + + len = fold_convert_loc (loc, type, si->length); + len = fold_build2_loc (loc, PLUS_EXPR, type, len, build_int_cst (type, 1)); + len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true, + GSI_SAME_STMT); + if (gimple_call_num_args (stmt) == 2) + rhs = build_call_expr_loc (loc, fn, 3, dst, src, len); + else + rhs = build_call_expr_loc (loc, fn, 4, dst, src, len, + gimple_call_arg (stmt, 2)); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "Optimizing: "); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } + if (update_call_from_tree (gsi, rhs)) + { + update_stmt (gsi_stmt (*gsi)); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "into: "); + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM); + } + } + else if (dump_file && (dump_flags & TDF_DETAILS) != 0) + fprintf (dump_file, "not possible.\n"); +} + +/* Handle a memcpy-like ({mem{,p}cpy,__mem{,p}cpy_chk}) call. + If strlen of the second argument is known and length of the third argument + is that plus one, strlen of the first argument is the same after this + call. */ + +static void +handle_builtin_memcpy (enum built_in_function bcode, gimple_stmt_iterator *gsi) +{ + int idx, didx; + tree src, dst, len, lhs, oldlen, newlen; + gimple stmt = gsi_stmt (*gsi); + strinfo si, dsi, olddsi; + + len = gimple_call_arg (stmt, 2); + src = gimple_call_arg (stmt, 1); + dst = gimple_call_arg (stmt, 0); + idx = get_stridx (src); + if (idx == 0) + return; + + if (idx > 0) + { + gimple def_stmt; + + /* Handle memcpy (x, y, l) where l is strlen (y) + 1. */ + si = get_strinfo (idx); + if (si == NULL) + return; + if (TREE_CODE (len) != SSA_NAME) + return; + def_stmt = SSA_NAME_DEF_STMT (len); + if (!is_gimple_assign (def_stmt) + || gimple_assign_rhs_code (def_stmt) != PLUS_EXPR + || gimple_assign_rhs1 (def_stmt) != si->length + || !integer_onep (gimple_assign_rhs2 (def_stmt))) + return; + } + else + { + si = NULL; + /* Handle memcpy (x, "abcd", 5) or + memcpy (x, "abc\0uvw", 7). */ + if (!host_integerp (len, 1) + || (unsigned HOST_WIDE_INT) tree_low_cst (len, 1) + <= (unsigned HOST_WIDE_INT) ~idx) + return; + } + + didx = get_stridx (dst); + olddsi = NULL; + if (didx > 0) + olddsi = get_strinfo (didx); + else if (didx < 0) + return; + else + { + didx = new_stridx (dst); + if (didx == 0) + return; + } + if (si != NULL) + newlen = si->length; + else + newlen = build_int_cst (TREE_TYPE (len), ~idx); + oldlen = NULL_TREE; + if (olddsi != NULL) + { + dsi = unshare_strinfo (olddsi); + oldlen = olddsi->length; + dsi->length = newlen; + /* Break the chain, so adjust_related_strinfo on later pointers in + the chain won't adjust this one anymore. */ + dsi->next = 0; + } + else + { + dsi = new_strinfo (dst, didx, newlen); + set_strinfo (didx, dsi); + find_equal_ptrs (dst, didx); + } + dsi->dont_invalidate = true; + if (olddsi != NULL) + { + tree adj = NULL_TREE; + location_t loc = gimple_location (stmt); + if (integer_zerop (oldlen)) + adj = dsi->length; + else if (TREE_CODE (oldlen) == INTEGER_CST + || TREE_CODE (dsi->length) == INTEGER_CST) + adj = fold_build2_loc (loc, MINUS_EXPR, + TREE_TYPE (dsi->length), dsi->length, + fold_convert_loc (loc, TREE_TYPE (dsi->length), + oldlen)); + if (adj != NULL_TREE) + adjust_related_strinfos (loc, dsi, adj); + } + /* memcpy src may not overlap dst, so src doesn't need to be + invalidated either. */ + if (si != NULL) + si->dont_invalidate = true; + + lhs = gimple_call_lhs (stmt); + switch (bcode) + { + case BUILT_IN_MEMCPY: + case BUILT_IN_MEMCPY_CHK: + if (lhs) + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = didx; + break; + case BUILT_IN_MEMPCPY: + case BUILT_IN_MEMPCPY_CHK: + break; + default: + gcc_unreachable (); + } +} + +/* Handle a strcat-like ({strcat,__strcat_chk}) call. + If strlen of the second argument is known, strlen of the first argument + is increased by the length of the second argument. Furthermore, attempt + to convert it to memcpy/strcpy if the length of the first argument + is known. */ + +static void +handle_builtin_strcat (enum built_in_function bcode, gimple_stmt_iterator *gsi) +{ + int idx, didx; + tree src, dst, dstlen, len, lhs, rhs, args, type, fn, objsz; + gimple stmt = gsi_stmt (*gsi); + strinfo si, dsi; + location_t loc; + + src = gimple_call_arg (stmt, 1); + dst = gimple_call_arg (stmt, 0); + + didx = get_stridx (dst); + if (didx <= 0) + return; + + dsi = get_strinfo (didx); + if (dsi == NULL) + return; + + /* This should have been folded, don't handle it here. */ + idx = get_stridx (src); + if (idx < 0) + return; + + si = NULL; + if (idx) + si = get_strinfo (idx); + + loc = gimple_location (stmt); + dstlen = dsi->length; + if (si != NULL) + { + dsi = unshare_strinfo (dsi); + dsi->dont_invalidate = true; + dsi->length = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (dsi->length), + dsi->length, si->length); + adjust_related_strinfos (loc, dsi, dstlen); + } + else + { + set_strinfo (didx, NULL); + free_strinfo (dsi); + } + if (si != NULL) + /* strcat src may not overlap dst, so src doesn't need to be + invalidated either. */ + si->dont_invalidate = true; + + lhs = gimple_call_lhs (stmt); + /* For now. Could remove the lhs from the call and add + lhs = dst; afterwards. */ + if (lhs) + return; + + fn = NULL_TREE; + objsz = NULL_TREE; + switch (bcode) + { + case BUILT_IN_STRCAT: + if (si) + fn = implicit_built_in_decls[BUILT_IN_MEMCPY]; + else + fn = implicit_built_in_decls[BUILT_IN_STRCPY]; + break; + case BUILT_IN_STRCAT_CHK: + if (si) + fn = built_in_decls[BUILT_IN_MEMCPY_CHK]; + else + fn = built_in_decls[BUILT_IN_STRCPY_CHK]; + objsz = gimple_call_arg (stmt, 2); + break; + default: + gcc_unreachable (); + } + + if (fn == NULL_TREE) + return; + + len = NULL_TREE; + if (si) + { + args = TYPE_ARG_TYPES (TREE_TYPE (fn)); + type = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (args))); + + len = fold_convert_loc (loc, type, si->length); + len = fold_build2_loc (loc, PLUS_EXPR, type, len, + build_int_cst (type, 1)); + len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true, + GSI_SAME_STMT); + } + dst = fold_build2_loc (loc, POINTER_PLUS_EXPR, + TREE_TYPE (dst), dst, + fold_convert_loc (loc, sizetype, dstlen)); + dst = force_gimple_operand_gsi (gsi, dst, true, NULL_TREE, true, + GSI_SAME_STMT); + if (si) + rhs = build_call_expr_loc (loc, fn, 3 + (objsz != NULL_TREE), + dst, src, len, objsz); + else + rhs = build_call_expr_loc (loc, fn, 2 + (objsz != NULL_TREE), + dst, src, objsz); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "Optimizing: "); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } + if (update_call_from_tree (gsi, rhs)) + { + update_stmt (gsi_stmt (*gsi)); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "into: "); + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM); + } + } + else if (dump_file && (dump_flags & TDF_DETAILS) != 0) + fprintf (dump_file, "not possible.\n"); +} + +/* Attempt to optimize a single statement at *GSI using string length + knowledge. */ + +static void +strlen_optimize_stmt (gimple_stmt_iterator *gsi) +{ + gimple stmt = gsi_stmt (*gsi); + tree lhs; + + if (is_gimple_call (stmt)) + { + tree callee = gimple_call_fndecl (stmt); + if (callee && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL) + switch (DECL_FUNCTION_CODE (callee)) + { + case BUILT_IN_STRLEN: + handle_builtin_strlen (gsi); + break; + case BUILT_IN_STRCPY: + case BUILT_IN_STRCPY_CHK: + case BUILT_IN_STPCPY: + case BUILT_IN_STPCPY_CHK: + handle_builtin_strcpy (DECL_FUNCTION_CODE (callee), gsi); + break; + case BUILT_IN_MEMCPY: + case BUILT_IN_MEMCPY_CHK: + case BUILT_IN_MEMPCPY: + case BUILT_IN_MEMPCPY_CHK: + handle_builtin_memcpy (DECL_FUNCTION_CODE (callee), gsi); + break; + case BUILT_IN_STRCAT: + case BUILT_IN_STRCAT_CHK: + handle_builtin_strcat (DECL_FUNCTION_CODE (callee), gsi); + break; + default: + break; + } + } + else if (is_gimple_assign (stmt) + && (lhs = gimple_assign_lhs (stmt)) != NULL_TREE + && TREE_CODE (lhs) == SSA_NAME + && POINTER_TYPE_P (TREE_TYPE (lhs))) + { + if (gimple_assign_single_p (stmt) + || (gimple_assign_cast_p (stmt) + && POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))) + { + int idx = get_stridx (gimple_assign_rhs1 (stmt)); + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = idx; + } + else if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR) + { + int idx = get_stridx (gimple_assign_rhs1 (stmt)); + if (idx > 0) + { + strinfo si = get_strinfo (idx); + if (si != NULL) + { + tree off = gimple_assign_rhs2 (stmt); + if (operand_equal_p (si->length, off, 0)) + zero_length_string (lhs, si); + else if (TREE_CODE (off) == SSA_NAME) + { + gimple def_stmt = SSA_NAME_DEF_STMT (off); + if (gimple_assign_single_p (def_stmt) + && operand_equal_p (si->length, + gimple_assign_rhs1 (def_stmt), + 0)) + zero_length_string (lhs, si); + } + } + } + else if (idx < 0) + { + tree off = gimple_assign_rhs2 (stmt); + if (host_integerp (off, 1) + && (unsigned HOST_WIDE_INT) tree_low_cst (off, 1) + <= (unsigned HOST_WIDE_INT) ~idx) + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] + = ~(int) tree_low_cst (off, 1); + } + } + } + + if (gimple_vdef (stmt)) + maybe_invalidate (stmt); +} + +/* Recursively call maybe_invalidate on stmts that might be executed + in between dombb and current bb and that contain a vdef. Stop when + *count stmts are inspected, or if the whole strinfo vector has + been invalidated. */ + +static void +do_invalidate (basic_block dombb, gimple phi, bitmap visited, int *count) +{ + unsigned int i, n = gimple_phi_num_args (phi); + + for (i = 0; i < n; i++) + { + tree vuse = gimple_phi_arg_def (phi, i); + gimple stmt = SSA_NAME_DEF_STMT (vuse); + basic_block bb = gimple_bb (stmt); + if (bb == NULL + || !bitmap_set_bit (visited, bb->index) + || !dominated_by_p (CDI_DOMINATORS, bb, dombb)) + continue; + while (1) + { + if (gimple_code (stmt) == GIMPLE_PHI) + { + do_invalidate (dombb, stmt, visited, count); + if (*count == 0) + return; + break; + } + if (--*count == 0) + return; + if (!maybe_invalidate (stmt)) + { + *count = 0; + return; + } + vuse = gimple_vuse (stmt); + stmt = SSA_NAME_DEF_STMT (vuse); + if (gimple_bb (stmt) != bb) + { + bb = gimple_bb (stmt); + if (bb == NULL + || !bitmap_set_bit (visited, bb->index) + || !dominated_by_p (CDI_DOMINATORS, bb, dombb)) + break; + } + } + } +} + +/* Callback for walk_dominator_tree. Attempt to optimize various + string ops by remembering string lenths pointed by pointer SSA_NAMEs. */ + +static void +strlen_enter_block (struct dom_walk_data *walk_data ATTRIBUTE_UNUSED, + basic_block bb) +{ + gimple_stmt_iterator gsi; + basic_block dombb = get_immediate_dominator (CDI_DOMINATORS, bb); + + if (dombb == NULL) + stridx_to_strinfo = NULL; + else + { + stridx_to_strinfo = (VEC(strinfo, heap) *) dombb->aux; + if (stridx_to_strinfo) + { + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple phi = gsi_stmt (gsi); + if (!is_gimple_reg (gimple_phi_result (phi))) + { + bitmap visited = BITMAP_ALLOC (NULL); + int count_vdef = 100; + do_invalidate (dombb, phi, visited, &count_vdef); + BITMAP_FREE (visited); + break; + } + } + } + } + + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + strlen_optimize_stmt (&gsi); + + bb->aux = stridx_to_strinfo; + if (VEC_length (strinfo, stridx_to_strinfo) && !strinfo_shared ()) + VEC_replace (strinfo, stridx_to_strinfo, 0, (strinfo) bb); +} + +/* Callback for walk_dominator_tree. Free strinfo vector if it is + owned by the current bb, clear bb->aux. */ + +static void +strlen_leave_block (struct dom_walk_data *walk_data ATTRIBUTE_UNUSED, + basic_block bb) +{ + if (bb->aux) + { + stridx_to_strinfo = (VEC(strinfo, heap) *) bb->aux; + if (VEC_length (strinfo, stridx_to_strinfo) + && VEC_index (strinfo, stridx_to_strinfo, 0) == (strinfo) bb) + { + unsigned int i; + strinfo si; + + for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i) + free_strinfo (si); + VEC_free (strinfo, heap, stridx_to_strinfo); + } + bb->aux = NULL; + } +} + +/* Main entry point. */ + +static unsigned int +tree_ssa_strlen (void) +{ + struct dom_walk_data walk_data; + + ssa_ver_to_stridx = XCNEWVEC (int, num_ssa_names); + max_stridx = 1; + strinfo_pool = create_alloc_pool ("strinfo_struct pool", + sizeof (struct strinfo_struct), 64); + + calculate_dominance_info (CDI_DOMINATORS); + + /* String length optimization is implemented as a walk of the dominator + tree and a forward walk of statements within each block. */ + walk_data.dom_direction = CDI_DOMINATORS; + walk_data.initialize_block_local_data = NULL; + walk_data.before_dom_children = strlen_enter_block; + walk_data.after_dom_children = strlen_leave_block; + walk_data.block_local_data_size = 0; + walk_data.global_data = NULL; + + /* Initialize the dominator walker. */ + init_walk_dominator_tree (&walk_data); + + /* Recursively walk the dominator tree. */ + walk_dominator_tree (&walk_data, ENTRY_BLOCK_PTR); + + /* Finalize the dominator walker. */ + fini_walk_dominator_tree (&walk_data); + + XDELETEVEC (ssa_ver_to_stridx); + free_alloc_pool (strinfo_pool); + if (decl_to_stridxlist_htab) + { + htab_delete (decl_to_stridxlist_htab); + decl_to_stridxlist_htab = NULL; + } + + return 0; +} + +static bool +gate_strlen (void) +{ + return flag_tree_strlen != 0; +} + +struct gimple_opt_pass pass_strlen = +{ + { + GIMPLE_PASS, + "strlen", /* name */ + gate_strlen, /* gate */ + tree_ssa_strlen, /* execute */ + NULL, /* sub */ + NULL, /* next */ + 0, /* static_pass_number */ + TV_TREE_STRLEN, /* tv_id */ + PROP_cfg | PROP_ssa, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + TODO_ggc_collect + | TODO_verify_ssa /* todo_flags_finish */ + } +}; --- gcc/testsuite/gcc.dg/strlenopt-1.c.jj 2011-09-02 10:39:12.000000000 +0200 +++ gcc/testsuite/gcc.dg/strlenopt-1.c 2011-09-02 15:54:23.000000000 +0200 @@ -0,0 +1,45 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fdump-tree-strlen" } */ + +#include "strlenopt.h" + +__attribute__((noinline, noclone)) char * +foo (char *p, char *r) +{ + char *q = malloc (strlen (p) + strlen (r) + 64); + if (q == NULL) return NULL; + /* This strcpy can be optimized into memcpy, using the remembered + strlen (p). */ + strcpy (q, p); + /* These two strcat can be optimized into memcpy. The first one + could be even optimized into a *ptr = '/'; store as the '\0' + is immediately overwritten. */ + strcat (q, "/"); + strcat (q, "abcde"); + /* Due to inefficient PTA (PR50262) the above calls invalidate + string length of r, so it is optimized just into strcpy instead + of memcpy. */ + strcat (q, r); + return q; +} + +int +main () +{ + char *volatile p = "string1"; + char *volatile r = "string2"; + char *q = foo (p, r); + if (q != NULL) + { + if (strcmp (q, "string1/abcdestring2")) + abort (); + free (q); + } + return 0; +} + +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "memcpy \\(" 3 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "strcpy \\(" 1 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */ +/* { dg-final { cleanup-tree-dump "strlen" } } */ --- gcc/testsuite/gcc.dg/strlenopt-2.c.jj 2011-09-02 15:31:33.000000000 +0200 +++ gcc/testsuite/gcc.dg/strlenopt-2.c 2011-09-02 15:54:28.000000000 +0200 @@ -0,0 +1,47 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fdump-tree-strlen" } */ + +#include "strlenopt.h" + +__attribute__((noinline, noclone)) char * +foo (char *p, char *r) +{ + char buf[26]; + if (strlen (p) + strlen (r) + 9 > 26) + return NULL; + /* This strcpy can be optimized into memcpy, using the remembered + strlen (p). */ + strcpy (buf, p); + /* These two strcat can be optimized into memcpy. The first one + could be even optimized into a *ptr = '/'; store as the '\0' + is immediately overwritten. */ + strcat (buf, "/"); + strcat (buf, "abcde"); + /* This strcpy can be optimized into memcpy, using the remembered + strlen (r). */ + strcat (buf, r); + /* And this can be optimized into memcpy too. */ + strcat (buf, "fg"); + return strdup (buf); +} + +int +main () +{ + char *volatile p = "string1"; + char *volatile r = "string2"; + char *q = foo (p, r); + if (q != NULL) + { + if (strcmp (q, "string1/abcdestring2fg")) + abort (); + free (q); + } + return 0; +} + +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "memcpy \\(" 5 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */ +/* { dg-final { cleanup-tree-dump "strlen" } } */ --- gcc/testsuite/gcc.dg/strlenopt-3.c.jj 2011-09-02 11:18:52.000000000 +0200 +++ gcc/testsuite/gcc.dg/strlenopt-3.c 2011-09-02 15:54:32.000000000 +0200 @@ -0,0 +1,63 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fdump-tree-strlen -fdump-tree-optimized" } */ + +#include "strlenopt.h" + +__attribute__((noinline, noclone)) size_t +fn1 (char *p, char *q) +{ + size_t s = strlen (q); + strcpy (p, q); + return s - strlen (p); +} + +__attribute__((noinline, noclone)) size_t +fn2 (char *p, char *q) +{ + size_t s = strlen (q); + memcpy (p, q, s + 1); + return s - strlen (p); +} + +__attribute__((noinline, noclone)) size_t +fn3 (char *p) +{ + memcpy (p, "abcd", 5); + return strlen (p); +} + +__attribute__((noinline, noclone)) size_t +fn4 (char *p) +{ + memcpy (p, "efg\0hij", 6); + return strlen (p); +} + +int +main () +{ + char buf[64]; + char *volatile p = buf; + char *volatile q = "ABCDEF"; + buf[7] = 'G'; + if (fn1 (p, q) != 0 || memcmp (buf, "ABCDEF\0G", 8)) + abort (); + q = "HIJ"; + if (fn2 (p + 1, q) != 0 || memcmp (buf, "AHIJ\0F\0G", 8)) + abort (); + buf[6] = 'K'; + if (fn3 (p + 1) != 4 || memcmp (buf, "Aabcd\0KG", 8)) + abort (); + if (fn4 (p) != 3 || memcmp (buf, "efg\0hiKG", 8)) + abort (); + return 0; +} + +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "memcpy \\(" 4 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */ +/* { dg-final { cleanup-tree-dump "strlen" } } */ +/* { dg-final { scan-tree-dump-times "return 0" 3 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "return 4" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "return 3" 1 "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ --- gcc/testsuite/gcc.dg/strlenopt.h.jj 2011-09-02 10:36:04.000000000 +0200 +++ gcc/testsuite/gcc.dg/strlenopt.h 2011-09-02 12:39:08.000000000 +0200 @@ -0,0 +1,57 @@ +/* This is a replacement of needed parts from stdlib.h and string.h + for -ftree-strlen testing, to ensure we are testing the builtins + rather than whatever the OS has in its headers. */ + +#define NULL ((void *) 0) +typedef __SIZE_TYPE__ size_t; +extern void abort (void); +void *malloc (size_t); +void free (void *); +char *strdup (const char *); +size_t strlen (const char *); +void *memcpy (void *__restrict, const void *__restrict, size_t); +char *strcpy (char *__restrict, const char *__restrict); +char *strcat (char *__restrict, const char *__restrict); +int memcmp (const void *, const void *, size_t); +int strcmp (const char *, const char *); +#ifdef USE_GNU +void *mempcpy (void *__restrict, const void *__restrict, size_t); +char *stpcpy (char *__restrict, const char *__restrict); +#endif + +#if defined(FORTIFY_SOURCE) && FORTIFY_SOURCE > 0 && __OPTIMIZE__ +# define bos(ptr) __builtin_object_size (x, FORTIFY_SOURCE > 0) +# define bos0(ptr) __builtin_object_size (x, 0) + +extern inline __attribute__((gnu_inline, always_inline, artificial)) void * +memcpy (void *__restrict dest, const void *__restrict src, size_t len) +{ + return __builtin___memcpy_chk (dest, src, len, bos0 (dest)); +} + +extern inline __attribute__((gnu_inline, always_inline, artificial)) char * +strcpy (char *__restrict dest, const char *__restrict src) +{ + return __builtin___strcpy_chk (dest, src, bos (dest)); +} + +extern inline __attribute__((gnu_inline, always_inline, artificial)) char * +strcat (char *__restrict dest, const char *__restrict src) +{ + return __builtin___strcat_chk (dest, src, bos (dest)); +} + +# ifdef USE_GNU +extern inline __attribute__((gnu_inline, always_inline, artificial)) void * +mempcpy (void *__restrict dest, const void *__restrict src, size_t len) +{ + return __builtin___mempcpy_chk (dest, src, len, bos0 (dest)); +} + +extern inline __attribute__((gnu_inline, always_inline, artificial)) char * +stpcpy (char *__restrict dest, const char *__restrict src) +{ + return __builtin___stpcpy_chk (dest, src, bos (dest)); +} +# endif +#endif Jakub ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC, WIP] tree-ssa-strlen optimization pass 2011-09-02 16:51 [RFC, WIP] tree-ssa-strlen optimization pass Jakub Jelinek @ 2011-09-02 17:15 ` Paolo Carlini 2011-09-02 18:23 ` Jakub Jelinek 2011-09-05 8:55 ` Richard Guenther 1 sibling, 1 reply; 8+ messages in thread From: Paolo Carlini @ 2011-09-02 17:15 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc-patches Hi, > Any comments related to the implementation, or examples of real-world > lame C string length code sequences that would be nice to optimize > will be greatly appreciated. I'm only wondering how hard would be taking care more consistently of the wchar_t counterparts of all these library functions. I don't think user code using those is so uncommon, at least now that internationalization is taken more and more seriously. Paolo. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC, WIP] tree-ssa-strlen optimization pass 2011-09-02 17:15 ` Paolo Carlini @ 2011-09-02 18:23 ` Jakub Jelinek 2011-09-02 18:42 ` Paolo Carlini 0 siblings, 1 reply; 8+ messages in thread From: Jakub Jelinek @ 2011-09-02 18:23 UTC (permalink / raw) To: Paolo Carlini; +Cc: gcc-patches On Fri, Sep 02, 2011 at 07:14:58PM +0200, Paolo Carlini wrote: > >Any comments related to the implementation, or examples of real-world > >lame C string length code sequences that would be nice to optimize > >will be greatly appreciated. > I'm only wondering how hard would be taking care more consistently > of the wchar_t counterparts of all these library functions. I don't > think user code using those is so uncommon, at least now that > internationalization is taken more and more seriously. I'm fairly sure it is much less common than the narrow versions, we don't even handle wprintf and wscanf attributes. Tracking not just char strings, but also wchar_t (or either char, or wchar_t) would complicate the pass somewhat, but I'm more worried that it would not be stressed enough in real-world code and thus would be less tested than would be desirable. Plus it is very premature when none of the wcs* builtins are actually handled in builtins.c and we don't have infrastructure for that. Jakub ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC, WIP] tree-ssa-strlen optimization pass 2011-09-02 18:23 ` Jakub Jelinek @ 2011-09-02 18:42 ` Paolo Carlini 0 siblings, 0 replies; 8+ messages in thread From: Paolo Carlini @ 2011-09-02 18:42 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc-patches Hi, > I'm fairly sure it is much less common than the narrow versions, > we don't even handle wprintf and wscanf attributes. I know. Actually, the reason I decided to say something, is that I noticed a few times in the past this "inequality" between char and wchar_t. Frankly, I instinctively think it's unfortunate, if asked, I would rather recommend systematically working on *pairs* of optimizations. I don't know if somebody ever discussed this idea in the open for GCC, as a matter of "policy" so to speak, but I also don't know what the other compilers are doing, I don't have much more to say at the moment. I hope somebody will give the general idea a thought. Thanks, Paolo. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC, WIP] tree-ssa-strlen optimization pass 2011-09-02 16:51 [RFC, WIP] tree-ssa-strlen optimization pass Jakub Jelinek 2011-09-02 17:15 ` Paolo Carlini @ 2011-09-05 8:55 ` Richard Guenther 2011-09-05 9:41 ` Jakub Jelinek 1 sibling, 1 reply; 8+ messages in thread From: Richard Guenther @ 2011-09-05 8:55 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc-patches On Fri, Sep 2, 2011 at 6:50 PM, Jakub Jelinek <jakub@redhat.com> wrote: > Hi! > > The following patch contains a WIP implementation of a new pass, > which attempts to track C string lengths and perform various > optimizations using that information. > > Optimizations it currently performs: > 1) optimizing away strlen calls, if the string length is known > at that point already (either constant or some expression) > 2) replacement of strcpy calls with memcpy if the string length > of the source is known > 3) replacement of strcat with either memcpy (if source length > is known) or strcpy (if not), if the destination length > before the call is known > - over the years I've seen way too much spaghetti code > doing many strcat calls to the same string one after another > During bootstrap/regtest (excluding the newly added testcases) > 1) hits 184 times on x86_64-linux, 182 tmes on i686-linux, > 2) hits 158 times on x86_64 and 159 times on i686, > 3) into memcpy hits 33 times and 3) into strcpy 2 times. > > Example from gcc sources that is optimized: > filename = (char *) alloca (strlen (module_name) + strlen (MODULE_EXTENSION) > + 1); > strcpy (filename, module_name); > strcat (filename, MODULE_EXTENSION); > which can be optimized into > filename = (char *) alloca ((tmp1 = strlen (module_name)) + (tmp2 = strlen (MODULE_EXTENSION)) > + 1); > memcpy (filename, module_name, tmp1 + 1); > memcpy (filename + tmp1, MODULE_EXTENSION, tmp2 + 1); > > Some further optimizations I'm currently considering for the pass: > - handle *p = 0; stores like memcpy (p, "", 1) > - lame coders often do *p = 0; strcat (p, str1); strcat (p, str2); > - if a memcpy call (either original or strcpy/strcat transformed into it) > copies known src string length + 1 and the immediately following > .MEM use (or possibly non-immediately if there are only non-aliasing > ones?) is a strcpy/(or to be optimized strcat or non-zero length memcpy) > call which overwrites the final '\0', decrease the memcpy size by one > - if source length for strcpy isn't known and the destination length > is needed for optimizations, for -fhosted, glibc and with stpcpy > compatible prototype in headers consider transforming that strcpy > into stpcpy and use result - dst as string length (and see whether > following optimizations are able to optimize series of unknown > source length strcpy+strcat into a chain of stpcpy calls) > - similarly if string length of strcat destination isn't known but is > helpful for optimization consider optimizing strcat into strlen+stpcpy > > Any comments related to the implementation, or examples of real-world > lame C string length code sequences that would be nice to optimize > will be greatly appreciated. > > The patch has been bootstrapped/regtested on x86_64-linux and i686-linux, > no regressions, but I'm not proposing it for trunk yet (would like to > implement at least a few of the above mentioned optimizations), just am > posting it early as a pass preview. > > 2011-09-02 Jakub Jelinek <jakub@redhat.com> > > * common.opt: Add -ftree-strlen option. Maybe sth more generic? -foptimize-string-ops? Eventually guard the existing string op foldings with that flag as well. > * Makefile.in (OBJS): Add tree-ssa-strlen.o. > (tree-sssa-strlen.o): Add dependencies. > * opts.c (default_options_table): Enable -ftree-strlen > by default at -O2 if not -Os. > * passes.c (init_optimization_passes): Add pass_strlen > after pass_object_sizes. > * timevar.def (TV_TREE_STRLEN): New timevar. > * tree-pass.h (pass_strlen): Declare. > * tree-ssa-strlen.c: New file. > > * gcc.dg/strlenopt-1.c: New test. > * gcc.dg/strlenopt-2.c: New test. > * gcc.dg/strlenopt-3.c: New test. > * gcc.dg/strlenopt.h: New file. > > --- gcc/common.opt.jj 2011-08-26 18:41:44.000000000 +0200 > +++ gcc/common.opt 2011-08-30 10:57:36.000000000 +0200 > @@ -1953,6 +1953,10 @@ ftree-fre > Common Report Var(flag_tree_fre) Optimization > Enable Full Redundancy Elimination (FRE) on trees > > +ftree-strlen > +Common Report Var(flag_tree_strlen) Optimization > +Enable string length optimizations on trees > + > ftree-loop-distribution > Common Report Var(flag_tree_loop_distribution) Optimization > Enable loop distribution on trees > --- gcc/Makefile.in.jj 2011-08-26 18:41:44.000000000 +0200 > +++ gcc/Makefile.in 2011-09-02 15:43:02.000000000 +0200 > @@ -1472,6 +1472,7 @@ OBJS = \ > tree-ssa-reassoc.o \ > tree-ssa-sccvn.o \ > tree-ssa-sink.o \ > + tree-ssa-strlen.o \ > tree-ssa-structalias.o \ > tree-ssa-ter.o \ > tree-ssa-threadedge.o \ > @@ -3157,6 +3158,9 @@ tree-ssa-ccp.o : tree-ssa-ccp.c $(TREE_F > $(TREE_DUMP_H) $(BASIC_BLOCK_H) $(TREE_PASS_H) langhooks.h \ > tree-ssa-propagate.h value-prof.h $(FLAGS_H) $(TARGET_H) $(DIAGNOSTIC_CORE_H) \ > $(DBGCNT_H) tree-pretty-print.h gimple-pretty-print.h gimple-fold.h > +tree-ssa-strlen.o : tree-ssa-strlen.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ > + $(TREE_FLOW_H) $(TREE_PASS_H) domwalk.h alloc-pool.h tree-ssa-propagate.h \ > + gimple-pretty-print.h > tree-sra.o : tree-sra.c $(CONFIG_H) $(SYSTEM_H) coretypes.h alloc-pool.h \ > $(TM_H) $(TREE_H) $(GIMPLE_H) $(CGRAPH_H) $(TREE_FLOW_H) \ > $(IPA_PROP_H) $(DIAGNOSTIC_H) statistics.h $(TREE_DUMP_H) $(TIMEVAR_H) \ > --- gcc/opts.c.jj 2011-06-30 17:58:03.000000000 +0200 > +++ gcc/opts.c 2011-09-02 15:53:06.000000000 +0200 > @@ -484,6 +484,7 @@ static const struct default_options defa > { OPT_LEVELS_2_PLUS, OPT_falign_jumps, NULL, 1 }, > { OPT_LEVELS_2_PLUS, OPT_falign_labels, NULL, 1 }, > { OPT_LEVELS_2_PLUS, OPT_falign_functions, NULL, 1 }, > + { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_ftree_strlen, NULL, 1 }, Why not -Os? Doesn't it remove strlen calls? > /* -O3 optimizations. */ > { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 }, > --- gcc/passes.c.jj 2011-07-12 07:58:48.000000000 +0200 > +++ gcc/passes.c 2011-08-31 12:03:29.000000000 +0200 > @@ -1,7 +1,7 @@ > /* Top level of GCC compilers (cc1, cc1plus, etc.) > Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998, > - 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 > - Free Software Foundation, Inc. > + 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, > + 2011 Free Software Foundation, Inc. > > This file is part of GCC. > > @@ -1321,6 +1321,7 @@ init_optimization_passes (void) > NEXT_PASS (pass_forwprop); > NEXT_PASS (pass_phiopt); > NEXT_PASS (pass_object_sizes); > + NEXT_PASS (pass_strlen); > NEXT_PASS (pass_ccp); > NEXT_PASS (pass_copy_prop); > NEXT_PASS (pass_cse_sincos); > --- gcc/timevar.def.jj 2011-05-04 10:14:08.000000000 +0200 > +++ gcc/timevar.def 2011-08-30 10:54:32.000000000 +0200 > @@ -1,7 +1,7 @@ > /* This file contains the definitions for timing variables used to > measure run-time performance of the compiler. > Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, > - 2009, 2010 > + 2009, 2010, 2011 > Free Software Foundation, Inc. > Contributed by Alex Samuel <samuel@codesourcery.com> > > @@ -183,6 +183,7 @@ DEFTIMEVAR (TV_TREE_COPY_RENAME , " > DEFTIMEVAR (TV_TREE_SSA_VERIFY , "tree SSA verifier") > DEFTIMEVAR (TV_TREE_STMT_VERIFY , "tree STMT verifier") > DEFTIMEVAR (TV_TREE_SWITCH_CONVERSION, "tree switch initialization conversion") > +DEFTIMEVAR (TV_TREE_STRLEN , "tree strlen optimization") > DEFTIMEVAR (TV_CGRAPH_VERIFY , "callgraph verifier") > DEFTIMEVAR (TV_DOM_FRONTIERS , "dominance frontiers") > DEFTIMEVAR (TV_DOMINANCE , "dominance computation") > --- gcc/tree-pass.h.jj 2011-07-08 15:09:38.000000000 +0200 > +++ gcc/tree-pass.h 2011-08-30 10:55:27.000000000 +0200 > @@ -412,6 +412,7 @@ extern struct gimple_opt_pass pass_diagn > extern struct gimple_opt_pass pass_expand_omp; > extern struct gimple_opt_pass pass_expand_omp_ssa; > extern struct gimple_opt_pass pass_object_sizes; > +extern struct gimple_opt_pass pass_strlen; > extern struct gimple_opt_pass pass_fold_builtins; > extern struct gimple_opt_pass pass_stdarg; > extern struct gimple_opt_pass pass_early_warn_uninitialized; > --- gcc/tree-ssa-strlen.c.jj 2011-08-30 10:56:51.000000000 +0200 > +++ gcc/tree-ssa-strlen.c 2011-09-02 15:50:55.000000000 +0200 > @@ -0,0 +1,1288 @@ > +/* String length optimization > + Copyright (C) 2011 Free Software Foundation, Inc. > + Contributed by Jakub Jelinek <jakub@redhat.com> > + > +This file is part of GCC. > + > +GCC is free software; you can redistribute it and/or modify > +it under the terms of the GNU General Public License as published by > +the Free Software Foundation; either version 3, or (at your option) > +any later version. > + > +GCC is distributed in the hope that it will be useful, > +but WITHOUT ANY WARRANTY; without even the implied warranty of > +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +GNU General Public License for more details. > + > +You should have received a copy of the GNU General Public License > +along with GCC; see the file COPYING3. If not see > +<http://www.gnu.org/licenses/>. */ > + > +#include "config.h" > +#include "system.h" > +#include "coretypes.h" > +#include "tree-flow.h" > +#include "tree-pass.h" > +#include "domwalk.h" > +#include "alloc-pool.h" > +#include "tree-ssa-propagate.h" > +#include "gimple-pretty-print.h" > + > +/* Array indexed by SSA_NAME_VERSION. 0 means unknown, positive value > + is an index into strinfo vector, negative value stands for > + string length of a string literal (~strlen). */ > +static int *ssa_ver_to_stridx; > + > +/* Number of currently active string indexes plus one. */ > +static int max_stridx; USe a VEC? > +/* String information record. */ > +typedef struct strinfo_struct > +{ > + /* String length of this string. */ > + tree length; > + /* Any of the corresponding pointers for querying alias oracle. */ > + tree ptr; > + /* Reference count. Any changes to strinfo entry possibly shared > + with dominating basic blocks need unshare_strinfo first, except > + for dont_invalidate which affects only the immediately next > + maybe_invalidate. */ > + int refcount; > + /* Copy of index. get_strinfo (si->idx) should return si; */ > + int idx; > + /* These 3 fields are for chaining related string pointers together. > + E.g. for > + bl = strlen (b); dl = strlen (d); strcpy (a, b); c = a + bl; > + strcpy (c, d); e = c + dl; > + strinfo(a) -> strinfo(c) -> strinfo(e) > + All have ->first field equal to strinfo(a)->idx and are doubly > + chained through prev/next fields. The later strinfos are required > + to point into the same string with zero or more bytes after > + the previous pointer and all bytes in between the two pointers > + must be non-zero. Functions like strcpy or memcpy are supposed > + to adjust all previous strinfo lengths, but not following strinfo > + lengths (those are uncertain, usually invalidated during > + maybe_invalidate, except when the alias oracle knows better). > + Functions like strcat on the other side adjust the whole > + related strinfo chain. > + They are updated lazily, so to use the chain the same first fields > + and si->prev->next == si->idx needs to be verified. */ > + int first; > + int next; > + int prev; > + /* A flag for the next maybe_invalidate that this strinfo shouldn't > + be invalidated. Always cleared by maybe_invalidate. */ > + bool dont_invalidate; > +} *strinfo; > +DEF_VEC_P(strinfo); > +DEF_VEC_ALLOC_P(strinfo,heap); > + > +/* Pool for allocating strinfo_struct entries. */ > +static alloc_pool strinfo_pool; > + > +/* Vector mapping positive string indexes to strinfo, for the > + current basic block. The first pointer in the vector is special, > + it is either NULL, meaning the vector isn't shared, or it is > + a basic block pointer to the owner basic_block if shared. > + If some other bb wants to modify the vector, the vector needs > + to be unshared first, and only the owner bb is supposed to free it. */ > +static VEC(strinfo, heap) *stridx_to_strinfo; > + > +struct stridxlist > +{ > + struct stridxlist *next; > + HOST_WIDE_INT offset; > + int idx; > +}; > + > +struct decl_stridxlist_map > +{ > + struct tree_map_base base; > + struct stridxlist list; > +}; > + > +/* Hash table for mapping decls to a chained list of offset -> idx > + mappings. */ > +static htab_t decl_to_stridxlist_htab; > + > +/* Hash a from tree in a decl_stridxlist_map. */ > + > +static unsigned int > +decl_to_stridxlist_hash (const void *item) > +{ > + return DECL_UID (((const struct decl_stridxlist_map *) item)->base.from); > +} > + > +/* Free a decl_stridxlist_map. Callback for htab_delete. */ > + > +static void > +decl_to_stridxlist_free (void *item) > +{ > + struct stridxlist *next; > + struct stridxlist *list = ((struct decl_stridxlist_map *) item)->list.next; > + > + while (list) > + { > + next = list->next; > + XDELETE (list); > + list = next; > + } > + XDELETE (item); Maybe use an obstack or alloc-pool dependent on re-use? > +} > + > +/* Return string index for EXP. */ > + > +static int > +get_stridx (tree exp) > +{ > + tree l; > + > + if (TREE_CODE (exp) == SSA_NAME) > + return ssa_ver_to_stridx[SSA_NAME_VERSION (exp)]; > + > + if (TREE_CODE (exp) == ADDR_EXPR && decl_to_stridxlist_htab) > + { > + HOST_WIDE_INT off; > + tree base = get_addr_base_and_unit_offset (TREE_OPERAND (exp, 0), > + &off); > + if (base && DECL_P (base)) > + { > + struct decl_stridxlist_map ent, *e; > + ent.base.from = base; > + e = (struct decl_stridxlist_map *) > + htab_find_with_hash (decl_to_stridxlist_htab, &ent, > + DECL_UID (base)); > + if (e) > + { > + struct stridxlist *list = &e->list; > + do > + { > + if (list->offset == off) > + return list->idx; > + list = list->next; > + } > + while (list); > + } > + } > + } > + > + l = c_strlen (exp, 0); > + if (l != NULL_TREE > + && host_integerp (l, 1)) > + { > + unsigned HOST_WIDE_INT len = tree_low_cst (l, 1); > + if (len == (unsigned int) len > + && (int) len >= 0) > + return ~(int) len; > + } > + return 0; > +} > + > +/* Return true if strinfo vector is shared with the immediate dominator. */ > + > +static inline bool > +strinfo_shared (void) > +{ > + return VEC_length (strinfo, stridx_to_strinfo) > + && VEC_index (strinfo, stridx_to_strinfo, 0) != NULL; > +} > + > +/* Unshare strinfo vector that is shared with the immediate dominator. */ > + > +static void > +unshare_strinfo_vec (void) > +{ > + strinfo si; > + unsigned int i = 0; > + > + gcc_assert (strinfo_shared ()); > + stridx_to_strinfo = VEC_copy (strinfo, heap, stridx_to_strinfo); > + for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i) > + if (si != NULL) > + si->refcount++; > + VEC_replace (strinfo, stridx_to_strinfo, 0, NULL); > +} > + > +/* Attempt to create a string index for ADDR_EXPR exp. > + Return a pointer to the location where the string index can > + be stored (if 0) or is stored, or NULL if this can't be tracked. */ > + > +static int * > +addr_stridxptr (tree exp) > +{ > + void **slot; > + struct decl_stridxlist_map ent; > + struct stridxlist *list; > + HOST_WIDE_INT off; > + > + tree base = get_addr_base_and_unit_offset (TREE_OPERAND (exp, 0), &off); > + if (base == NULL_TREE || !DECL_P (base)) > + return NULL; > + > + if (decl_to_stridxlist_htab == NULL) > + decl_to_stridxlist_htab > + = htab_create (64, decl_to_stridxlist_hash, tree_map_base_eq, > + decl_to_stridxlist_free); > + ent.base.from = base; > + slot = htab_find_slot_with_hash (decl_to_stridxlist_htab, &ent, > + DECL_UID (base), INSERT); > + if (*slot) > + { > + int i; > + list = &((struct decl_stridxlist_map *)*slot)->list; > + for (i = 0; i < 16; i++) > + { > + if (list->offset == off) > + return &list->idx; > + if (list->next == NULL) > + break; > + } > + if (i == 16) > + return NULL; > + list->next = XNEW (struct stridxlist); > + list = list->next; > + } > + else > + { > + struct decl_stridxlist_map *e = XNEW (struct decl_stridxlist_map); > + e->base.from = base; > + *slot = (void *) e; > + list = &e->list; > + } > + list->next = NULL; > + list->offset = off; > + list->idx = 0; > + return &list->idx; > +} > + > +/* Create a new string index, or return 0 if reached limit. */ > + > +static int > +new_stridx (tree exp) > +{ > + int idx; > + if (max_stridx == 1000) I suppose make this a #define or --param > + return 0; > + if (TREE_CODE (exp) == SSA_NAME) > + { > + idx = max_stridx++; > + ssa_ver_to_stridx[SSA_NAME_VERSION (exp)] = idx; > + return idx; > + } > + if (TREE_CODE (exp) == ADDR_EXPR) > + { > + int *pidx = addr_stridxptr (exp); > + if (pidx != NULL) > + { > + gcc_assert (*pidx == 0); > + *pidx = max_stridx++; > + return *pidx; > + } > + } > + return 0; > +} > + > +/* Create a new strinfo. */ > + > +static strinfo > +new_strinfo (tree ptr, int idx, tree length) > +{ > + strinfo si = (strinfo) pool_alloc (strinfo_pool); > + si->length = length; > + si->ptr = ptr; > + si->refcount = 1; > + si->idx = idx; > + si->first = 0; > + si->prev = 0; > + si->next = 0; > + si->dont_invalidate = false; > + return si; > +} > + > +/* Decrease strinfo refcount and free it if not referenced anymore. */ > + > +static inline void > +free_strinfo (strinfo si) > +{ > + if (si && --si->refcount == 0) > + pool_free (strinfo_pool, si); > +} > + > +/* Return strinfo vector entry IDX. */ > + > +static inline strinfo > +get_strinfo (int idx) > +{ > + if (VEC_length (strinfo, stridx_to_strinfo) <= (unsigned int) idx) > + return NULL; > + return VEC_index (strinfo, stridx_to_strinfo, idx); > +} > + > +/* Set strinfo in the vector entry IDX to SI. */ > + > +static inline void > +set_strinfo (int idx, strinfo si) > +{ > + if (VEC_length (strinfo, stridx_to_strinfo) && VEC_index (strinfo, stridx_to_strinfo, 0)) > + unshare_strinfo_vec (); > + if (VEC_length (strinfo, stridx_to_strinfo) <= (unsigned int) idx) > + VEC_safe_grow_cleared (strinfo, heap, stridx_to_strinfo, idx + 1); > + VEC_replace (strinfo, stridx_to_strinfo, idx, si); > +} > + > +/* Invalidate string length information for strings whose length > + might change due to stores in stmt. */ > + > +static bool > +maybe_invalidate (gimple stmt) > +{ > + strinfo si; > + unsigned int i; > + bool nonempty = false; > + > + for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i) > + if (si != NULL) > + { > + if (!si->dont_invalidate) > + { > + ao_ref r; > + ao_ref_init_from_ptr_and_size (&r, si->ptr, NULL_TREE); > + if (stmt_may_clobber_ref_p_1 (stmt, &r)) > + { > + set_strinfo (i, NULL); > + free_strinfo (si); > + continue; > + } > + } > + si->dont_invalidate = false; > + nonempty = true; > + } > + return nonempty; > +} > + > +/* Unshare strinfo record SI, if it has recount > 1 or > + if stridx_to_strinfo vector is shared with some other > + bbs. */ > + > +static strinfo > +unshare_strinfo (strinfo si) > +{ > + strinfo nsi; > + > + if (si->refcount == 1 && !strinfo_shared ()) > + return si; > + > + nsi = new_strinfo (si->ptr, si->idx, si->length); > + nsi->first = si->first; > + nsi->prev = si->prev; > + nsi->next = si->next; > + set_strinfo (si->idx, nsi); > + free_strinfo (si); > + return nsi; > +} > + > +/* Return first strinfo in the related strinfo chain > + if all strinfos in between belong to the chain, otherwise > + NULL. */ > + > +static strinfo > +verify_related_strinfos (strinfo origsi) > +{ > + strinfo si = origsi, psi; > + > + if (origsi->first == 0) > + return NULL; > + for (; si->prev; si = psi) > + { > + if (si->first != origsi->first) > + return NULL; > + psi = get_strinfo (si->prev); > + if (psi == NULL) > + return NULL; > + if (psi->next != si->idx) > + return NULL; > + } > + if (si->idx != si->first) > + return NULL; > + return si; > +} > + > +/* Note that PTR, a pointer SSA_NAME initialized in the current stmt, points > + to a zero-length string and if possible chain it to a related strinfo > + chain whose part is or might be CHAINSI. */ > + > +static strinfo > +zero_length_string (tree ptr, strinfo chainsi) > +{ > + strinfo si; > + int idx; > + gcc_checking_assert (TREE_CODE (ptr) == SSA_NAME > + && get_stridx (ptr) == 0); > + > + if (chainsi != NULL) > + { > + if (verify_related_strinfos (chainsi)) > + { > + for (; chainsi->next; chainsi = si) > + { > + si = get_strinfo (chainsi->next); > + if (si == NULL > + || si->first != chainsi->first > + || si->prev != chainsi->idx) > + break; > + } > + if (integer_zerop (chainsi->length)) > + { > + if (chainsi->next) > + { > + chainsi = unshare_strinfo (chainsi); > + chainsi->next = 0; > + } > + ssa_ver_to_stridx [SSA_NAME_VERSION (ptr)] = chainsi->idx; > + return chainsi; > + } > + } > + else if (chainsi->first || chainsi->prev || chainsi->next) > + { > + chainsi = unshare_strinfo (chainsi); > + chainsi->first = 0; > + chainsi->prev = 0; > + chainsi->next = 0; > + } > + } > + idx = new_stridx (ptr); > + if (idx == 0) > + return NULL; > + si = new_strinfo (ptr, idx, build_int_cst (size_type_node, 0)); > + set_strinfo (idx, si); > + if (chainsi != NULL) > + { > + chainsi = unshare_strinfo (chainsi); > + if (chainsi->first == 0) > + chainsi->first = chainsi->idx; > + chainsi->next = idx; > + si->prev = chainsi->idx; > + si->first = chainsi->first; > + } > + return si; > +} > + > +/* For strinfo ORIGSI whose length has been just updated > + update also related strinfo lengths (add ADJ to each, > + but don't adjust ORIGSI). */ > + > +static void > +adjust_related_strinfos (location_t loc, strinfo origsi, tree adj) > +{ > + strinfo si = verify_related_strinfos (origsi); > + > + if (si == NULL) > + return; > + > + while (1) > + { > + strinfo nsi; > + > + if (si != origsi) > + { > + tree tem; > + > + si = unshare_strinfo (si); > + tem = fold_convert_loc (loc, TREE_TYPE (si->length), adj); > + si->length = fold_build2_loc (loc, PLUS_EXPR, > + TREE_TYPE (si->length), si->length, > + tem); > + si->dont_invalidate = true; > + } > + if (si->next == 0) > + return; > + nsi = get_strinfo (si->next); > + if (nsi == NULL > + || nsi->first != si->first > + || nsi->prev != si->idx) > + return; > + si = nsi; > + } > +} > + > +/* Find if there are other SSA_NAME pointers equal to PTR > + for which we don't track their string lengths yet. If so, use > + IDX for them. */ > + > +static void > +find_equal_ptrs (tree ptr, int idx) > +{ > + if (TREE_CODE (ptr) != SSA_NAME) > + return; > + while (1) > + { > + gimple stmt = SSA_NAME_DEF_STMT (ptr); > + if (!gimple_assign_single_p (stmt) > + && (!gimple_assign_cast_p (stmt) > + || !POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))) > + return; I'd prefer postive checks to guard code below. So, you're handling SSA name copies, conversions from pointers and assignments from invariants. You could simply do if (!is+gimple_assign (stmt)) return; switch (gimple_assign_rhs_code (stmt)) { case SSA_NAME: .. case ADDR_EXPR: .. CASE_CONVERT: ... default: return; } > + ptr = gimple_assign_rhs1 (stmt); > + if (TREE_CODE (ptr) != SSA_NAME) > + { > + if (TREE_CODE (ptr) == ADDR_EXPR) > + { > + int *pidx = addr_stridxptr (ptr); > + if (pidx != NULL && *pidx == 0) > + *pidx = idx; > + return; > + } > + return; > + } > + if (ssa_ver_to_stridx[SSA_NAME_VERSION (ptr)] != 0) > + return; > + ssa_ver_to_stridx[SSA_NAME_VERSION (ptr)] = idx; > + } > +} > + > +/* Handle a strlen call. If strlen of the argument is known, replace > + the strlen call with the known value, otherwise remember that strlen > + of the argument is stored in the lhs SSA_NAME. */ > + > +static void > +handle_builtin_strlen (gimple_stmt_iterator *gsi) > +{ > + int idx; > + tree src; > + gimple stmt = gsi_stmt (*gsi); > + tree lhs = gimple_call_lhs (stmt); > + > + if (lhs == NULL_TREE) > + return; > + > + src = gimple_call_arg (stmt, 0); > + idx = get_stridx (src); > + if (idx) > + { > + if (idx < 0) > + { > + tree rhs = build_int_cst (TREE_TYPE (lhs), ~idx); > + if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + { > + fprintf (dump_file, "Optimizing: "); > + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); > + } > + if (update_call_from_tree (gsi, rhs)) > + { > + update_stmt (gsi_stmt (*gsi)); > + if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + { > + fprintf (dump_file, "into: "); > + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM); > + } > + } > + else if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + fprintf (dump_file, "not possible.\n"); > + } > + else > + { > + strinfo si = get_strinfo (idx); > + if (si != NULL) > + { > + tree rhs = si->length; > + if (!useless_type_conversion_p (TREE_TYPE (lhs), > + TREE_TYPE (rhs))) > + rhs = fold_convert_loc (gimple_location (stmt), > + TREE_TYPE (lhs), si->length); > + if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + { > + fprintf (dump_file, "Optimizing: "); > + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); > + } > + if (!update_call_from_tree (gsi, rhs)) > + { > + rhs = force_gimple_operand_gsi (gsi, rhs, true, NULL_TREE, > + true, GSI_SAME_STMT); > + if (!update_call_from_tree (gsi, rhs)) if update_call_from_tree fails then gimplify_and_update_call_from_tree will always succeed. See gimple_fold_call. > + { > + if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + fprintf (dump_file, "not possible.\n"); > + return; > + } > + } > + update_stmt (gsi_stmt (*gsi)); > + if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + { > + fprintf (dump_file, "into: "); > + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM); > + } > + if (TREE_CODE (si->length) != SSA_NAME > + && TREE_CODE (si->length) != INTEGER_CST > + && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) > + { > + si = unshare_strinfo (si); > + si->length = lhs; > + } > + } > + } > + } > + else if (!SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) > + { > + idx = new_stridx (src); > + if (idx) > + { > + strinfo si = new_strinfo (src, idx, lhs); > + set_strinfo (idx, si); > + find_equal_ptrs (src, idx); > + } > + } > +} > + > +/* Handle a strcpy-like ({st{r,p}cpy,__st{r,p}cpy_chk}) call. > + If strlen of the second argument is known, strlen of the first argument > + is the same after this call. Furthermore, attempt to convert it to > + memcpy. */ > + > +static void > +handle_builtin_strcpy (enum built_in_function bcode, gimple_stmt_iterator *gsi) > +{ > + int idx, didx; > + tree src, dst, len, lhs, rhs, args, type, fn, oldlen; > + gimple stmt = gsi_stmt (*gsi); > + strinfo si, dsi, olddsi, zsi; > + location_t loc; > + > + src = gimple_call_arg (stmt, 1); > + dst = gimple_call_arg (stmt, 0); > + idx = get_stridx (src); > + if (idx <= 0) > + return; > + > + si = get_strinfo (idx); > + if (si == NULL) > + return; > + > + didx = get_stridx (dst); > + olddsi = NULL; > + oldlen = NULL_TREE; > + if (didx > 0) > + olddsi = get_strinfo (didx); > + else if (didx < 0) > + return; > + else > + { > + didx = new_stridx (dst); > + if (didx == 0) > + return; > + } > + if (olddsi != NULL) > + { > + oldlen = olddsi->length; > + dsi = unshare_strinfo (olddsi); > + dsi->length = si->length; > + /* Break the chain, so adjust_related_strinfo on later pointers in > + the chain won't adjust this one anymore. */ > + dsi->next = 0; > + } > + else > + { > + dsi = new_strinfo (dst, didx, si->length); > + set_strinfo (didx, dsi); > + find_equal_ptrs (dst, didx); > + } > + dsi->dont_invalidate = true; > + loc = gimple_location (stmt); > + if (olddsi != NULL) > + { > + tree adj = NULL_TREE; > + if (integer_zerop (oldlen)) > + adj = si->length; > + else if (TREE_CODE (oldlen) == INTEGER_CST > + || TREE_CODE (si->length) == INTEGER_CST) > + adj = fold_build2_loc (loc, MINUS_EXPR, > + TREE_TYPE (si->length), si->length, > + fold_convert_loc (loc, TREE_TYPE (si->length), > + oldlen)); > + if (adj != NULL_TREE) > + adjust_related_strinfos (loc, dsi, adj); > + } > + /* strcpy src may not overlap dst, so src doesn't need to be > + invalidated either. */ > + si->dont_invalidate = true; > + > + lhs = gimple_call_lhs (stmt); > + fn = NULL_TREE; > + zsi = NULL; > + switch (bcode) > + { > + case BUILT_IN_STRCPY: > + fn = implicit_built_in_decls[BUILT_IN_MEMCPY]; > + if (lhs) > + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = didx; > + break; > + case BUILT_IN_STRCPY_CHK: > + fn = built_in_decls[BUILT_IN_MEMCPY_CHK]; > + if (lhs) > + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = didx; > + break; > + case BUILT_IN_STPCPY: > + /* This would need adjustment of the lhs (subtract one), > + or detection that the trailing '\0' doesn't need to be > + written, if it will be immediately overwritten. > + fn = built_in_decls[BUILT_IN_MEMPCPY]; */ > + if (lhs) > + zsi = zero_length_string (lhs, dsi); > + break; > + case BUILT_IN_STPCPY_CHK: > + /* This would need adjustment of the lhs (subtract one), > + or detection that the trailing '\0' doesn't need to be > + written, if it will be immediately overwritten. > + fn = built_in_decls[BUILT_IN_MEMPCPY_CHK]; */ > + if (lhs) > + zsi = zero_length_string (lhs, dsi); > + break; > + default: > + gcc_unreachable (); > + } > + if (zsi != NULL) > + zsi->dont_invalidate = true; > + > + if (fn == NULL_TREE) > + return; > + > + args = TYPE_ARG_TYPES (TREE_TYPE (fn)); > + type = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (args))); > + > + len = fold_convert_loc (loc, type, si->length); > + len = fold_build2_loc (loc, PLUS_EXPR, type, len, build_int_cst (type, 1)); > + len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true, > + GSI_SAME_STMT); > + if (gimple_call_num_args (stmt) == 2) > + rhs = build_call_expr_loc (loc, fn, 3, dst, src, len); > + else > + rhs = build_call_expr_loc (loc, fn, 4, dst, src, len, > + gimple_call_arg (stmt, 2)); > + if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + { > + fprintf (dump_file, "Optimizing: "); > + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); > + } > + if (update_call_from_tree (gsi, rhs)) Btw, it would be nice if you'd not use build_call_expr and then gimplify the call but instead build a new gimple call directly ... or modify it in-place. > + { > + update_stmt (gsi_stmt (*gsi)); > + if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + { > + fprintf (dump_file, "into: "); > + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM); > + } > + } > + else if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + fprintf (dump_file, "not possible.\n"); > +} > + > +/* Handle a memcpy-like ({mem{,p}cpy,__mem{,p}cpy_chk}) call. > + If strlen of the second argument is known and length of the third argument > + is that plus one, strlen of the first argument is the same after this > + call. */ > + > +static void > +handle_builtin_memcpy (enum built_in_function bcode, gimple_stmt_iterator *gsi) > +{ > + int idx, didx; > + tree src, dst, len, lhs, oldlen, newlen; > + gimple stmt = gsi_stmt (*gsi); > + strinfo si, dsi, olddsi; > + > + len = gimple_call_arg (stmt, 2); > + src = gimple_call_arg (stmt, 1); > + dst = gimple_call_arg (stmt, 0); > + idx = get_stridx (src); > + if (idx == 0) > + return; > + > + if (idx > 0) > + { > + gimple def_stmt; > + > + /* Handle memcpy (x, y, l) where l is strlen (y) + 1. */ > + si = get_strinfo (idx); > + if (si == NULL) > + return; > + if (TREE_CODE (len) != SSA_NAME) > + return; > + def_stmt = SSA_NAME_DEF_STMT (len); > + if (!is_gimple_assign (def_stmt) > + || gimple_assign_rhs_code (def_stmt) != PLUS_EXPR > + || gimple_assign_rhs1 (def_stmt) != si->length > + || !integer_onep (gimple_assign_rhs2 (def_stmt))) > + return; > + } > + else > + { > + si = NULL; > + /* Handle memcpy (x, "abcd", 5) or > + memcpy (x, "abc\0uvw", 7). */ > + if (!host_integerp (len, 1) > + || (unsigned HOST_WIDE_INT) tree_low_cst (len, 1) > + <= (unsigned HOST_WIDE_INT) ~idx) > + return; > + } > + > + didx = get_stridx (dst); > + olddsi = NULL; > + if (didx > 0) > + olddsi = get_strinfo (didx); > + else if (didx < 0) > + return; > + else > + { > + didx = new_stridx (dst); > + if (didx == 0) > + return; > + } > + if (si != NULL) > + newlen = si->length; > + else > + newlen = build_int_cst (TREE_TYPE (len), ~idx); > + oldlen = NULL_TREE; > + if (olddsi != NULL) > + { > + dsi = unshare_strinfo (olddsi); > + oldlen = olddsi->length; > + dsi->length = newlen; > + /* Break the chain, so adjust_related_strinfo on later pointers in > + the chain won't adjust this one anymore. */ > + dsi->next = 0; > + } > + else > + { > + dsi = new_strinfo (dst, didx, newlen); > + set_strinfo (didx, dsi); > + find_equal_ptrs (dst, didx); > + } > + dsi->dont_invalidate = true; > + if (olddsi != NULL) > + { > + tree adj = NULL_TREE; > + location_t loc = gimple_location (stmt); > + if (integer_zerop (oldlen)) > + adj = dsi->length; > + else if (TREE_CODE (oldlen) == INTEGER_CST > + || TREE_CODE (dsi->length) == INTEGER_CST) > + adj = fold_build2_loc (loc, MINUS_EXPR, > + TREE_TYPE (dsi->length), dsi->length, > + fold_convert_loc (loc, TREE_TYPE (dsi->length), > + oldlen)); > + if (adj != NULL_TREE) > + adjust_related_strinfos (loc, dsi, adj); > + } > + /* memcpy src may not overlap dst, so src doesn't need to be > + invalidated either. */ > + if (si != NULL) > + si->dont_invalidate = true; > + > + lhs = gimple_call_lhs (stmt); > + switch (bcode) > + { > + case BUILT_IN_MEMCPY: > + case BUILT_IN_MEMCPY_CHK: > + if (lhs) > + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = didx; > + break; > + case BUILT_IN_MEMPCPY: > + case BUILT_IN_MEMPCPY_CHK: > + break; > + default: > + gcc_unreachable (); > + } > +} > + > +/* Handle a strcat-like ({strcat,__strcat_chk}) call. > + If strlen of the second argument is known, strlen of the first argument > + is increased by the length of the second argument. Furthermore, attempt > + to convert it to memcpy/strcpy if the length of the first argument > + is known. */ > + > +static void > +handle_builtin_strcat (enum built_in_function bcode, gimple_stmt_iterator *gsi) > +{ > + int idx, didx; > + tree src, dst, dstlen, len, lhs, rhs, args, type, fn, objsz; > + gimple stmt = gsi_stmt (*gsi); > + strinfo si, dsi; > + location_t loc; > + > + src = gimple_call_arg (stmt, 1); > + dst = gimple_call_arg (stmt, 0); > + > + didx = get_stridx (dst); > + if (didx <= 0) > + return; > + > + dsi = get_strinfo (didx); > + if (dsi == NULL) > + return; > + > + /* This should have been folded, don't handle it here. */ > + idx = get_stridx (src); > + if (idx < 0) > + return; > + > + si = NULL; > + if (idx) > + si = get_strinfo (idx); > + > + loc = gimple_location (stmt); > + dstlen = dsi->length; > + if (si != NULL) > + { > + dsi = unshare_strinfo (dsi); > + dsi->dont_invalidate = true; > + dsi->length = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (dsi->length), > + dsi->length, si->length); > + adjust_related_strinfos (loc, dsi, dstlen); > + } > + else > + { > + set_strinfo (didx, NULL); > + free_strinfo (dsi); > + } > + if (si != NULL) > + /* strcat src may not overlap dst, so src doesn't need to be > + invalidated either. */ > + si->dont_invalidate = true; > + > + lhs = gimple_call_lhs (stmt); > + /* For now. Could remove the lhs from the call and add > + lhs = dst; afterwards. */ > + if (lhs) > + return; > + > + fn = NULL_TREE; > + objsz = NULL_TREE; > + switch (bcode) > + { > + case BUILT_IN_STRCAT: > + if (si) > + fn = implicit_built_in_decls[BUILT_IN_MEMCPY]; > + else > + fn = implicit_built_in_decls[BUILT_IN_STRCPY]; > + break; > + case BUILT_IN_STRCAT_CHK: > + if (si) > + fn = built_in_decls[BUILT_IN_MEMCPY_CHK]; > + else > + fn = built_in_decls[BUILT_IN_STRCPY_CHK]; > + objsz = gimple_call_arg (stmt, 2); > + break; > + default: > + gcc_unreachable (); > + } > + > + if (fn == NULL_TREE) > + return; > + > + len = NULL_TREE; > + if (si) > + { > + args = TYPE_ARG_TYPES (TREE_TYPE (fn)); > + type = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (args))); > + > + len = fold_convert_loc (loc, type, si->length); > + len = fold_build2_loc (loc, PLUS_EXPR, type, len, > + build_int_cst (type, 1)); > + len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true, > + GSI_SAME_STMT); > + } > + dst = fold_build2_loc (loc, POINTER_PLUS_EXPR, > + TREE_TYPE (dst), dst, > + fold_convert_loc (loc, sizetype, dstlen)); > + dst = force_gimple_operand_gsi (gsi, dst, true, NULL_TREE, true, > + GSI_SAME_STMT); > + if (si) > + rhs = build_call_expr_loc (loc, fn, 3 + (objsz != NULL_TREE), > + dst, src, len, objsz); > + else > + rhs = build_call_expr_loc (loc, fn, 2 + (objsz != NULL_TREE), > + dst, src, objsz); > + if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + { > + fprintf (dump_file, "Optimizing: "); > + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); > + } > + if (update_call_from_tree (gsi, rhs)) > + { > + update_stmt (gsi_stmt (*gsi)); > + if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + { > + fprintf (dump_file, "into: "); > + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM); > + } > + } > + else if (dump_file && (dump_flags & TDF_DETAILS) != 0) > + fprintf (dump_file, "not possible.\n"); > +} > + > +/* Attempt to optimize a single statement at *GSI using string length > + knowledge. */ > + > +static void > +strlen_optimize_stmt (gimple_stmt_iterator *gsi) > +{ > + gimple stmt = gsi_stmt (*gsi); > + tree lhs; > + > + if (is_gimple_call (stmt)) > + { > + tree callee = gimple_call_fndecl (stmt); > + if (callee && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL) > + switch (DECL_FUNCTION_CODE (callee)) > + { > + case BUILT_IN_STRLEN: > + handle_builtin_strlen (gsi); > + break; > + case BUILT_IN_STRCPY: > + case BUILT_IN_STRCPY_CHK: > + case BUILT_IN_STPCPY: > + case BUILT_IN_STPCPY_CHK: > + handle_builtin_strcpy (DECL_FUNCTION_CODE (callee), gsi); > + break; > + case BUILT_IN_MEMCPY: > + case BUILT_IN_MEMCPY_CHK: > + case BUILT_IN_MEMPCPY: > + case BUILT_IN_MEMPCPY_CHK: > + handle_builtin_memcpy (DECL_FUNCTION_CODE (callee), gsi); > + break; > + case BUILT_IN_STRCAT: > + case BUILT_IN_STRCAT_CHK: > + handle_builtin_strcat (DECL_FUNCTION_CODE (callee), gsi); > + break; > + default: > + break; > + } > + } > + else if (is_gimple_assign (stmt) > + && (lhs = gimple_assign_lhs (stmt)) != NULL_TREE > + && TREE_CODE (lhs) == SSA_NAME > + && POINTER_TYPE_P (TREE_TYPE (lhs))) > + { > + if (gimple_assign_single_p (stmt) > + || (gimple_assign_cast_p (stmt) > + && POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))) > + { > + int idx = get_stridx (gimple_assign_rhs1 (stmt)); > + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] = idx; > + } > + else if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR) > + { > + int idx = get_stridx (gimple_assign_rhs1 (stmt)); > + if (idx > 0) > + { > + strinfo si = get_strinfo (idx); > + if (si != NULL) > + { > + tree off = gimple_assign_rhs2 (stmt); > + if (operand_equal_p (si->length, off, 0)) > + zero_length_string (lhs, si); > + else if (TREE_CODE (off) == SSA_NAME) > + { > + gimple def_stmt = SSA_NAME_DEF_STMT (off); > + if (gimple_assign_single_p (def_stmt) > + && operand_equal_p (si->length, > + gimple_assign_rhs1 (def_stmt), > + 0)) > + zero_length_string (lhs, si); > + } > + } > + } > + else if (idx < 0) > + { > + tree off = gimple_assign_rhs2 (stmt); > + if (host_integerp (off, 1) > + && (unsigned HOST_WIDE_INT) tree_low_cst (off, 1) > + <= (unsigned HOST_WIDE_INT) ~idx) > + ssa_ver_to_stridx[SSA_NAME_VERSION (lhs)] > + = ~(int) tree_low_cst (off, 1); > + } > + } > + } > + > + if (gimple_vdef (stmt)) > + maybe_invalidate (stmt); > +} > + > +/* Recursively call maybe_invalidate on stmts that might be executed > + in between dombb and current bb and that contain a vdef. Stop when > + *count stmts are inspected, or if the whole strinfo vector has > + been invalidated. */ > + > +static void > +do_invalidate (basic_block dombb, gimple phi, bitmap visited, int *count) > +{ > + unsigned int i, n = gimple_phi_num_args (phi); > + > + for (i = 0; i < n; i++) > + { > + tree vuse = gimple_phi_arg_def (phi, i); > + gimple stmt = SSA_NAME_DEF_STMT (vuse); > + basic_block bb = gimple_bb (stmt); > + if (bb == NULL > + || !bitmap_set_bit (visited, bb->index) > + || !dominated_by_p (CDI_DOMINATORS, bb, dombb)) > + continue; > + while (1) > + { > + if (gimple_code (stmt) == GIMPLE_PHI) > + { > + do_invalidate (dombb, stmt, visited, count); > + if (*count == 0) > + return; > + break; > + } > + if (--*count == 0) > + return; > + if (!maybe_invalidate (stmt)) > + { > + *count = 0; > + return; > + } > + vuse = gimple_vuse (stmt); > + stmt = SSA_NAME_DEF_STMT (vuse); > + if (gimple_bb (stmt) != bb) > + { > + bb = gimple_bb (stmt); > + if (bb == NULL > + || !bitmap_set_bit (visited, bb->index) > + || !dominated_by_p (CDI_DOMINATORS, bb, dombb)) > + break; > + } > + } > + } > +} > + > +/* Callback for walk_dominator_tree. Attempt to optimize various > + string ops by remembering string lenths pointed by pointer SSA_NAMEs. */ > + > +static void > +strlen_enter_block (struct dom_walk_data *walk_data ATTRIBUTE_UNUSED, > + basic_block bb) > +{ > + gimple_stmt_iterator gsi; > + basic_block dombb = get_immediate_dominator (CDI_DOMINATORS, bb); > + > + if (dombb == NULL) > + stridx_to_strinfo = NULL; > + else > + { > + stridx_to_strinfo = (VEC(strinfo, heap) *) dombb->aux; > + if (stridx_to_strinfo) > + { > + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > + { > + gimple phi = gsi_stmt (gsi); > + if (!is_gimple_reg (gimple_phi_result (phi))) > + { > + bitmap visited = BITMAP_ALLOC (NULL); > + int count_vdef = 100; > + do_invalidate (dombb, phi, visited, &count_vdef); > + BITMAP_FREE (visited); > + break; > + } > + } > + } > + } > + > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > + strlen_optimize_stmt (&gsi); > + > + bb->aux = stridx_to_strinfo; > + if (VEC_length (strinfo, stridx_to_strinfo) && !strinfo_shared ()) > + VEC_replace (strinfo, stridx_to_strinfo, 0, (strinfo) bb); > +} > + > +/* Callback for walk_dominator_tree. Free strinfo vector if it is > + owned by the current bb, clear bb->aux. */ > + > +static void > +strlen_leave_block (struct dom_walk_data *walk_data ATTRIBUTE_UNUSED, > + basic_block bb) > +{ > + if (bb->aux) > + { > + stridx_to_strinfo = (VEC(strinfo, heap) *) bb->aux; > + if (VEC_length (strinfo, stridx_to_strinfo) > + && VEC_index (strinfo, stridx_to_strinfo, 0) == (strinfo) bb) > + { > + unsigned int i; > + strinfo si; > + > + for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i) > + free_strinfo (si); > + VEC_free (strinfo, heap, stridx_to_strinfo); > + } > + bb->aux = NULL; > + } > +} > + > +/* Main entry point. */ > + > +static unsigned int > +tree_ssa_strlen (void) > +{ > + struct dom_walk_data walk_data; > + > + ssa_ver_to_stridx = XCNEWVEC (int, num_ssa_names); > + max_stridx = 1; > + strinfo_pool = create_alloc_pool ("strinfo_struct pool", > + sizeof (struct strinfo_struct), 64); > + > + calculate_dominance_info (CDI_DOMINATORS); > + > + /* String length optimization is implemented as a walk of the dominator > + tree and a forward walk of statements within each block. */ > + walk_data.dom_direction = CDI_DOMINATORS; > + walk_data.initialize_block_local_data = NULL; > + walk_data.before_dom_children = strlen_enter_block; > + walk_data.after_dom_children = strlen_leave_block; > + walk_data.block_local_data_size = 0; > + walk_data.global_data = NULL; > + > + /* Initialize the dominator walker. */ > + init_walk_dominator_tree (&walk_data); > + > + /* Recursively walk the dominator tree. */ > + walk_dominator_tree (&walk_data, ENTRY_BLOCK_PTR); > + > + /* Finalize the dominator walker. */ > + fini_walk_dominator_tree (&walk_data); > + > + XDELETEVEC (ssa_ver_to_stridx); > + free_alloc_pool (strinfo_pool); > + if (decl_to_stridxlist_htab) > + { > + htab_delete (decl_to_stridxlist_htab); > + decl_to_stridxlist_htab = NULL; > + } > + > + return 0; > +} > + > +static bool > +gate_strlen (void) > +{ > + return flag_tree_strlen != 0; > +} Overall this looks good - it feels a bit tree-ish, but I suppose support for non-constant lengths requires this. It would be nice to avoid building new tree calls - yeah, all our call folding still does this ... :/ Thanks, Richard. > +struct gimple_opt_pass pass_strlen = > +{ > + { > + GIMPLE_PASS, > + "strlen", /* name */ > + gate_strlen, /* gate */ > + tree_ssa_strlen, /* execute */ > + NULL, /* sub */ > + NULL, /* next */ > + 0, /* static_pass_number */ > + TV_TREE_STRLEN, /* tv_id */ > + PROP_cfg | PROP_ssa, /* properties_required */ > + 0, /* properties_provided */ > + 0, /* properties_destroyed */ > + 0, /* todo_flags_start */ > + TODO_ggc_collect > + | TODO_verify_ssa /* todo_flags_finish */ > + } > +}; > --- gcc/testsuite/gcc.dg/strlenopt-1.c.jj 2011-09-02 10:39:12.000000000 +0200 > +++ gcc/testsuite/gcc.dg/strlenopt-1.c 2011-09-02 15:54:23.000000000 +0200 > @@ -0,0 +1,45 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2 -fdump-tree-strlen" } */ > + > +#include "strlenopt.h" > + > +__attribute__((noinline, noclone)) char * > +foo (char *p, char *r) > +{ > + char *q = malloc (strlen (p) + strlen (r) + 64); > + if (q == NULL) return NULL; > + /* This strcpy can be optimized into memcpy, using the remembered > + strlen (p). */ > + strcpy (q, p); > + /* These two strcat can be optimized into memcpy. The first one > + could be even optimized into a *ptr = '/'; store as the '\0' > + is immediately overwritten. */ > + strcat (q, "/"); > + strcat (q, "abcde"); > + /* Due to inefficient PTA (PR50262) the above calls invalidate > + string length of r, so it is optimized just into strcpy instead > + of memcpy. */ > + strcat (q, r); > + return q; > +} > + > +int > +main () > +{ > + char *volatile p = "string1"; > + char *volatile r = "string2"; > + char *q = foo (p, r); > + if (q != NULL) > + { > + if (strcmp (q, "string1/abcdestring2")) > + abort (); > + free (q); > + } > + return 0; > +} > + > +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */ > +/* { dg-final { scan-tree-dump-times "memcpy \\(" 3 "strlen" } } */ > +/* { dg-final { scan-tree-dump-times "strcpy \\(" 1 "strlen" } } */ > +/* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */ > +/* { dg-final { cleanup-tree-dump "strlen" } } */ > --- gcc/testsuite/gcc.dg/strlenopt-2.c.jj 2011-09-02 15:31:33.000000000 +0200 > +++ gcc/testsuite/gcc.dg/strlenopt-2.c 2011-09-02 15:54:28.000000000 +0200 > @@ -0,0 +1,47 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2 -fdump-tree-strlen" } */ > + > +#include "strlenopt.h" > + > +__attribute__((noinline, noclone)) char * > +foo (char *p, char *r) > +{ > + char buf[26]; > + if (strlen (p) + strlen (r) + 9 > 26) > + return NULL; > + /* This strcpy can be optimized into memcpy, using the remembered > + strlen (p). */ > + strcpy (buf, p); > + /* These two strcat can be optimized into memcpy. The first one > + could be even optimized into a *ptr = '/'; store as the '\0' > + is immediately overwritten. */ > + strcat (buf, "/"); > + strcat (buf, "abcde"); > + /* This strcpy can be optimized into memcpy, using the remembered > + strlen (r). */ > + strcat (buf, r); > + /* And this can be optimized into memcpy too. */ > + strcat (buf, "fg"); > + return strdup (buf); > +} > + > +int > +main () > +{ > + char *volatile p = "string1"; > + char *volatile r = "string2"; > + char *q = foo (p, r); > + if (q != NULL) > + { > + if (strcmp (q, "string1/abcdestring2fg")) > + abort (); > + free (q); > + } > + return 0; > +} > + > +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */ > +/* { dg-final { scan-tree-dump-times "memcpy \\(" 5 "strlen" } } */ > +/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */ > +/* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */ > +/* { dg-final { cleanup-tree-dump "strlen" } } */ > --- gcc/testsuite/gcc.dg/strlenopt-3.c.jj 2011-09-02 11:18:52.000000000 +0200 > +++ gcc/testsuite/gcc.dg/strlenopt-3.c 2011-09-02 15:54:32.000000000 +0200 > @@ -0,0 +1,63 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2 -fdump-tree-strlen -fdump-tree-optimized" } */ > + > +#include "strlenopt.h" > + > +__attribute__((noinline, noclone)) size_t > +fn1 (char *p, char *q) > +{ > + size_t s = strlen (q); > + strcpy (p, q); > + return s - strlen (p); > +} > + > +__attribute__((noinline, noclone)) size_t > +fn2 (char *p, char *q) > +{ > + size_t s = strlen (q); > + memcpy (p, q, s + 1); > + return s - strlen (p); > +} > + > +__attribute__((noinline, noclone)) size_t > +fn3 (char *p) > +{ > + memcpy (p, "abcd", 5); > + return strlen (p); > +} > + > +__attribute__((noinline, noclone)) size_t > +fn4 (char *p) > +{ > + memcpy (p, "efg\0hij", 6); > + return strlen (p); > +} > + > +int > +main () > +{ > + char buf[64]; > + char *volatile p = buf; > + char *volatile q = "ABCDEF"; > + buf[7] = 'G'; > + if (fn1 (p, q) != 0 || memcmp (buf, "ABCDEF\0G", 8)) > + abort (); > + q = "HIJ"; > + if (fn2 (p + 1, q) != 0 || memcmp (buf, "AHIJ\0F\0G", 8)) > + abort (); > + buf[6] = 'K'; > + if (fn3 (p + 1) != 4 || memcmp (buf, "Aabcd\0KG", 8)) > + abort (); > + if (fn4 (p) != 3 || memcmp (buf, "efg\0hiKG", 8)) > + abort (); > + return 0; > +} > + > +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */ > +/* { dg-final { scan-tree-dump-times "memcpy \\(" 4 "strlen" } } */ > +/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */ > +/* { dg-final { cleanup-tree-dump "strlen" } } */ > +/* { dg-final { scan-tree-dump-times "return 0" 3 "optimized" } } */ > +/* { dg-final { scan-tree-dump-times "return 4" 1 "optimized" } } */ > +/* { dg-final { scan-tree-dump-times "return 3" 1 "optimized" } } */ > +/* { dg-final { cleanup-tree-dump "optimized" } } */ > --- gcc/testsuite/gcc.dg/strlenopt.h.jj 2011-09-02 10:36:04.000000000 +0200 > +++ gcc/testsuite/gcc.dg/strlenopt.h 2011-09-02 12:39:08.000000000 +0200 > @@ -0,0 +1,57 @@ > +/* This is a replacement of needed parts from stdlib.h and string.h > + for -ftree-strlen testing, to ensure we are testing the builtins > + rather than whatever the OS has in its headers. */ > + > +#define NULL ((void *) 0) > +typedef __SIZE_TYPE__ size_t; > +extern void abort (void); > +void *malloc (size_t); > +void free (void *); > +char *strdup (const char *); > +size_t strlen (const char *); > +void *memcpy (void *__restrict, const void *__restrict, size_t); > +char *strcpy (char *__restrict, const char *__restrict); > +char *strcat (char *__restrict, const char *__restrict); > +int memcmp (const void *, const void *, size_t); > +int strcmp (const char *, const char *); > +#ifdef USE_GNU > +void *mempcpy (void *__restrict, const void *__restrict, size_t); > +char *stpcpy (char *__restrict, const char *__restrict); > +#endif > + > +#if defined(FORTIFY_SOURCE) && FORTIFY_SOURCE > 0 && __OPTIMIZE__ > +# define bos(ptr) __builtin_object_size (x, FORTIFY_SOURCE > 0) > +# define bos0(ptr) __builtin_object_size (x, 0) > + > +extern inline __attribute__((gnu_inline, always_inline, artificial)) void * > +memcpy (void *__restrict dest, const void *__restrict src, size_t len) > +{ > + return __builtin___memcpy_chk (dest, src, len, bos0 (dest)); > +} > + > +extern inline __attribute__((gnu_inline, always_inline, artificial)) char * > +strcpy (char *__restrict dest, const char *__restrict src) > +{ > + return __builtin___strcpy_chk (dest, src, bos (dest)); > +} > + > +extern inline __attribute__((gnu_inline, always_inline, artificial)) char * > +strcat (char *__restrict dest, const char *__restrict src) > +{ > + return __builtin___strcat_chk (dest, src, bos (dest)); > +} > + > +# ifdef USE_GNU > +extern inline __attribute__((gnu_inline, always_inline, artificial)) void * > +mempcpy (void *__restrict dest, const void *__restrict src, size_t len) > +{ > + return __builtin___mempcpy_chk (dest, src, len, bos0 (dest)); > +} > + > +extern inline __attribute__((gnu_inline, always_inline, artificial)) char * > +stpcpy (char *__restrict dest, const char *__restrict src) > +{ > + return __builtin___stpcpy_chk (dest, src, bos (dest)); > +} > +# endif > +#endif > > Jakub > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC, WIP] tree-ssa-strlen optimization pass 2011-09-05 8:55 ` Richard Guenther @ 2011-09-05 9:41 ` Jakub Jelinek 2011-09-05 9:51 ` Richard Guenther 0 siblings, 1 reply; 8+ messages in thread From: Jakub Jelinek @ 2011-09-05 9:41 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches On Mon, Sep 05, 2011 at 10:54:47AM +0200, Richard Guenther wrote: > > 2011-09-02  Jakub Jelinek  <jakub@redhat.com> > > > >     * common.opt: Add -ftree-strlen option. > > Maybe sth more generic? -foptimize-string-ops? Eventually guard > the existing string op foldings with that flag as well. I don't think other string op foldings should be dependent on that flag, those are already guarded with -fbuiltin-strcpy etc. I think we should have a switch just for this pass, and IMHO -foptimize-string-ops is way too generic, but of course it can be -ftrack-string-lengths or -foptimize-using-string-lengths or similar. > > --- gcc/opts.c.jj    2011-06-30 17:58:03.000000000 +0200 > > +++ gcc/opts.c  2011-09-02 15:53:06.000000000 +0200 > > @@ -484,6 +484,7 @@ static const struct default_options defa > >   { OPT_LEVELS_2_PLUS, OPT_falign_jumps, NULL, 1 }, > >   { OPT_LEVELS_2_PLUS, OPT_falign_labels, NULL, 1 }, > >   { OPT_LEVELS_2_PLUS, OPT_falign_functions, NULL, 1 }, > > +   { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_ftree_strlen, NULL, 1 }, > > Why not -Os? Doesn't it remove strlen calls? Sometimes it does (though rarer than e.g. with -O2, because with -O2 we e.g. fold strcat with constant literal as second argument to strlen plus memcpy, but don't do that for -Os). On the other side it transforms strcpy calls into memcpy (where the latter is usually longer), etc. And strcat is almost always shorter too. So, an alternative would be for -Os enable this pass, but only do strlen removal in it (in addition to tracking lengths), nothing else. Even the strlen removal can grow the size if the string length expression isn't constant, or variable, or arithmetic operation on two operands. > > +/* Array indexed by SSA_NAME_VERSION.  0 means unknown, positive value > > +  is an index into strinfo vector, negative value stands for > > +  string length of a string literal (~strlen).  */ > > +static int *ssa_ver_to_stridx; > > + > > +/* Number of currently active string indexes plus one.  */ > > +static int max_stridx; > > USe a VEC? For what? ssa_ver_to_stridx isn't growing, we shouldn't use it on the SSA_NAMEs added during the pass. And max_stridx isn't the length of the ssa_ver_to_stridx array, but how many string indexes are in use (i.e. maximum current length of stridx_to_strinfo vector). > > +/* Free a decl_stridxlist_map.  Callback for htab_delete.  */ > > + > > +static void > > +decl_to_stridxlist_free (void *item) > > +{ > > +  struct stridxlist *next; > > +  struct stridxlist *list = ((struct decl_stridxlist_map *) item)->list.next; > > + > > +  while (list) > > +   { > > +    next = list->next; > > +    XDELETE (list); > > +    list = next; > > +   } > > +  XDELETE (item); > > Maybe use an obstack or alloc-pool dependent on re-use? The number of those is limited by the string index limit, so it doesn't grow without bounds. It isn't reused, so yes, obstack could be used and avoid the above routine. If you prefer that, I'll change it. > > +/* Create a new string index, or return 0 if reached limit.  */ > > + > > +static int > > +new_stridx (tree exp) > > +{ > > +  int idx; > > +  if (max_stridx == 1000) > > I suppose make this a #define or --param Yeah. I guess this one could be a --param. > > +static void > > +find_equal_ptrs (tree ptr, int idx) > > +{ > > +  if (TREE_CODE (ptr) != SSA_NAME) > > +   return; > > +  while (1) > > +   { > > +    gimple stmt = SSA_NAME_DEF_STMT (ptr); > > +    if (!gimple_assign_single_p (stmt) > > +     && (!gimple_assign_cast_p (stmt) > > +       || !POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))) > > +    return; > > I'd prefer postive checks to guard code below. So, you're handling > SSA name copies, conversions from pointers and assignments from > invariants. You could simply do > > if (!is+gimple_assign (stmt)) > return; > switch (gimple_assign_rhs_code (stmt)) > { > case SSA_NAME: > .. > case ADDR_EXPR: > .. > CASE_CONVERT: > ... > default: > return; > } Ok, will change that. > > +       if (!update_call_from_tree (gsi, rhs)) > > +        { > > +         rhs = force_gimple_operand_gsi (gsi, rhs, true, NULL_TREE, > > +                         true, GSI_SAME_STMT); > > +         if (!update_call_from_tree (gsi, rhs)) > > if update_call_from_tree fails then gimplify_and_update_call_from_tree > will always succeed. See gimple_fold_call. I saw that call, I was worried that fold_all_builtins pass if gimplify_and_update_call_from_tree is called forces TODO_update_address_taken. rhs will be just a sum of some constants and/or SSA_NAMEs though, and the call here is a builtin that ought to be non-throwing. So do you think just if (!update_call_from_tree gimplify_and_update_call_from_tree is safe for that case? > > +  len = fold_convert_loc (loc, type, si->length); > > +  len = fold_build2_loc (loc, PLUS_EXPR, type, len, build_int_cst (type, 1)); > > +  len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true, > > +                 GSI_SAME_STMT); > > +  if (gimple_call_num_args (stmt) == 2) > > +   rhs = build_call_expr_loc (loc, fn, 3, dst, src, len); > > +  else > > +   rhs = build_call_expr_loc (loc, fn, 4, dst, src, len, > > +                gimple_call_arg (stmt, 2)); > > +  if (dump_file && (dump_flags & TDF_DETAILS) != 0) > > +   { > > +    fprintf (dump_file, "Optimizing: "); > > +    print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); > > +   } > > +  if (update_call_from_tree (gsi, rhs)) > > Btw, it would be nice if you'd not use build_call_expr and then gimplify > the call but instead build a new gimple call directly ... or modify it > in-place. But then I'd need to set the lhs in a different stmt (the new call - as the number of arguments grows), adjust SSA_NAME_DEF_STMT, remove the old call, etc. If there is a helper for all of that, why not, but update_call_from_tree looked like a helper which does that kind of thing. In particular I need something like new_stmt = gimple_build_call (fn, nops, ...); gimple_call_set_lhs (new_stmt, lhs); move_ssa_defining_stmt_for_defs (new_stmt, stmt); gimple_set_vuse (new_stmt, gimple_vuse (stmt)); gimple_set_vdef (new_stmt, gimple_vdef (stmt)); gimple_set_location (new_stmt, gimple_location (stmt)); gsi_replace (si_p, new_stmt, false); Jakub ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC, WIP] tree-ssa-strlen optimization pass 2011-09-05 9:41 ` Jakub Jelinek @ 2011-09-05 9:51 ` Richard Guenther 2011-09-05 18:45 ` Jakub Jelinek 0 siblings, 1 reply; 8+ messages in thread From: Richard Guenther @ 2011-09-05 9:51 UTC (permalink / raw) To: Jakub Jelinek; +Cc: gcc-patches On Mon, Sep 5, 2011 at 11:41 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Mon, Sep 05, 2011 at 10:54:47AM +0200, Richard Guenther wrote: >> > 2011-09-02 Jakub Jelinek <jakub@redhat.com> >> > >> > * common.opt: Add -ftree-strlen option. >> >> Maybe sth more generic? -foptimize-string-ops? Eventually guard >> the existing string op foldings with that flag as well. > > I don't think other string op foldings should be dependent on that flag, > those are already guarded with -fbuiltin-strcpy etc. I think we should have > a switch just for this pass, and IMHO -foptimize-string-ops is way too > generic, but of course it can be -ftrack-string-lengths or > -foptimize-using-string-lengths or similar. Ok, I guess we can bikeshed a bit about the flag ;) -ftree-strlen at least doesn't sound very informative (nor do I like the tree- prefix too much). -foptimize-strlen would be fine with me. >> > --- gcc/opts.c.jj 2011-06-30 17:58:03.000000000 +0200 >> > +++ gcc/opts.c 2011-09-02 15:53:06.000000000 +0200 >> > @@ -484,6 +484,7 @@ static const struct default_options defa >> > { OPT_LEVELS_2_PLUS, OPT_falign_jumps, NULL, 1 }, >> > { OPT_LEVELS_2_PLUS, OPT_falign_labels, NULL, 1 }, >> > { OPT_LEVELS_2_PLUS, OPT_falign_functions, NULL, 1 }, >> > + { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_ftree_strlen, NULL, 1 }, >> >> Why not -Os? Doesn't it remove strlen calls? > > Sometimes it does (though rarer than e.g. with -O2, because with -O2 we e.g. > fold strcat with constant literal as second argument to strlen plus memcpy, > but don't do that for -Os). On the other side it transforms strcpy calls > into memcpy (where the latter is usually longer), etc. And strcat is almost > always shorter too. So, an alternative would be for -Os enable this pass, > but only do strlen removal in it (in addition to tracking lengths), nothing > else. Even the strlen removal can grow the size if the string length > expression isn't constant, or variable, or arithmetic operation on two > operands. Hm, ok. I guess we can leave it for -O2+ for now. >> > +/* Array indexed by SSA_NAME_VERSION. 0 means unknown, positive value >> > + is an index into strinfo vector, negative value stands for >> > + string length of a string literal (~strlen). */ >> > +static int *ssa_ver_to_stridx; >> > + >> > +/* Number of currently active string indexes plus one. */ >> > +static int max_stridx; >> >> USe a VEC? > > For what? ssa_ver_to_stridx isn't growing, we shouldn't use it on the > SSA_NAMEs added during the pass. And max_stridx isn't the length of > the ssa_ver_to_stridx array, but how many string indexes are in use > (i.e. maximum current length of stridx_to_strinfo vector). Just for consistency across the compiler - I have no strong preferences, esp. if we don't grow it (but you'd get bounds checking, etc. for free ...) >> > +/* Free a decl_stridxlist_map. Callback for htab_delete. */ >> > + >> > +static void >> > +decl_to_stridxlist_free (void *item) >> > +{ >> > + struct stridxlist *next; >> > + struct stridxlist *list = ((struct decl_stridxlist_map *) item)->list.next; >> > + >> > + while (list) >> > + { >> > + next = list->next; >> > + XDELETE (list); >> > + list = next; >> > + } >> > + XDELETE (item); >> >> Maybe use an obstack or alloc-pool dependent on re-use? > > The number of those is limited by the string index limit, so it doesn't grow > without bounds. It isn't reused, so yes, obstack could be used and avoid > the above routine. If you prefer that, I'll change it. optimizing malloc/free calls is cheap, so yes, I think we should use an obstack here. >> > +/* Create a new string index, or return 0 if reached limit. */ >> > + >> > +static int >> > +new_stridx (tree exp) >> > +{ >> > + int idx; >> > + if (max_stridx == 1000) >> >> I suppose make this a #define or --param > > Yeah. I guess this one could be a --param. > >> > +static void >> > +find_equal_ptrs (tree ptr, int idx) >> > +{ >> > + if (TREE_CODE (ptr) != SSA_NAME) >> > + return; >> > + while (1) >> > + { >> > + gimple stmt = SSA_NAME_DEF_STMT (ptr); >> > + if (!gimple_assign_single_p (stmt) >> > + && (!gimple_assign_cast_p (stmt) >> > + || !POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))) >> > + return; >> >> I'd prefer postive checks to guard code below. So, you're handling >> SSA name copies, conversions from pointers and assignments from >> invariants. You could simply do >> >> if (!is+gimple_assign (stmt)) >> return; >> switch (gimple_assign_rhs_code (stmt)) >> { >> case SSA_NAME: >> .. >> case ADDR_EXPR: >> .. >> CASE_CONVERT: >> ... >> default: >> return; >> } > > Ok, will change that. > >> > + if (!update_call_from_tree (gsi, rhs)) >> > + { >> > + rhs = force_gimple_operand_gsi (gsi, rhs, true, NULL_TREE, >> > + true, GSI_SAME_STMT); >> > + if (!update_call_from_tree (gsi, rhs)) >> >> if update_call_from_tree fails then gimplify_and_update_call_from_tree >> will always succeed. See gimple_fold_call. > > I saw that call, I was worried that fold_all_builtins pass if > gimplify_and_update_call_from_tree is called forces > TODO_update_address_taken. rhs will be just a sum of some constants and/or > SSA_NAMEs though, and the call here is a builtin that ought to be > non-throwing. So do you think just > if (!update_call_from_tree > gimplify_and_update_call_from_tree > is safe for that case? Yes. No idea why we force TODO_update_address_taken here - I suppose it's because of memcpy folding exposing the possibility of removing some TREE_ADDRESSABLE bits. It's clearly for optimization only. >> > + len = fold_convert_loc (loc, type, si->length); >> > + len = fold_build2_loc (loc, PLUS_EXPR, type, len, build_int_cst (type, 1)); >> > + len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true, >> > + GSI_SAME_STMT); >> > + if (gimple_call_num_args (stmt) == 2) >> > + rhs = build_call_expr_loc (loc, fn, 3, dst, src, len); >> > + else >> > + rhs = build_call_expr_loc (loc, fn, 4, dst, src, len, >> > + gimple_call_arg (stmt, 2)); >> > + if (dump_file && (dump_flags & TDF_DETAILS) != 0) >> > + { >> > + fprintf (dump_file, "Optimizing: "); >> > + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); >> > + } >> > + if (update_call_from_tree (gsi, rhs)) >> >> Btw, it would be nice if you'd not use build_call_expr and then gimplify >> the call but instead build a new gimple call directly ... or modify it >> in-place. > > But then I'd need to set the lhs in a different stmt (the new call - as the > number of arguments grows), adjust SSA_NAME_DEF_STMT, remove the old call, etc. > If there is a helper for all of that, why not, but update_call_from_tree > looked like a helper which does that kind of thing. > In particular I need something like > new_stmt = gimple_build_call (fn, nops, ...); > gimple_call_set_lhs (new_stmt, lhs); > move_ssa_defining_stmt_for_defs (new_stmt, stmt); > gimple_set_vuse (new_stmt, gimple_vuse (stmt)); > gimple_set_vdef (new_stmt, gimple_vdef (stmt)); > gimple_set_location (new_stmt, gimple_location (stmt)); > gsi_replace (si_p, new_stmt, false); Yeah, I suppose update_call_from_tree could use a piecewise variant ... ok, let's defer this as a cleanup for whoever feels like updating some more code. Thanks, Richard. > Jakub > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC, WIP] tree-ssa-strlen optimization pass 2011-09-05 9:51 ` Richard Guenther @ 2011-09-05 18:45 ` Jakub Jelinek 0 siblings, 0 replies; 8+ messages in thread From: Jakub Jelinek @ 2011-09-05 18:45 UTC (permalink / raw) To: Richard Guenther; +Cc: gcc-patches [-- Attachment #1: Type: text/plain, Size: 892 bytes --] On Mon, Sep 05, 2011 at 11:50:52AM +0200, Richard Guenther wrote: > Yeah, I suppose update_call_from_tree could use a piecewise variant ... > ok, let's defer this as a cleanup for whoever feels like updating some > more code. Attached are two patches, the first one contains two small changes, one to make sure gimplify_and_update_call_from_tree doesn't need TODO_update_ssa and another which adds a function similar to update_call_from_tree, which just replaces one call with another one from given fn, nargs and the arguments passed to ... plus new functions this new function needs. This patch can be applied independently, has been bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? The second patch is updated version of tree-ssa-strlen.c pass, with incorporated feedback from you, but so far no further functional changes (which I'd still like to work on). Jakub [-- Attachment #2: X239 --] [-- Type: text/plain, Size: 6540 bytes --] 2011-09-05 Jakub Jelinek <jakub@redhat.com> * gimple-fold.c (gimplify_and_update_call_from_tree): Set gctx.into_ssa after push_gimplify_context. * gimple.c (gimple_build_call_valist): New function. * gimple.h (gimple_build_call_valist): New prototype. * tree-ssa-propagate.c (finish_update_gimple_call): New function. (update_gimple_call): Likewise. (update_call_from_tree): Use finish_update_gimple_call. * tree-ssa-propagate.h (update_gimple_call): New prototype. --- gcc/gimple-fold.c.jj 2011-09-02 16:29:39.000000000 +0200 +++ gcc/gimple-fold.c 2011-09-05 14:51:02.000000000 +0200 @@ -551,6 +551,7 @@ gimplify_and_update_call_from_tree (gimp reaching_vuse = gimple_vuse (stmt); push_gimplify_context (&gctx); + gctx.into_ssa = gimple_in_ssa_p (cfun); if (lhs == NULL_TREE) { --- gcc/gimple.c.jj 2011-09-02 16:29:39.000000000 +0200 +++ gcc/gimple.c 2011-09-05 14:55:34.000000000 +0200 @@ -215,9 +215,10 @@ gimple_call_reset_alias_info (gimple s) pt_solution_reset (gimple_call_clobber_set (s)); } -/* Helper for gimple_build_call, gimple_build_call_vec and - gimple_build_call_from_tree. Build the basic components of a - GIMPLE_CALL statement to function FN with NARGS arguments. */ +/* Helper for gimple_build_call, gimple_build_call_valist, + gimple_build_call_vec and gimple_build_call_from_tree. Build the basic + components of a GIMPLE_CALL statement to function FN with NARGS + arguments. */ static inline gimple gimple_build_call_1 (tree fn, unsigned nargs) @@ -272,6 +273,26 @@ gimple_build_call (tree fn, unsigned nar } +/* Build a GIMPLE_CALL statement to function FN. NARGS is the number of + arguments. AP contains the arguments. */ + +gimple +gimple_build_call_valist (tree fn, unsigned nargs, va_list ap) +{ + gimple call; + unsigned i; + + gcc_assert (TREE_CODE (fn) == FUNCTION_DECL || is_gimple_call_addr (fn)); + + call = gimple_build_call_1 (fn, nargs); + + for (i = 0; i < nargs; i++) + gimple_call_set_arg (call, i, va_arg (ap, tree)); + + return call; +} + + /* Helper for gimple_build_call_internal and gimple_build_call_internal_vec. Build the basic components of a GIMPLE_CALL statement to internal function FN with NARGS arguments. */ --- gcc/gimple.h.jj 2011-09-02 16:29:39.000000000 +0200 +++ gcc/gimple.h 2011-09-05 14:56:15.000000000 +0200 @@ -831,6 +831,7 @@ gimple gimple_build_debug_source_bind_st gimple gimple_build_call_vec (tree, VEC(tree, heap) *); gimple gimple_build_call (tree, unsigned, ...); +gimple gimple_build_call_valist (tree, unsigned, va_list); gimple gimple_build_call_internal (enum internal_fn, unsigned, ...); gimple gimple_build_call_internal_vec (enum internal_fn, VEC(tree, heap) *); gimple gimple_build_call_from_tree (tree); --- gcc/tree-ssa-propagate.c.jj 2011-07-22 22:15:02.000000000 +0200 +++ gcc/tree-ssa-propagate.c 2011-09-05 15:08:45.000000000 +0200 @@ -1,5 +1,5 @@ /* Generic SSA value propagation engine. - Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010 + Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc. Contributed by Diego Novillo <dnovillo@redhat.com> @@ -673,6 +673,38 @@ move_ssa_defining_stmt_for_defs (gimple } } +/* Helper function for update_gimple_call and update_call_from_tree. + A GIMPLE_CALL STMT is being replaced with GIMPLE_CALL NEW_STMT. */ + +static void +finish_update_gimple_call (gimple_stmt_iterator *si_p, gimple new_stmt, + gimple stmt) +{ + gimple_call_set_lhs (new_stmt, gimple_call_lhs (stmt)); + move_ssa_defining_stmt_for_defs (new_stmt, stmt); + gimple_set_vuse (new_stmt, gimple_vuse (stmt)); + gimple_set_vdef (new_stmt, gimple_vdef (stmt)); + gimple_set_location (new_stmt, gimple_location (stmt)); + gsi_replace (si_p, new_stmt, false); +} + +/* Update a GIMPLE_CALL statement at iterator *SI_P to call to FN + with number of arguments NARGS, where the arguments in GIMPLE form + follow NARGS argument. */ + +bool +update_gimple_call (gimple_stmt_iterator *si_p, tree fn, int nargs, ...) +{ + va_list ap; + gimple new_stmt, stmt = gsi_stmt (*si_p); + + gcc_assert (is_gimple_call (stmt)); + va_start (ap, nargs); + new_stmt = gimple_build_call_valist (fn, nargs, ap); + finish_update_gimple_call (si_p, new_stmt, stmt); + va_end (ap); + return true; +} /* Update a GIMPLE_CALL statement at iterator *SI_P to reflect the value of EXPR, which is expected to be the result of folding the @@ -689,14 +721,8 @@ move_ssa_defining_stmt_for_defs (gimple bool update_call_from_tree (gimple_stmt_iterator *si_p, tree expr) { - tree lhs; - gimple stmt = gsi_stmt (*si_p); - gcc_assert (is_gimple_call (stmt)); - - lhs = gimple_call_lhs (stmt); - if (valid_gimple_call_p (expr)) { /* The call has simplified to another call. */ @@ -716,18 +742,14 @@ update_call_from_tree (gimple_stmt_itera } new_stmt = gimple_build_call_vec (fn, args); - gimple_call_set_lhs (new_stmt, lhs); - move_ssa_defining_stmt_for_defs (new_stmt, stmt); - gimple_set_vuse (new_stmt, gimple_vuse (stmt)); - gimple_set_vdef (new_stmt, gimple_vdef (stmt)); - gimple_set_location (new_stmt, gimple_location (stmt)); - gsi_replace (si_p, new_stmt, false); + finish_update_gimple_call (si_p, new_stmt, stmt); VEC_free (tree, heap, args); return true; } else if (valid_gimple_rhs_p (expr)) { + tree lhs = gimple_call_lhs (stmt); gimple new_stmt; /* The call has simplified to an expression --- gcc/tree-ssa-propagate.h.jj 2010-08-11 21:08:06.000000000 +0200 +++ gcc/tree-ssa-propagate.h 2011-09-05 15:09:26.000000000 +0200 @@ -1,6 +1,7 @@ /* Data structures and function declarations for the SSA value propagation engine. - Copyright (C) 2004, 2005, 2007, 2008, 2010 Free Software Foundation, Inc. + Copyright (C) 2004, 2005, 2007, 2008, 2010, 2011 + Free Software Foundation, Inc. Contributed by Diego Novillo <dnovillo@redhat.com> This file is part of GCC. @@ -72,6 +73,7 @@ typedef tree (*ssa_prop_get_value_fn) (t void ssa_propagate (ssa_prop_visit_stmt_fn, ssa_prop_visit_phi_fn); bool valid_gimple_rhs_p (tree); void move_ssa_defining_stmt_for_defs (gimple, gimple); +bool update_gimple_call (gimple_stmt_iterator *, tree, int, ...); bool update_call_from_tree (gimple_stmt_iterator *, tree); bool stmt_makes_single_store (gimple); bool substitute_and_fold (ssa_prop_get_value_fn, ssa_prop_fold_stmt_fn, bool); [-- Attachment #3: X236m --] [-- Type: text/plain, Size: 47972 bytes --] 2011-09-05 Jakub Jelinek <jakub@redhat.com> * common.opt: Add -foptimize-strlen option. * Makefile.in (OBJS): Add tree-ssa-strlen.o. (tree-sssa-strlen.o): Add dependencies. * opts.c (default_options_table): Enable -foptimize-strlen by default at -O2 if not -Os. * passes.c (init_optimization_passes): Add pass_strlen after pass_object_sizes. * timevar.def (TV_TREE_STRLEN): New timevar. * params.def (PARAM_MAX_TRACKED_STRLENS): New parameter. * tree-pass.h (pass_strlen): Declare. * tree-ssa-strlen.c: New file. * gcc.dg/strlenopt-1.c: New test. * gcc.dg/strlenopt-2.c: New test. * gcc.dg/strlenopt-3.c: New test. * gcc.dg/strlenopt.h: New file. --- gcc/common.opt.jj 2011-09-02 16:29:20.000000000 +0200 +++ gcc/common.opt 2011-09-05 13:10:53.000000000 +0200 @@ -1953,6 +1953,10 @@ ftree-fre Common Report Var(flag_tree_fre) Optimization Enable Full Redundancy Elimination (FRE) on trees +foptimize-strlen +Common Report Var(flag_optimize_strlen) Optimization +Enable string length optimizations on trees + ftree-loop-distribution Common Report Var(flag_tree_loop_distribution) Optimization Enable loop distribution on trees --- gcc/Makefile.in.jj 2011-09-02 16:29:39.000000000 +0200 +++ gcc/Makefile.in 2011-09-05 14:12:21.000000000 +0200 @@ -1472,6 +1472,7 @@ OBJS = \ tree-ssa-reassoc.o \ tree-ssa-sccvn.o \ tree-ssa-sink.o \ + tree-ssa-strlen.o \ tree-ssa-structalias.o \ tree-ssa-ter.o \ tree-ssa-threadedge.o \ @@ -3157,6 +3158,9 @@ tree-ssa-ccp.o : tree-ssa-ccp.c $(TREE_F $(TREE_DUMP_H) $(BASIC_BLOCK_H) $(TREE_PASS_H) langhooks.h $(PARAMS_H) \ tree-ssa-propagate.h value-prof.h $(FLAGS_H) $(TARGET_H) $(DIAGNOSTIC_CORE_H) \ $(DBGCNT_H) tree-pretty-print.h gimple-pretty-print.h gimple-fold.h +tree-ssa-strlen.o : tree-ssa-strlen.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ + $(TREE_FLOW_H) $(TREE_PASS_H) domwalk.h alloc-pool.h tree-ssa-propagate.h \ + gimple-pretty-print.h $(PARAMS_H) tree-sra.o : tree-sra.c $(CONFIG_H) $(SYSTEM_H) coretypes.h alloc-pool.h \ $(TM_H) $(TREE_H) $(GIMPLE_H) $(CGRAPH_H) $(TREE_FLOW_H) \ $(IPA_PROP_H) $(DIAGNOSTIC_H) statistics.h $(TREE_DUMP_H) $(TIMEVAR_H) \ --- gcc/opts.c.jj 2011-09-05 12:28:54.000000000 +0200 +++ gcc/opts.c 2011-09-05 13:10:53.000000000 +0200 @@ -484,6 +484,7 @@ static const struct default_options defa { OPT_LEVELS_2_PLUS, OPT_falign_jumps, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_falign_labels, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_falign_functions, NULL, 1 }, + { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_foptimize_strlen, NULL, 1 }, /* -O3 optimizations. */ { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 }, --- gcc/params.def.jj 2011-07-27 23:25:36.000000000 +0200 +++ gcc/params.def 2011-09-05 14:28:31.000000000 +0200 @@ -914,6 +914,14 @@ DEFPARAM (PARAM_ALLOW_STORE_DATA_RACES, "Allow new data races on stores to be introduced", 1, 0, 1) +/* Maximum number of strings for which strlen optimization pass will + track string lenths. */ +DEFPARAM (PARAM_MAX_TRACKED_STRLENS, + "max-tracked-strlens", + "Maximum number of strings for which strlen optimization pass will " + "track string lengths", + 1000, 0, 0) + /* Local variables: --- gcc/passes.c.jj 2011-09-02 16:29:20.000000000 +0200 +++ gcc/passes.c 2011-09-05 13:10:53.000000000 +0200 @@ -1,7 +1,7 @@ /* Top level of GCC compilers (cc1, cc1plus, etc.) Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998, - 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 - Free Software Foundation, Inc. + 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, + 2011 Free Software Foundation, Inc. This file is part of GCC. @@ -1321,6 +1321,7 @@ init_optimization_passes (void) NEXT_PASS (pass_forwprop); NEXT_PASS (pass_phiopt); NEXT_PASS (pass_object_sizes); + NEXT_PASS (pass_strlen); NEXT_PASS (pass_ccp); NEXT_PASS (pass_copy_prop); NEXT_PASS (pass_cse_sincos); --- gcc/timevar.def.jj 2011-09-02 16:29:20.000000000 +0200 +++ gcc/timevar.def 2011-09-05 13:10:53.000000000 +0200 @@ -1,7 +1,7 @@ /* This file contains the definitions for timing variables used to measure run-time performance of the compiler. Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, - 2009, 2010 + 2009, 2010, 2011 Free Software Foundation, Inc. Contributed by Alex Samuel <samuel@codesourcery.com> @@ -183,6 +183,7 @@ DEFTIMEVAR (TV_TREE_COPY_RENAME , " DEFTIMEVAR (TV_TREE_SSA_VERIFY , "tree SSA verifier") DEFTIMEVAR (TV_TREE_STMT_VERIFY , "tree STMT verifier") DEFTIMEVAR (TV_TREE_SWITCH_CONVERSION, "tree switch initialization conversion") +DEFTIMEVAR (TV_TREE_STRLEN , "tree strlen optimization") DEFTIMEVAR (TV_CGRAPH_VERIFY , "callgraph verifier") DEFTIMEVAR (TV_DOM_FRONTIERS , "dominance frontiers") DEFTIMEVAR (TV_DOMINANCE , "dominance computation") --- gcc/tree-pass.h.jj 2011-09-02 16:29:20.000000000 +0200 +++ gcc/tree-pass.h 2011-09-05 13:10:53.000000000 +0200 @@ -412,6 +412,7 @@ extern struct gimple_opt_pass pass_diagn extern struct gimple_opt_pass pass_expand_omp; extern struct gimple_opt_pass pass_expand_omp_ssa; extern struct gimple_opt_pass pass_object_sizes; +extern struct gimple_opt_pass pass_strlen; extern struct gimple_opt_pass pass_fold_builtins; extern struct gimple_opt_pass pass_stdarg; extern struct gimple_opt_pass pass_early_warn_uninitialized; --- gcc/tree-ssa-strlen.c.jj 2011-09-05 13:10:53.000000000 +0200 +++ gcc/tree-ssa-strlen.c 2011-09-05 15:29:05.000000000 +0200 @@ -0,0 +1,1280 @@ +/* String length optimization + Copyright (C) 2011 Free Software Foundation, Inc. + Contributed by Jakub Jelinek <jakub@redhat.com> + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tree-flow.h" +#include "tree-pass.h" +#include "domwalk.h" +#include "alloc-pool.h" +#include "tree-ssa-propagate.h" +#include "gimple-pretty-print.h" +#include "params.h" + +/* A vector indexed by SSA_NAME_VERSION. 0 means unknown, positive value + is an index into strinfo vector, negative value stands for + string length of a string literal (~strlen). */ +static VEC (int, heap) *ssa_ver_to_stridx; + +/* Number of currently active string indexes plus one. */ +static int max_stridx; + +/* String information record. */ +typedef struct strinfo_struct +{ + /* String length of this string. */ + tree length; + /* Any of the corresponding pointers for querying alias oracle. */ + tree ptr; + /* Reference count. Any changes to strinfo entry possibly shared + with dominating basic blocks need unshare_strinfo first, except + for dont_invalidate which affects only the immediately next + maybe_invalidate. */ + int refcount; + /* Copy of index. get_strinfo (si->idx) should return si; */ + int idx; + /* These 3 fields are for chaining related string pointers together. + E.g. for + bl = strlen (b); dl = strlen (d); strcpy (a, b); c = a + bl; + strcpy (c, d); e = c + dl; + strinfo(a) -> strinfo(c) -> strinfo(e) + All have ->first field equal to strinfo(a)->idx and are doubly + chained through prev/next fields. The later strinfos are required + to point into the same string with zero or more bytes after + the previous pointer and all bytes in between the two pointers + must be non-zero. Functions like strcpy or memcpy are supposed + to adjust all previous strinfo lengths, but not following strinfo + lengths (those are uncertain, usually invalidated during + maybe_invalidate, except when the alias oracle knows better). + Functions like strcat on the other side adjust the whole + related strinfo chain. + They are updated lazily, so to use the chain the same first fields + and si->prev->next == si->idx needs to be verified. */ + int first; + int next; + int prev; + /* A flag for the next maybe_invalidate that this strinfo shouldn't + be invalidated. Always cleared by maybe_invalidate. */ + bool dont_invalidate; +} *strinfo; +DEF_VEC_P(strinfo); +DEF_VEC_ALLOC_P(strinfo,heap); + +/* Pool for allocating strinfo_struct entries. */ +static alloc_pool strinfo_pool; + +/* Vector mapping positive string indexes to strinfo, for the + current basic block. The first pointer in the vector is special, + it is either NULL, meaning the vector isn't shared, or it is + a basic block pointer to the owner basic_block if shared. + If some other bb wants to modify the vector, the vector needs + to be unshared first, and only the owner bb is supposed to free it. */ +static VEC(strinfo, heap) *stridx_to_strinfo; + +/* One OFFSET->IDX mapping. */ +struct stridxlist +{ + struct stridxlist *next; + HOST_WIDE_INT offset; + int idx; +}; + +/* Hash table entry, mapping a DECL to a chain of OFFSET->IDX mappings. */ +struct decl_stridxlist_map +{ + struct tree_map_base base; + struct stridxlist list; +}; + +/* Hash table for mapping decls to a chained list of offset -> idx + mappings. */ +static htab_t decl_to_stridxlist_htab; + +/* Obstack for struct stridxlist and struct decl_stridxlist_map. */ +static struct obstack stridx_obstack; + +/* Hash a from tree in a decl_stridxlist_map. */ + +static unsigned int +decl_to_stridxlist_hash (const void *item) +{ + return DECL_UID (((const struct decl_stridxlist_map *) item)->base.from); +} + +/* Return string index for EXP. */ + +static int +get_stridx (tree exp) +{ + tree l; + + if (TREE_CODE (exp) == SSA_NAME) + return VEC_index (int, ssa_ver_to_stridx, SSA_NAME_VERSION (exp)); + + if (TREE_CODE (exp) == ADDR_EXPR && decl_to_stridxlist_htab) + { + HOST_WIDE_INT off; + tree base = get_addr_base_and_unit_offset (TREE_OPERAND (exp, 0), + &off); + if (base && DECL_P (base)) + { + struct decl_stridxlist_map ent, *e; + ent.base.from = base; + e = (struct decl_stridxlist_map *) + htab_find_with_hash (decl_to_stridxlist_htab, &ent, + DECL_UID (base)); + if (e) + { + struct stridxlist *list = &e->list; + do + { + if (list->offset == off) + return list->idx; + list = list->next; + } + while (list); + } + } + } + + l = c_strlen (exp, 0); + if (l != NULL_TREE + && host_integerp (l, 1)) + { + unsigned HOST_WIDE_INT len = tree_low_cst (l, 1); + if (len == (unsigned int) len + && (int) len >= 0) + return ~(int) len; + } + return 0; +} + +/* Return true if strinfo vector is shared with the immediate dominator. */ + +static inline bool +strinfo_shared (void) +{ + return VEC_length (strinfo, stridx_to_strinfo) + && VEC_index (strinfo, stridx_to_strinfo, 0) != NULL; +} + +/* Unshare strinfo vector that is shared with the immediate dominator. */ + +static void +unshare_strinfo_vec (void) +{ + strinfo si; + unsigned int i = 0; + + gcc_assert (strinfo_shared ()); + stridx_to_strinfo = VEC_copy (strinfo, heap, stridx_to_strinfo); + for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i) + if (si != NULL) + si->refcount++; + VEC_replace (strinfo, stridx_to_strinfo, 0, NULL); +} + +/* Attempt to create a string index for ADDR_EXPR exp. + Return a pointer to the location where the string index can + be stored (if 0) or is stored, or NULL if this can't be tracked. */ + +static int * +addr_stridxptr (tree exp) +{ + void **slot; + struct decl_stridxlist_map ent; + struct stridxlist *list; + HOST_WIDE_INT off; + + tree base = get_addr_base_and_unit_offset (TREE_OPERAND (exp, 0), &off); + if (base == NULL_TREE || !DECL_P (base)) + return NULL; + + if (decl_to_stridxlist_htab == NULL) + { + decl_to_stridxlist_htab + = htab_create (64, decl_to_stridxlist_hash, tree_map_base_eq, NULL); + gcc_obstack_init (&stridx_obstack); + } + ent.base.from = base; + slot = htab_find_slot_with_hash (decl_to_stridxlist_htab, &ent, + DECL_UID (base), INSERT); + if (*slot) + { + int i; + list = &((struct decl_stridxlist_map *)*slot)->list; + for (i = 0; i < 16; i++) + { + if (list->offset == off) + return &list->idx; + if (list->next == NULL) + break; + } + if (i == 16) + return NULL; + list->next = XOBNEW (&stridx_obstack, struct stridxlist); + list = list->next; + } + else + { + struct decl_stridxlist_map *e + = XOBNEW (&stridx_obstack, struct decl_stridxlist_map); + e->base.from = base; + *slot = (void *) e; + list = &e->list; + } + list->next = NULL; + list->offset = off; + list->idx = 0; + return &list->idx; +} + +/* Create a new string index, or return 0 if reached limit. */ + +static int +new_stridx (tree exp) +{ + int idx; + if (max_stridx >= PARAM_VALUE (PARAM_MAX_TRACKED_STRLENS)) + return 0; + if (TREE_CODE (exp) == SSA_NAME) + { + idx = max_stridx++; + VEC_replace (int, ssa_ver_to_stridx, SSA_NAME_VERSION (exp), idx); + return idx; + } + if (TREE_CODE (exp) == ADDR_EXPR) + { + int *pidx = addr_stridxptr (exp); + if (pidx != NULL) + { + gcc_assert (*pidx == 0); + *pidx = max_stridx++; + return *pidx; + } + } + return 0; +} + +/* Create a new strinfo. */ + +static strinfo +new_strinfo (tree ptr, int idx, tree length) +{ + strinfo si = (strinfo) pool_alloc (strinfo_pool); + si->length = length; + si->ptr = ptr; + si->refcount = 1; + si->idx = idx; + si->first = 0; + si->prev = 0; + si->next = 0; + si->dont_invalidate = false; + return si; +} + +/* Decrease strinfo refcount and free it if not referenced anymore. */ + +static inline void +free_strinfo (strinfo si) +{ + if (si && --si->refcount == 0) + pool_free (strinfo_pool, si); +} + +/* Return strinfo vector entry IDX. */ + +static inline strinfo +get_strinfo (int idx) +{ + if (VEC_length (strinfo, stridx_to_strinfo) <= (unsigned int) idx) + return NULL; + return VEC_index (strinfo, stridx_to_strinfo, idx); +} + +/* Set strinfo in the vector entry IDX to SI. */ + +static inline void +set_strinfo (int idx, strinfo si) +{ + if (VEC_length (strinfo, stridx_to_strinfo) && VEC_index (strinfo, stridx_to_strinfo, 0)) + unshare_strinfo_vec (); + if (VEC_length (strinfo, stridx_to_strinfo) <= (unsigned int) idx) + VEC_safe_grow_cleared (strinfo, heap, stridx_to_strinfo, idx + 1); + VEC_replace (strinfo, stridx_to_strinfo, idx, si); +} + +/* Invalidate string length information for strings whose length + might change due to stores in stmt. */ + +static bool +maybe_invalidate (gimple stmt) +{ + strinfo si; + unsigned int i; + bool nonempty = false; + + for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i) + if (si != NULL) + { + if (!si->dont_invalidate) + { + ao_ref r; + ao_ref_init_from_ptr_and_size (&r, si->ptr, NULL_TREE); + if (stmt_may_clobber_ref_p_1 (stmt, &r)) + { + set_strinfo (i, NULL); + free_strinfo (si); + continue; + } + } + si->dont_invalidate = false; + nonempty = true; + } + return nonempty; +} + +/* Unshare strinfo record SI, if it has recount > 1 or + if stridx_to_strinfo vector is shared with some other + bbs. */ + +static strinfo +unshare_strinfo (strinfo si) +{ + strinfo nsi; + + if (si->refcount == 1 && !strinfo_shared ()) + return si; + + nsi = new_strinfo (si->ptr, si->idx, si->length); + nsi->first = si->first; + nsi->prev = si->prev; + nsi->next = si->next; + set_strinfo (si->idx, nsi); + free_strinfo (si); + return nsi; +} + +/* Return first strinfo in the related strinfo chain + if all strinfos in between belong to the chain, otherwise + NULL. */ + +static strinfo +verify_related_strinfos (strinfo origsi) +{ + strinfo si = origsi, psi; + + if (origsi->first == 0) + return NULL; + for (; si->prev; si = psi) + { + if (si->first != origsi->first) + return NULL; + psi = get_strinfo (si->prev); + if (psi == NULL) + return NULL; + if (psi->next != si->idx) + return NULL; + } + if (si->idx != si->first) + return NULL; + return si; +} + +/* Note that PTR, a pointer SSA_NAME initialized in the current stmt, points + to a zero-length string and if possible chain it to a related strinfo + chain whose part is or might be CHAINSI. */ + +static strinfo +zero_length_string (tree ptr, strinfo chainsi) +{ + strinfo si; + int idx; + gcc_checking_assert (TREE_CODE (ptr) == SSA_NAME + && get_stridx (ptr) == 0); + + if (chainsi != NULL) + { + if (verify_related_strinfos (chainsi)) + { + for (; chainsi->next; chainsi = si) + { + si = get_strinfo (chainsi->next); + if (si == NULL + || si->first != chainsi->first + || si->prev != chainsi->idx) + break; + } + if (integer_zerop (chainsi->length)) + { + if (chainsi->next) + { + chainsi = unshare_strinfo (chainsi); + chainsi->next = 0; + } + VEC_replace (int, ssa_ver_to_stridx, SSA_NAME_VERSION (ptr), + chainsi->idx); + return chainsi; + } + } + else if (chainsi->first || chainsi->prev || chainsi->next) + { + chainsi = unshare_strinfo (chainsi); + chainsi->first = 0; + chainsi->prev = 0; + chainsi->next = 0; + } + } + idx = new_stridx (ptr); + if (idx == 0) + return NULL; + si = new_strinfo (ptr, idx, build_int_cst (size_type_node, 0)); + set_strinfo (idx, si); + if (chainsi != NULL) + { + chainsi = unshare_strinfo (chainsi); + if (chainsi->first == 0) + chainsi->first = chainsi->idx; + chainsi->next = idx; + si->prev = chainsi->idx; + si->first = chainsi->first; + } + return si; +} + +/* For strinfo ORIGSI whose length has been just updated + update also related strinfo lengths (add ADJ to each, + but don't adjust ORIGSI). */ + +static void +adjust_related_strinfos (location_t loc, strinfo origsi, tree adj) +{ + strinfo si = verify_related_strinfos (origsi); + + if (si == NULL) + return; + + while (1) + { + strinfo nsi; + + if (si != origsi) + { + tree tem; + + si = unshare_strinfo (si); + tem = fold_convert_loc (loc, TREE_TYPE (si->length), adj); + si->length = fold_build2_loc (loc, PLUS_EXPR, + TREE_TYPE (si->length), si->length, + tem); + si->dont_invalidate = true; + } + if (si->next == 0) + return; + nsi = get_strinfo (si->next); + if (nsi == NULL + || nsi->first != si->first + || nsi->prev != si->idx) + return; + si = nsi; + } +} + +/* Find if there are other SSA_NAME pointers equal to PTR + for which we don't track their string lengths yet. If so, use + IDX for them. */ + +static void +find_equal_ptrs (tree ptr, int idx) +{ + if (TREE_CODE (ptr) != SSA_NAME) + return; + while (1) + { + gimple stmt = SSA_NAME_DEF_STMT (ptr); + if (!is_gimple_assign (stmt)) + return; + ptr = gimple_assign_rhs1 (stmt); + switch (gimple_assign_rhs_code (stmt)) + { + case SSA_NAME: + break; + case ADDR_EXPR: + { + int *pidx = addr_stridxptr (ptr); + if (pidx != NULL && *pidx == 0) + *pidx = idx; + return; + } + CASE_CONVERT: + if (POINTER_TYPE_P (TREE_TYPE (ptr))) + break; + return; + default: + return; + } + if (VEC_index (int, ssa_ver_to_stridx, SSA_NAME_VERSION (ptr)) != 0) + return; + VEC_replace (int, ssa_ver_to_stridx, SSA_NAME_VERSION (ptr), idx); + } +} + +/* Handle a strlen call. If strlen of the argument is known, replace + the strlen call with the known value, otherwise remember that strlen + of the argument is stored in the lhs SSA_NAME. */ + +static void +handle_builtin_strlen (gimple_stmt_iterator *gsi) +{ + int idx; + tree src; + gimple stmt = gsi_stmt (*gsi); + tree lhs = gimple_call_lhs (stmt); + + if (lhs == NULL_TREE) + return; + + src = gimple_call_arg (stmt, 0); + idx = get_stridx (src); + if (idx) + { + if (idx < 0) + { + tree rhs = build_int_cst (TREE_TYPE (lhs), ~idx); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "Optimizing: "); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } + if (update_call_from_tree (gsi, rhs)) + { + update_stmt (gsi_stmt (*gsi)); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "into: "); + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM); + } + } + else if (dump_file && (dump_flags & TDF_DETAILS) != 0) + fprintf (dump_file, "not possible.\n"); + } + else + { + strinfo si = get_strinfo (idx); + if (si != NULL) + { + tree rhs = si->length; + if (!useless_type_conversion_p (TREE_TYPE (lhs), + TREE_TYPE (rhs))) + rhs = fold_convert_loc (gimple_location (stmt), + TREE_TYPE (lhs), si->length); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "Optimizing: "); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } + if (!update_call_from_tree (gsi, rhs)) + gimplify_and_update_call_from_tree (gsi, rhs); + update_stmt (gsi_stmt (*gsi)); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "into: "); + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM); + } + if (TREE_CODE (si->length) != SSA_NAME + && TREE_CODE (si->length) != INTEGER_CST + && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) + { + si = unshare_strinfo (si); + si->length = lhs; + } + } + } + } + else if (!SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) + { + idx = new_stridx (src); + if (idx) + { + strinfo si = new_strinfo (src, idx, lhs); + set_strinfo (idx, si); + find_equal_ptrs (src, idx); + } + } +} + +/* Handle a strcpy-like ({st{r,p}cpy,__st{r,p}cpy_chk}) call. + If strlen of the second argument is known, strlen of the first argument + is the same after this call. Furthermore, attempt to convert it to + memcpy. */ + +static void +handle_builtin_strcpy (enum built_in_function bcode, gimple_stmt_iterator *gsi) +{ + int idx, didx; + tree src, dst, len, lhs, args, type, fn, oldlen; + bool success; + gimple stmt = gsi_stmt (*gsi); + strinfo si, dsi, olddsi, zsi; + location_t loc; + + src = gimple_call_arg (stmt, 1); + dst = gimple_call_arg (stmt, 0); + idx = get_stridx (src); + if (idx <= 0) + return; + + si = get_strinfo (idx); + if (si == NULL) + return; + + didx = get_stridx (dst); + olddsi = NULL; + oldlen = NULL_TREE; + if (didx > 0) + olddsi = get_strinfo (didx); + else if (didx < 0) + return; + else + { + didx = new_stridx (dst); + if (didx == 0) + return; + } + if (olddsi != NULL) + { + oldlen = olddsi->length; + dsi = unshare_strinfo (olddsi); + dsi->length = si->length; + /* Break the chain, so adjust_related_strinfo on later pointers in + the chain won't adjust this one anymore. */ + dsi->next = 0; + } + else + { + dsi = new_strinfo (dst, didx, si->length); + set_strinfo (didx, dsi); + find_equal_ptrs (dst, didx); + } + dsi->dont_invalidate = true; + loc = gimple_location (stmt); + if (olddsi != NULL) + { + tree adj = NULL_TREE; + if (integer_zerop (oldlen)) + adj = si->length; + else if (TREE_CODE (oldlen) == INTEGER_CST + || TREE_CODE (si->length) == INTEGER_CST) + adj = fold_build2_loc (loc, MINUS_EXPR, + TREE_TYPE (si->length), si->length, + fold_convert_loc (loc, TREE_TYPE (si->length), + oldlen)); + if (adj != NULL_TREE) + adjust_related_strinfos (loc, dsi, adj); + } + /* strcpy src may not overlap dst, so src doesn't need to be + invalidated either. */ + si->dont_invalidate = true; + + lhs = gimple_call_lhs (stmt); + fn = NULL_TREE; + zsi = NULL; + switch (bcode) + { + case BUILT_IN_STRCPY: + fn = implicit_built_in_decls[BUILT_IN_MEMCPY]; + if (lhs) + VEC_replace (int, ssa_ver_to_stridx, SSA_NAME_VERSION (lhs), didx); + break; + case BUILT_IN_STRCPY_CHK: + fn = built_in_decls[BUILT_IN_MEMCPY_CHK]; + if (lhs) + VEC_replace (int, ssa_ver_to_stridx, SSA_NAME_VERSION (lhs), didx); + break; + case BUILT_IN_STPCPY: + /* This would need adjustment of the lhs (subtract one), + or detection that the trailing '\0' doesn't need to be + written, if it will be immediately overwritten. + fn = built_in_decls[BUILT_IN_MEMPCPY]; */ + if (lhs) + zsi = zero_length_string (lhs, dsi); + break; + case BUILT_IN_STPCPY_CHK: + /* This would need adjustment of the lhs (subtract one), + or detection that the trailing '\0' doesn't need to be + written, if it will be immediately overwritten. + fn = built_in_decls[BUILT_IN_MEMPCPY_CHK]; */ + if (lhs) + zsi = zero_length_string (lhs, dsi); + break; + default: + gcc_unreachable (); + } + if (zsi != NULL) + zsi->dont_invalidate = true; + + if (fn == NULL_TREE) + return; + + args = TYPE_ARG_TYPES (TREE_TYPE (fn)); + type = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (args))); + + len = fold_convert_loc (loc, type, si->length); + len = fold_build2_loc (loc, PLUS_EXPR, type, len, build_int_cst (type, 1)); + len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true, + GSI_SAME_STMT); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "Optimizing: "); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } + if (gimple_call_num_args (stmt) == 2) + success = update_gimple_call (gsi, fn, 3, dst, src, len); + else + success = update_gimple_call (gsi, fn, 4, dst, src, len, + gimple_call_arg (stmt, 2)); + if (success) + { + update_stmt (gsi_stmt (*gsi)); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "into: "); + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM); + } + } + else if (dump_file && (dump_flags & TDF_DETAILS) != 0) + fprintf (dump_file, "not possible.\n"); +} + +/* Handle a memcpy-like ({mem{,p}cpy,__mem{,p}cpy_chk}) call. + If strlen of the second argument is known and length of the third argument + is that plus one, strlen of the first argument is the same after this + call. */ + +static void +handle_builtin_memcpy (enum built_in_function bcode, gimple_stmt_iterator *gsi) +{ + int idx, didx; + tree src, dst, len, lhs, oldlen, newlen; + gimple stmt = gsi_stmt (*gsi); + strinfo si, dsi, olddsi; + + len = gimple_call_arg (stmt, 2); + src = gimple_call_arg (stmt, 1); + dst = gimple_call_arg (stmt, 0); + idx = get_stridx (src); + if (idx == 0) + return; + + if (idx > 0) + { + gimple def_stmt; + + /* Handle memcpy (x, y, l) where l is strlen (y) + 1. */ + si = get_strinfo (idx); + if (si == NULL) + return; + if (TREE_CODE (len) != SSA_NAME) + return; + def_stmt = SSA_NAME_DEF_STMT (len); + if (!is_gimple_assign (def_stmt) + || gimple_assign_rhs_code (def_stmt) != PLUS_EXPR + || gimple_assign_rhs1 (def_stmt) != si->length + || !integer_onep (gimple_assign_rhs2 (def_stmt))) + return; + } + else + { + si = NULL; + /* Handle memcpy (x, "abcd", 5) or + memcpy (x, "abc\0uvw", 7). */ + if (!host_integerp (len, 1) + || (unsigned HOST_WIDE_INT) tree_low_cst (len, 1) + <= (unsigned HOST_WIDE_INT) ~idx) + return; + } + + didx = get_stridx (dst); + olddsi = NULL; + if (didx > 0) + olddsi = get_strinfo (didx); + else if (didx < 0) + return; + else + { + didx = new_stridx (dst); + if (didx == 0) + return; + } + if (si != NULL) + newlen = si->length; + else + newlen = build_int_cst (TREE_TYPE (len), ~idx); + oldlen = NULL_TREE; + if (olddsi != NULL) + { + dsi = unshare_strinfo (olddsi); + oldlen = olddsi->length; + dsi->length = newlen; + /* Break the chain, so adjust_related_strinfo on later pointers in + the chain won't adjust this one anymore. */ + dsi->next = 0; + } + else + { + dsi = new_strinfo (dst, didx, newlen); + set_strinfo (didx, dsi); + find_equal_ptrs (dst, didx); + } + dsi->dont_invalidate = true; + if (olddsi != NULL) + { + tree adj = NULL_TREE; + location_t loc = gimple_location (stmt); + if (integer_zerop (oldlen)) + adj = dsi->length; + else if (TREE_CODE (oldlen) == INTEGER_CST + || TREE_CODE (dsi->length) == INTEGER_CST) + adj = fold_build2_loc (loc, MINUS_EXPR, + TREE_TYPE (dsi->length), dsi->length, + fold_convert_loc (loc, TREE_TYPE (dsi->length), + oldlen)); + if (adj != NULL_TREE) + adjust_related_strinfos (loc, dsi, adj); + } + /* memcpy src may not overlap dst, so src doesn't need to be + invalidated either. */ + if (si != NULL) + si->dont_invalidate = true; + + lhs = gimple_call_lhs (stmt); + switch (bcode) + { + case BUILT_IN_MEMCPY: + case BUILT_IN_MEMCPY_CHK: + if (lhs) + VEC_replace (int, ssa_ver_to_stridx, SSA_NAME_VERSION (lhs), didx); + break; + case BUILT_IN_MEMPCPY: + case BUILT_IN_MEMPCPY_CHK: + break; + default: + gcc_unreachable (); + } +} + +/* Handle a strcat-like ({strcat,__strcat_chk}) call. + If strlen of the second argument is known, strlen of the first argument + is increased by the length of the second argument. Furthermore, attempt + to convert it to memcpy/strcpy if the length of the first argument + is known. */ + +static void +handle_builtin_strcat (enum built_in_function bcode, gimple_stmt_iterator *gsi) +{ + int idx, didx; + tree src, dst, dstlen, len, lhs, args, type, fn, objsz; + bool success; + gimple stmt = gsi_stmt (*gsi); + strinfo si, dsi; + location_t loc; + + src = gimple_call_arg (stmt, 1); + dst = gimple_call_arg (stmt, 0); + + didx = get_stridx (dst); + if (didx <= 0) + return; + + dsi = get_strinfo (didx); + if (dsi == NULL) + return; + + /* This should have been folded, don't handle it here. */ + idx = get_stridx (src); + if (idx < 0) + return; + + si = NULL; + if (idx) + si = get_strinfo (idx); + + loc = gimple_location (stmt); + dstlen = dsi->length; + if (si != NULL) + { + dsi = unshare_strinfo (dsi); + dsi->dont_invalidate = true; + dsi->length = fold_build2_loc (loc, PLUS_EXPR, TREE_TYPE (dsi->length), + dsi->length, si->length); + adjust_related_strinfos (loc, dsi, dstlen); + } + else + { + set_strinfo (didx, NULL); + free_strinfo (dsi); + } + if (si != NULL) + /* strcat src may not overlap dst, so src doesn't need to be + invalidated either. */ + si->dont_invalidate = true; + + lhs = gimple_call_lhs (stmt); + /* For now. Could remove the lhs from the call and add + lhs = dst; afterwards. */ + if (lhs) + return; + + fn = NULL_TREE; + objsz = NULL_TREE; + switch (bcode) + { + case BUILT_IN_STRCAT: + if (si) + fn = implicit_built_in_decls[BUILT_IN_MEMCPY]; + else + fn = implicit_built_in_decls[BUILT_IN_STRCPY]; + break; + case BUILT_IN_STRCAT_CHK: + if (si) + fn = built_in_decls[BUILT_IN_MEMCPY_CHK]; + else + fn = built_in_decls[BUILT_IN_STRCPY_CHK]; + objsz = gimple_call_arg (stmt, 2); + break; + default: + gcc_unreachable (); + } + + if (fn == NULL_TREE) + return; + + len = NULL_TREE; + if (si) + { + args = TYPE_ARG_TYPES (TREE_TYPE (fn)); + type = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (args))); + + len = fold_convert_loc (loc, type, si->length); + len = fold_build2_loc (loc, PLUS_EXPR, type, len, + build_int_cst (type, 1)); + len = force_gimple_operand_gsi (gsi, len, true, NULL_TREE, true, + GSI_SAME_STMT); + } + dst = fold_build2_loc (loc, POINTER_PLUS_EXPR, + TREE_TYPE (dst), dst, + fold_convert_loc (loc, sizetype, dstlen)); + dst = force_gimple_operand_gsi (gsi, dst, true, NULL_TREE, true, + GSI_SAME_STMT); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "Optimizing: "); + print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); + } + if (si) + success = update_gimple_call (gsi, fn, 3 + (objsz != NULL_TREE), + dst, src, len, objsz); + else + success = update_gimple_call (gsi, fn, 2 + (objsz != NULL_TREE), + dst, src, objsz); + if (success) + { + update_stmt (gsi_stmt (*gsi)); + if (dump_file && (dump_flags & TDF_DETAILS) != 0) + { + fprintf (dump_file, "into: "); + print_gimple_stmt (dump_file, gsi_stmt (*gsi), 0, TDF_SLIM); + } + } + else if (dump_file && (dump_flags & TDF_DETAILS) != 0) + fprintf (dump_file, "not possible.\n"); +} + +/* Attempt to optimize a single statement at *GSI using string length + knowledge. */ + +static void +strlen_optimize_stmt (gimple_stmt_iterator *gsi) +{ + gimple stmt = gsi_stmt (*gsi); + tree lhs; + + if (is_gimple_call (stmt)) + { + tree callee = gimple_call_fndecl (stmt); + if (callee && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL) + switch (DECL_FUNCTION_CODE (callee)) + { + case BUILT_IN_STRLEN: + handle_builtin_strlen (gsi); + break; + case BUILT_IN_STRCPY: + case BUILT_IN_STRCPY_CHK: + case BUILT_IN_STPCPY: + case BUILT_IN_STPCPY_CHK: + handle_builtin_strcpy (DECL_FUNCTION_CODE (callee), gsi); + break; + case BUILT_IN_MEMCPY: + case BUILT_IN_MEMCPY_CHK: + case BUILT_IN_MEMPCPY: + case BUILT_IN_MEMPCPY_CHK: + handle_builtin_memcpy (DECL_FUNCTION_CODE (callee), gsi); + break; + case BUILT_IN_STRCAT: + case BUILT_IN_STRCAT_CHK: + handle_builtin_strcat (DECL_FUNCTION_CODE (callee), gsi); + break; + default: + break; + } + } + else if (is_gimple_assign (stmt) + && (lhs = gimple_assign_lhs (stmt)) != NULL_TREE + && TREE_CODE (lhs) == SSA_NAME + && POINTER_TYPE_P (TREE_TYPE (lhs))) + { + if (gimple_assign_single_p (stmt) + || (gimple_assign_cast_p (stmt) + && POINTER_TYPE_P (TREE_TYPE (gimple_assign_rhs1 (stmt))))) + { + int idx = get_stridx (gimple_assign_rhs1 (stmt)); + VEC_replace (int, ssa_ver_to_stridx, SSA_NAME_VERSION (lhs), idx); + } + else if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR) + { + int idx = get_stridx (gimple_assign_rhs1 (stmt)); + if (idx > 0) + { + strinfo si = get_strinfo (idx); + if (si != NULL) + { + tree off = gimple_assign_rhs2 (stmt); + if (operand_equal_p (si->length, off, 0)) + zero_length_string (lhs, si); + else if (TREE_CODE (off) == SSA_NAME) + { + gimple def_stmt = SSA_NAME_DEF_STMT (off); + if (gimple_assign_single_p (def_stmt) + && operand_equal_p (si->length, + gimple_assign_rhs1 (def_stmt), + 0)) + zero_length_string (lhs, si); + } + } + } + else if (idx < 0) + { + tree off = gimple_assign_rhs2 (stmt); + if (host_integerp (off, 1) + && (unsigned HOST_WIDE_INT) tree_low_cst (off, 1) + <= (unsigned HOST_WIDE_INT) ~idx) + VEC_replace (int, ssa_ver_to_stridx, SSA_NAME_VERSION (lhs), + ~(int) tree_low_cst (off, 1)); + } + } + } + + if (gimple_vdef (stmt)) + maybe_invalidate (stmt); +} + +/* Recursively call maybe_invalidate on stmts that might be executed + in between dombb and current bb and that contain a vdef. Stop when + *count stmts are inspected, or if the whole strinfo vector has + been invalidated. */ + +static void +do_invalidate (basic_block dombb, gimple phi, bitmap visited, int *count) +{ + unsigned int i, n = gimple_phi_num_args (phi); + + for (i = 0; i < n; i++) + { + tree vuse = gimple_phi_arg_def (phi, i); + gimple stmt = SSA_NAME_DEF_STMT (vuse); + basic_block bb = gimple_bb (stmt); + if (bb == NULL + || !bitmap_set_bit (visited, bb->index) + || !dominated_by_p (CDI_DOMINATORS, bb, dombb)) + continue; + while (1) + { + if (gimple_code (stmt) == GIMPLE_PHI) + { + do_invalidate (dombb, stmt, visited, count); + if (*count == 0) + return; + break; + } + if (--*count == 0) + return; + if (!maybe_invalidate (stmt)) + { + *count = 0; + return; + } + vuse = gimple_vuse (stmt); + stmt = SSA_NAME_DEF_STMT (vuse); + if (gimple_bb (stmt) != bb) + { + bb = gimple_bb (stmt); + if (bb == NULL + || !bitmap_set_bit (visited, bb->index) + || !dominated_by_p (CDI_DOMINATORS, bb, dombb)) + break; + } + } + } +} + +/* Callback for walk_dominator_tree. Attempt to optimize various + string ops by remembering string lenths pointed by pointer SSA_NAMEs. */ + +static void +strlen_enter_block (struct dom_walk_data *walk_data ATTRIBUTE_UNUSED, + basic_block bb) +{ + gimple_stmt_iterator gsi; + basic_block dombb = get_immediate_dominator (CDI_DOMINATORS, bb); + + if (dombb == NULL) + stridx_to_strinfo = NULL; + else + { + stridx_to_strinfo = (VEC(strinfo, heap) *) dombb->aux; + if (stridx_to_strinfo) + { + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple phi = gsi_stmt (gsi); + if (!is_gimple_reg (gimple_phi_result (phi))) + { + bitmap visited = BITMAP_ALLOC (NULL); + int count_vdef = 100; + do_invalidate (dombb, phi, visited, &count_vdef); + BITMAP_FREE (visited); + break; + } + } + } + } + + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + strlen_optimize_stmt (&gsi); + + bb->aux = stridx_to_strinfo; + if (VEC_length (strinfo, stridx_to_strinfo) && !strinfo_shared ()) + VEC_replace (strinfo, stridx_to_strinfo, 0, (strinfo) bb); +} + +/* Callback for walk_dominator_tree. Free strinfo vector if it is + owned by the current bb, clear bb->aux. */ + +static void +strlen_leave_block (struct dom_walk_data *walk_data ATTRIBUTE_UNUSED, + basic_block bb) +{ + if (bb->aux) + { + stridx_to_strinfo = (VEC(strinfo, heap) *) bb->aux; + if (VEC_length (strinfo, stridx_to_strinfo) + && VEC_index (strinfo, stridx_to_strinfo, 0) == (strinfo) bb) + { + unsigned int i; + strinfo si; + + for (i = 1; VEC_iterate (strinfo, stridx_to_strinfo, i, si); ++i) + free_strinfo (si); + VEC_free (strinfo, heap, stridx_to_strinfo); + } + bb->aux = NULL; + } +} + +/* Main entry point. */ + +static unsigned int +tree_ssa_strlen (void) +{ + struct dom_walk_data walk_data; + + VEC_safe_grow_cleared (int, heap, ssa_ver_to_stridx, num_ssa_names); + max_stridx = 1; + strinfo_pool = create_alloc_pool ("strinfo_struct pool", + sizeof (struct strinfo_struct), 64); + + calculate_dominance_info (CDI_DOMINATORS); + + /* String length optimization is implemented as a walk of the dominator + tree and a forward walk of statements within each block. */ + walk_data.dom_direction = CDI_DOMINATORS; + walk_data.initialize_block_local_data = NULL; + walk_data.before_dom_children = strlen_enter_block; + walk_data.after_dom_children = strlen_leave_block; + walk_data.block_local_data_size = 0; + walk_data.global_data = NULL; + + /* Initialize the dominator walker. */ + init_walk_dominator_tree (&walk_data); + + /* Recursively walk the dominator tree. */ + walk_dominator_tree (&walk_data, ENTRY_BLOCK_PTR); + + /* Finalize the dominator walker. */ + fini_walk_dominator_tree (&walk_data); + + VEC_free (int, heap, ssa_ver_to_stridx); + free_alloc_pool (strinfo_pool); + if (decl_to_stridxlist_htab) + { + obstack_free (&stridx_obstack, NULL); + htab_delete (decl_to_stridxlist_htab); + decl_to_stridxlist_htab = NULL; + } + + return 0; +} + +static bool +gate_strlen (void) +{ + return flag_optimize_strlen != 0; +} + +struct gimple_opt_pass pass_strlen = +{ + { + GIMPLE_PASS, + "strlen", /* name */ + gate_strlen, /* gate */ + tree_ssa_strlen, /* execute */ + NULL, /* sub */ + NULL, /* next */ + 0, /* static_pass_number */ + TV_TREE_STRLEN, /* tv_id */ + PROP_cfg | PROP_ssa, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + TODO_ggc_collect + | TODO_verify_ssa /* todo_flags_finish */ + } +}; --- gcc/testsuite/gcc.dg/strlenopt-1.c.jj 2011-09-05 13:10:53.000000000 +0200 +++ gcc/testsuite/gcc.dg/strlenopt-1.c 2011-09-05 13:10:53.000000000 +0200 @@ -0,0 +1,45 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fdump-tree-strlen" } */ + +#include "strlenopt.h" + +__attribute__((noinline, noclone)) char * +foo (char *p, char *r) +{ + char *q = malloc (strlen (p) + strlen (r) + 64); + if (q == NULL) return NULL; + /* This strcpy can be optimized into memcpy, using the remembered + strlen (p). */ + strcpy (q, p); + /* These two strcat can be optimized into memcpy. The first one + could be even optimized into a *ptr = '/'; store as the '\0' + is immediately overwritten. */ + strcat (q, "/"); + strcat (q, "abcde"); + /* Due to inefficient PTA (PR50262) the above calls invalidate + string length of r, so it is optimized just into strcpy instead + of memcpy. */ + strcat (q, r); + return q; +} + +int +main () +{ + char *volatile p = "string1"; + char *volatile r = "string2"; + char *q = foo (p, r); + if (q != NULL) + { + if (strcmp (q, "string1/abcdestring2")) + abort (); + free (q); + } + return 0; +} + +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "memcpy \\(" 3 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "strcpy \\(" 1 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */ +/* { dg-final { cleanup-tree-dump "strlen" } } */ --- gcc/testsuite/gcc.dg/strlenopt-2.c.jj 2011-09-05 13:10:53.000000000 +0200 +++ gcc/testsuite/gcc.dg/strlenopt-2.c 2011-09-05 13:10:53.000000000 +0200 @@ -0,0 +1,47 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fdump-tree-strlen" } */ + +#include "strlenopt.h" + +__attribute__((noinline, noclone)) char * +foo (char *p, char *r) +{ + char buf[26]; + if (strlen (p) + strlen (r) + 9 > 26) + return NULL; + /* This strcpy can be optimized into memcpy, using the remembered + strlen (p). */ + strcpy (buf, p); + /* These two strcat can be optimized into memcpy. The first one + could be even optimized into a *ptr = '/'; store as the '\0' + is immediately overwritten. */ + strcat (buf, "/"); + strcat (buf, "abcde"); + /* This strcpy can be optimized into memcpy, using the remembered + strlen (r). */ + strcat (buf, r); + /* And this can be optimized into memcpy too. */ + strcat (buf, "fg"); + return strdup (buf); +} + +int +main () +{ + char *volatile p = "string1"; + char *volatile r = "string2"; + char *q = foo (p, r); + if (q != NULL) + { + if (strcmp (q, "string1/abcdestring2fg")) + abort (); + free (q); + } + return 0; +} + +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "memcpy \\(" 5 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */ +/* { dg-final { cleanup-tree-dump "strlen" } } */ --- gcc/testsuite/gcc.dg/strlenopt-3.c.jj 2011-09-05 13:10:53.000000000 +0200 +++ gcc/testsuite/gcc.dg/strlenopt-3.c 2011-09-05 13:10:53.000000000 +0200 @@ -0,0 +1,63 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fdump-tree-strlen -fdump-tree-optimized" } */ + +#include "strlenopt.h" + +__attribute__((noinline, noclone)) size_t +fn1 (char *p, char *q) +{ + size_t s = strlen (q); + strcpy (p, q); + return s - strlen (p); +} + +__attribute__((noinline, noclone)) size_t +fn2 (char *p, char *q) +{ + size_t s = strlen (q); + memcpy (p, q, s + 1); + return s - strlen (p); +} + +__attribute__((noinline, noclone)) size_t +fn3 (char *p) +{ + memcpy (p, "abcd", 5); + return strlen (p); +} + +__attribute__((noinline, noclone)) size_t +fn4 (char *p) +{ + memcpy (p, "efg\0hij", 6); + return strlen (p); +} + +int +main () +{ + char buf[64]; + char *volatile p = buf; + char *volatile q = "ABCDEF"; + buf[7] = 'G'; + if (fn1 (p, q) != 0 || memcmp (buf, "ABCDEF\0G", 8)) + abort (); + q = "HIJ"; + if (fn2 (p + 1, q) != 0 || memcmp (buf, "AHIJ\0F\0G", 8)) + abort (); + buf[6] = 'K'; + if (fn3 (p + 1) != 4 || memcmp (buf, "Aabcd\0KG", 8)) + abort (); + if (fn4 (p) != 3 || memcmp (buf, "efg\0hiKG", 8)) + abort (); + return 0; +} + +/* { dg-final { scan-tree-dump-times "strlen \\(" 2 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "memcpy \\(" 4 "strlen" } } */ +/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */ +/* { dg-final { cleanup-tree-dump "strlen" } } */ +/* { dg-final { scan-tree-dump-times "return 0" 3 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "return 4" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "return 3" 1 "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ --- gcc/testsuite/gcc.dg/strlenopt.h.jj 2011-09-05 13:10:53.000000000 +0200 +++ gcc/testsuite/gcc.dg/strlenopt.h 2011-09-05 13:10:53.000000000 +0200 @@ -0,0 +1,57 @@ +/* This is a replacement of needed parts from stdlib.h and string.h + for -foptimize-strlen testing, to ensure we are testing the builtins + rather than whatever the OS has in its headers. */ + +#define NULL ((void *) 0) +typedef __SIZE_TYPE__ size_t; +extern void abort (void); +void *malloc (size_t); +void free (void *); +char *strdup (const char *); +size_t strlen (const char *); +void *memcpy (void *__restrict, const void *__restrict, size_t); +char *strcpy (char *__restrict, const char *__restrict); +char *strcat (char *__restrict, const char *__restrict); +int memcmp (const void *, const void *, size_t); +int strcmp (const char *, const char *); +#ifdef USE_GNU +void *mempcpy (void *__restrict, const void *__restrict, size_t); +char *stpcpy (char *__restrict, const char *__restrict); +#endif + +#if defined(FORTIFY_SOURCE) && FORTIFY_SOURCE > 0 && __OPTIMIZE__ +# define bos(ptr) __builtin_object_size (x, FORTIFY_SOURCE > 0) +# define bos0(ptr) __builtin_object_size (x, 0) + +extern inline __attribute__((gnu_inline, always_inline, artificial)) void * +memcpy (void *__restrict dest, const void *__restrict src, size_t len) +{ + return __builtin___memcpy_chk (dest, src, len, bos0 (dest)); +} + +extern inline __attribute__((gnu_inline, always_inline, artificial)) char * +strcpy (char *__restrict dest, const char *__restrict src) +{ + return __builtin___strcpy_chk (dest, src, bos (dest)); +} + +extern inline __attribute__((gnu_inline, always_inline, artificial)) char * +strcat (char *__restrict dest, const char *__restrict src) +{ + return __builtin___strcat_chk (dest, src, bos (dest)); +} + +# ifdef USE_GNU +extern inline __attribute__((gnu_inline, always_inline, artificial)) void * +mempcpy (void *__restrict dest, const void *__restrict src, size_t len) +{ + return __builtin___mempcpy_chk (dest, src, len, bos0 (dest)); +} + +extern inline __attribute__((gnu_inline, always_inline, artificial)) char * +stpcpy (char *__restrict dest, const char *__restrict src) +{ + return __builtin___stpcpy_chk (dest, src, bos (dest)); +} +# endif +#endif ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-09-05 18:45 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2011-09-02 16:51 [RFC, WIP] tree-ssa-strlen optimization pass Jakub Jelinek 2011-09-02 17:15 ` Paolo Carlini 2011-09-02 18:23 ` Jakub Jelinek 2011-09-02 18:42 ` Paolo Carlini 2011-09-05 8:55 ` Richard Guenther 2011-09-05 9:41 ` Jakub Jelinek 2011-09-05 9:51 ` Richard Guenther 2011-09-05 18:45 ` Jakub Jelinek
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).