On 7 October 2017 at 12:35, Prathamesh Kulkarni wrote: > On 7 October 2017 at 11:23, Jan Hubicka wrote: >>> On 6 October 2017 at 06:04, Jan Hubicka wrote: >>> >> Hi Honza, >>> >> Thanks for the detailed suggestions, I have updated the patch accordingly. >>> >> I have following questions on call_summary: >>> >> 1] I added field bool is_return_callee in ipa_call_summary to track >>> >> whether the caller possibly returns value returned by callee, which >>> >> gets rid of return_callees_map. I assume ipa_call_summary_t::remove() >>> >> and ipa_call_summary_t::duplicate() will already take care of handling >>> >> late insertion/removal of cgraph nodes ? I just initialized >>> >> is_return_callee to false in ipa_call_summary::reset and that seems to >>> >> work. I am not sure though if I have handled it correctly. Could you >>> >> please check that ? >>> > >>> > I was actually thinking to introduce separate summary for ipa-pure-const pass, >>> > but this seems fine to me too (for one bit definitly more effecient) >>> > ipa_call_summary_t::duplicate copies all the fields, so indeed you should be >>> > safe here. >>> > >>> > Also it is possible for functions to be inserted late. Updating of call summaries >>> > is currently handled by ipa_fn_summary_t::insert >>> >> >>> >> 2] ipa_inline() called ipa_free_fn_summary, which made >>> >> ipa_call_summaries unavailable during ipa-pure-const pass. I removed >>> >> call to ipa_free_fn_summary from ipa_inline, and moved it to >>> >> ipa_pure_const::execute(). Is that OK ? >>> > >>> > Seems OK to me. >>> >> >>> >> Patch passes bootstrap+test and lto bootstrap+test on x86_64-unknown-linux-gnu. >>> >> Verfiied SPEC2k6 compiles and runs without miscompares with LTO >>> >> enabled on aarch64-linux-gnu. >>> >> Cross-tested on arm*-*-* and aarch64*-*-*. I will additionally test >>> >> the patch by building chromium or firefox. >>> >> Would it be OK to commit if it passes above validations ? >>> >> >>> >> Thanks, >>> >> Prathamesh >>> >> > >>> >> > Thanks, >>> >> > Honza >>> > >>> >> 2017-10-05 Prathamesh Kulkarni >>> >> >>> >> * cgraph.h (set_malloc_flag): Declare. >>> >> * cgraph.c (set_malloc_flag_1): New function. >>> >> (set_malloc_flag): Likewise. >>> >> * ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee. >>> >> * ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to >>> >> false. >>> >> (read_ipa_call_summary): Add support for reading is_return_callee. >>> >> (write_ipa_call_summary): Stream is_return_callee. >>> >> * ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary. >>> >> * ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, symbol-summary.h, >>> >> ipa-prop.h, ipa-fnsummary.h. >>> >> (malloc_state_e): Define. >>> >> (malloc_state_names): Define. >>> >> (funct_state_d): Add field malloc_state. >>> >> (varying_state): Set malloc_state to STATE_MALLOC_BOTTOM. >>> >> (check_retval_uses): New function. >>> >> (malloc_candidate_p): Likewise. >>> >> (analyze_function): Add support for malloc attribute. >>> >> (pure_const_write_summary): Stream malloc_state. >>> >> (pure_const_read_summary): Add support for reading malloc_state. >>> >> (dump_malloc_lattice): New function. >>> >> (propagate_malloc): New function. >>> >> (ipa_pure_const::execute): Call propagate_malloc and >>> >> ipa_free_fn_summary. >>> >> (pass_local_pure_const::execute): Add support for malloc attribute. >>> >> * ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro. >>> >> >>> >> testsuite/ >>> >> * gcc.dg/ipa/propmalloc-1.c: New test-case. >>> >> * gcc.dg/ipa/propmalloc-2.c: Likewise. >>> >> * gcc.dg/ipa/propmalloc-3.c: Likewise. >>> >> >>> >> diff --git a/gcc/cgraph.c b/gcc/cgraph.c >>> >> index 3d0cefbd46b..0aad90d59ea 100644 >>> >> --- a/gcc/cgraph.c >>> >> +++ b/gcc/cgraph.c >>> >> @@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow) >>> >> return changed; >>> >> } >>> >> >>> >> +/* Worker to set malloc flag. */ >>> > New line here I guess (it is below) >>> >> +static void >>> >> +set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed) >>> >> +{ >>> >> + if (malloc_p && !DECL_IS_MALLOC (node->decl)) >>> >> + { >>> >> + DECL_IS_MALLOC (node->decl) = true; >>> >> + *changed = true; >>> >> + } >>> >> + >>> >> + ipa_ref *ref; >>> >> + FOR_EACH_ALIAS (node, ref) >>> >> + { >>> >> + cgraph_node *alias = dyn_cast (ref->referring); >>> >> + if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE) >>> >> + set_malloc_flag_1 (alias, malloc_p, changed); >>> >> + } >>> >> + >>> >> + for (cgraph_edge *e = node->callers; e; e = e->next_caller) >>> >> + if (e->caller->thunk.thunk_p >>> >> + && (!malloc_p || e->caller->get_availability () > AVAIL_INTERPOSABLE)) >>> >> + set_malloc_flag_1 (e->caller, malloc_p, changed); >>> >> +} >>> >> + >>> >> +/* Set DECL_IS_MALLOC on NODE's decl and on NODE's aliases if any. */ >>> >> + >>> >> +bool >>> >> +cgraph_node::set_malloc_flag (bool malloc_p) >>> >> +{ >>> >> + bool changed = false; >>> >> + >>> >> + if (!malloc_p || get_availability () > AVAIL_INTERPOSABLE) >>> >> + set_malloc_flag_1 (this, malloc_p, &changed); >>> >> + else >>> >> + { >>> >> + ipa_ref *ref; >>> >> + >>> >> + FOR_EACH_ALIAS (this, ref) >>> >> + { >>> >> + cgraph_node *alias = dyn_cast (ref->referring); >>> >> + if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE) >>> >> + set_malloc_flag_1 (alias, malloc_p, &changed); >>> >> + } >>> >> + } >>> >> + return changed; >>> >> +} >>> >> + >>> >> diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h >>> >> index f50d6806e61..613a2b6f625 100644 >>> >> --- a/gcc/ipa-fnsummary.h >>> >> +++ b/gcc/ipa-fnsummary.h >>> >> @@ -197,7 +197,9 @@ struct ipa_call_summary >>> >> int call_stmt_time; >>> >> /* Depth of loop nest, 0 means no nesting. */ >>> >> unsigned int loop_depth; >>> >> - >>> >> + /* Indicates whether the caller returns the value of it's callee. */ >>> > Perhaps add a comment when it is initialized. >>> > Don't you also check that the calle is not captured? In that case I would >>> > is_return_callee_uncaptured or so, so someone does not try to use it with >>> > different meaning. >>> Sorry, I didn't understand "Don't you also check that callee is not captured ?" >>> What does capturing mean in this context ? >> >> Don't you need to know that the returned pointer is new and does not alias >> with something else? > Yes, malloc_candidate_p enforces that locally by checking the returned > pointer is return value of callee and optionally has immediate uses > only within comparisons against 0. But whether the returned pointer is > new depends on whether callee itself can be annotated with malloc, > which is done in propagate_malloc. > IIUC pointer returned by a malloc annotated function, does not alias > anything else ? > > Thanks, > Prathamesh >> >> Honza >>> >>> Thanks, >>> Prathamesh >>> > >>> >> @@ -69,6 +74,15 @@ enum pure_const_state_e >>> >> >>> >> const char *pure_const_names[3] = {"const", "pure", "neither"}; >>> >> >>> >> +enum malloc_state_e >>> >> +{ >>> >> + STATE_MALLOC_TOP, >>> >> + STATE_MALLOC, >>> >> + STATE_MALLOC_BOTTOM >>> >> +}; >>> >> + >>> >> +const char *malloc_state_names[] = {"malloc_top", "malloc", "malloc_bottom"}; >>> > >>> > Seems static is missing in all those declarations, please fix it. Hi Honza, I have done the suggested changes in this version. LTO Bootstrapped+tested on x86_64-unknown-linux-gnu, ppc64le-linux-gnu. Cross-tested on arm*-*-*, aarch64*-*-* Verified no functional regressions with SPEC2006. Would it be OK to commit this version of the patch ? Thanks, Prathamesh >>> > >>> > OK with these changes. Thanks! >>> > >>> > Honza