From: Richard Biener <rguenther@suse.de>
To: Jan Hubicka <hubicka@ucw.cz>
Cc: gcc@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: Detect EAF flags in ipa-modref
Date: Wed, 11 Nov 2020 11:09:12 +0100 (CET) [thread overview]
Message-ID: <nycvar.YFH.7.76.2011111104090.10073@p653.nepu.fhfr.qr> (raw)
In-Reply-To: <20201110125418.GB53229@kam.mff.cuni.cz>
On Tue, 10 Nov 2020, Jan Hubicka wrote:
> > > + tree callee = gimple_call_fndecl (stmt);
> > > + if (callee)
> > > + {
> > > + cgraph_node *node = cgraph_node::get (callee);
> > > + modref_summary *summary = node ? get_modref_function_summary (node)
> > > + : NULL;
> > > +
> > > + if (summary && summary->arg_flags.length () > arg)
> >
> > So could we make modref "transform" push this as fnspec attribute or
> > would that not really be an optimization?
>
> It was my original plan to synthetize fnspecs, but I think it is not
> very good idea: we have the summary readily available and we can
> represent information that fnspecs can't
> (do not have artificial limits on number of parameters or counts)
>
> I would preffer fnspecs to be used only for in-compiler declarations.
Fine, I was just curious...
> > > +
> > > +/* Analyze EAF flags for SSA name NAME.
> > > + KNOWN_FLAGS is a cache for flags we already determined.
> > > + DEPTH is a recursion depth used to make debug output prettier. */
> > > +
> > > +static int
> > > +analyze_ssa_name_flags (tree name, vec<int> *known_flags, int depth)
> >
> > C++ has references which makes the access to known_flags nicer ;)
>
> Yay, will chang that :)
> >
> > > +{
> > > + imm_use_iterator ui;
> > > + gimple *use_stmt;
> > > + int flags = EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE | EAF_UNUSED;
> > > +
> > > + /* See if value is already computed. */
> > > + if ((*known_flags)[SSA_NAME_VERSION (name)])
> > > + {
> > > + /* Punt on cycles for now, so we do not need dataflow. */
> > > + if ((*known_flags)[SSA_NAME_VERSION (name)] == 1)
> > > + {
> > > + if (dump_file)
> > > + fprintf (dump_file,
> > > + "%*sGiving up on a cycle in SSA graph\n", depth * 4, "");
> > > + return 0;
> > > + }
> > > + return (*known_flags)[SSA_NAME_VERSION (name)] - 2;
> > > + }
> > > + /* Recursion guard. */
> > > + (*known_flags)[SSA_NAME_VERSION (name)] = 1;
> >
> > This also guards against multiple evaluations of the same stmts
> > but only in some cases? Consider
> >
> > _1 = ..;
> > _2 = _1 + _3;
> > _4 = _1 + _5;
> > _6 = _2 + _4;
> >
> > where we visit _2 = and _4 = from _1 but from both are going
> > to visit _6.
>
> Here we first push _6, then we go for _2 then for _1 evaluate _1,
> evalueate _2, go for _4 and evaluate _4, and evaluate _6.
> It is DFS and you need backward edge in DFS (comming from a PHI).
Hmm, but then we eventually evaluate _6 twice?
>
> Cycles seems to somewhat matter for GCC: we do have a lot of functions
> that walk linked lists that we could track otherwise.
> >
> > Maybe I'm blind but you're not limiting depth? Guess that asks
> > for problems, esp. as you are recursing rather than using a
> > worklist or so?
> >
> > I see you try to "optimize" the walk by only visiting def->use
> > links from parameters but then a RPO walk over all stmts would
> > be simpler iteration-wise ...
> We usually evaluate just small part of bigger functions (since we lose
> track quite easily after hitting first memory store). My plan was to
> change this to actual dataflow once we have it well defined
> (this means after discussing EAF flags with you and adding the logic to
> track callsites for true IPA pass that midly complicated things - for
> every ssa name I track callsite/arg pair where it is passed to
> either directly or indirectly. Then this is translaed into call summary
> and used by IPA pass to compute final flags)
>
> I guess I can add --param ipa-modref-walk-depth for now and handle
> dataflow incremntally?
Works for me.
> In particular I am not sure if I should just write iterated RPO myself
> or use tree-ssa-propagate.h (the second may be overkill).
tree-ssa-propagate.h is not to be used, it should DIE ;)
I guess you do want to iterate SSA cycles rather than BB cycles
so I suggest to re-surrect the SSA SCC discovery from the SCC
value-numbering (see tree-ssa-sccvn.c:DFS () on the gcc-8-branch)
which is non-recursive and micro-optimized. Could put it
somewhere useful (tree-ssa.c?).
> >
> > > + if (dump_file)
> > > + {
> > > + fprintf (dump_file,
> > > + "%*sAnalyzing flags of ssa name: ", depth * 4, "");
> > > + print_generic_expr (dump_file, name);
> > > + fprintf (dump_file, "\n");
> > > + }
> > > +
> > > + FOR_EACH_IMM_USE_STMT (use_stmt, ui, name)
> > > + {
> > > + if (flags == 0)
> > > + {
> > > + BREAK_FROM_IMM_USE_STMT (ui);
> > > + }
> > > + if (is_gimple_debug (use_stmt))
> > > + continue;
> > > + if (dump_file)
> > > + {
> > > + fprintf (dump_file, "%*s Analyzing stmt:", depth * 4, "");
> > > + print_gimple_stmt (dump_file, use_stmt, 0);
> > > + }
> > > +
> > > + /* Gimple return may load the return value. */
> > > + if (gimple_code (use_stmt) == GIMPLE_RETURN)
> >
> > if (greturn *ret = dyn_cast <greturn *> (use_stmt))
> >
> > makes the as_a below not needed, similar for the other cases (it
> > also makes accesses cheaper, avoiding some checking code).
>
> Looks cleaner indeed.
> >
> > > + {
> > > + if (memory_access_to (gimple_return_retval
> > > + (as_a <greturn *> (use_stmt)), name))
> > > + flags &= ~EAF_UNUSED;
> > > + }
> > > + /* Account for LHS store, arg loads and flags from callee function. */
> > > + else if (is_gimple_call (use_stmt))
> > > + {
> > > + tree callee = gimple_call_fndecl (use_stmt);
> > > +
> > > + /* Recursion would require bit of propagation; give up for now. */
> > > + if (callee && recursive_call_p (current_function_decl, callee))
> > > + flags = 0;
> > > + else
> > > + {
> > > + int ecf_flags = gimple_call_flags (use_stmt);
> > > + bool ignore_stores = ignore_stores_p (current_function_decl,
> > > + ecf_flags);
> > > +
> > > + /* Handle *name = func (...). */
> > > + if (gimple_call_lhs (use_stmt)
> > > + && memory_access_to (gimple_call_lhs (use_stmt), name))
> > > + flags &= ~(EAF_UNUSED | EAF_NOCLOBBER);
> > > +
> > > + /* Handle all function parameters. */
> > > + for (unsigned i = 0; i < gimple_call_num_args (use_stmt); i++)
> > > + /* Name is directly passed to the callee. */
> > > + if (gimple_call_arg (use_stmt, i) == name)
> > > + {
> > > + if (ecf_flags & (ECF_CONST | ECF_NOVOPS))
> > > + flags &= ignore_stores
> > > + ? 0
> > > + : call_lhs_flags (use_stmt, i,
> > > + known_flags, depth);
> > > + else
> > > + {
> > > + int call_flags = gimple_call_arg_flags (as_a <gcall *>
> > > + (use_stmt), i);
> > > + if (ignore_stores)
> > > + call_flags |= EAF_NOCLOBBER | EAF_NOESCAPE;
> > > + else
> > > + call_flags &= call_lhs_flags (use_stmt, i,
> > > + known_flags, depth);
> > > +
> > > + flags &= call_flags;
> > > + }
> > > + }
> > > + /* Name is dereferenced and passed to a callee. */
> > > + else if (memory_access_to (gimple_call_arg (use_stmt, i), name))
> > > + {
> > > + if (ecf_flags & (ECF_CONST | ECF_NOVOPS))
> > > + flags &= ~EAF_UNUSED;
> > > + else
> > > + flags &= deref_flags (gimple_call_arg_flags
> > > + (as_a <gcall *> (use_stmt), i),
> > > + ignore_stores);
> > > + if (!ignore_stores)
> > > + flags &= call_lhs_flags (use_stmt, i, known_flags,
> > > + depth);
> > > + }
> >
> > Hmm, you forget gimple_call_chain at least which is at least
> > argument passing? Possibly the chain argument is never a function
> > parameter but how do you know... (partial inlining?)
>
> Ah, right. I forgot about chain.
> I wil ladd it.
> >
> > Then there's gimple_call_fn for indirect function calls to a
> > function parameter?
>
> Well, it is never load or store, so I do not really care about it.
> for
>
> void test (int (*foo)())
> {
> foo();
> }
>
> I should be able to derive EAF_UNUSED.
>
> void test (int (**foo)())
> {
> (*foo)();
> }
>
> I will see sparate load.
> >
> > I guess it would be nice to amend the gimple_walk_ stuff to have
> > a SSA name callback where the tree and the SSA use is passed.
> > Well, for another time.
> >
> > > + }
> > > + /* Only NOCLOBBER or DIRECT flags alone are not useful (see comments
> > > + in tree-ssa-alias.c). Give up earlier. */
> > > + if ((flags & ~(EAF_DIRECT | EAF_NOCLOBBER)) == 0)
> > > + flags = 0;
> > > + }
> > > + else if (gimple_assign_load_p (use_stmt))
> > > + {
> > > + /* Memory to memory copy. */
> > > + if (gimple_store_p (use_stmt))
> > > + {
> > > + /* Handle *name = *exp. */
> > > + if (memory_access_to (gimple_assign_lhs (use_stmt), name))
> > > + flags &= ~(EAF_UNUSED | EAF_NOCLOBBER);
> > > +
> > > + /* Handle *lhs = *name.
> > > +
> > > + We do not track memory locations, so assume that value
> > > + is used arbitrarily. */
> > > + if (memory_access_to (gimple_assign_rhs1 (use_stmt), name))
> > > + flags = 0;
> > > + }
> > > + /* Handle lhs = *name. */
> > > + else if (memory_access_to (gimple_assign_rhs1 (use_stmt), name))
> > > + flags &= deref_flags (analyze_ssa_name_flags
> > > + (gimple_assign_lhs (use_stmt),
> > > + known_flags, depth + 1), false);
> > > + }
> > > + else if (gimple_store_p (use_stmt))
> > > + {
> > > + /* Handle *lhs = name. */
> > > + if (is_gimple_assign (use_stmt)
> > > + && gimple_assign_rhs1 (use_stmt) == name)
> > > + {
> > > + if (dump_file)
> > > + fprintf (dump_file, "%*s ssa name saved to memory\n",
> > > + depth * 4, "");
> > > + flags = 0;
> > > + }
> > > + /* Handle *name = exp. */
> > > + else if (is_gimple_assign (use_stmt)
> > > + && memory_access_to (gimple_assign_lhs (use_stmt), name))
> > > + flags &= ~(EAF_UNUSED | EAF_NOCLOBBER);
> > > + /* ASM statements etc. */
> > > + else if (!is_gimple_assign (use_stmt))
> > > + {
> > > + if (dump_file)
> > > + fprintf (dump_file, "%*s Unhandled store\n",
> > > + depth * 4, "");
> > > + flags = 0;
> > > + }
> > > + }
> > > + else if (is_gimple_assign (use_stmt))
> > > + {
> > > + enum tree_code code = gimple_assign_rhs_code (use_stmt);
> > > +
> > > + /* See if operation is a merge as considered by
> > > + tree-ssa-structalias.c:find_func_aliases. */
> > > + if (!truth_value_p (code)
> > > + && code != POINTER_DIFF_EXPR
> > > + && (code != POINTER_PLUS_EXPR
> > > + || gimple_assign_rhs1 (use_stmt) == name))
> > > + flags &= analyze_ssa_name_flags
> > > + (gimple_assign_lhs (use_stmt), known_flags,
> > > + depth + 1);
> > > + }
> > > + else if (gimple_code (use_stmt) == GIMPLE_PHI)
> > > + {
> > > + flags &= analyze_ssa_name_flags
> > > + (gimple_phi_result (use_stmt), known_flags,
> > > + depth + 1);
> > > + }
> > > + /* Conditions are not considered escape points
> > > + by tree-ssa-structalias. */
> > > + else if (gimple_code (use_stmt) == GIMPLE_COND)
> > > + ;
> > > + else
> > > + {
> > > + if (dump_file)
> > > + fprintf (dump_file, "%*s Unhandled stmt\n", depth * 4, "");
> > > + flags = 0;
> > > + }
> > > +
> > > + if (dump_file)
> > > + {
> > > + fprintf (dump_file, "%*s current flags of ", depth * 4, "");
> > > + print_generic_expr (dump_file, name);
> > > + dump_eaf_flags (dump_file, flags);
> > > + }
> > > + }
> > > + if (dump_file)
> > > + {
> > > + fprintf (dump_file, "%*sflags of ssa name ", depth * 4, "");
> > > + print_generic_expr (dump_file, name);
> > > + dump_eaf_flags (dump_file, flags);
> > > + }
> > > + (*known_flags)[SSA_NAME_VERSION (name)] = flags + 2;
> > > + return flags;
> > > +}
> > > +
> > > +/* Determine EAF flags for function parameters. */
> > > +
> > > +static void
> > > +analyze_parms (modref_summary *summary)
> > > +{
> > > + unsigned int parm_index = 0;
> > > + unsigned int count = 0;
> > > +
> > > + for (tree parm = DECL_ARGUMENTS (current_function_decl); parm;
> > > + parm = TREE_CHAIN (parm))
> > > + count++;
> > > +
> > > + if (!count)
> > > + return;
> > > +
> > > + auto_vec<int> known_flags;
> > > + known_flags.safe_grow_cleared (num_ssa_names);
> >
> > , true for exact reserve. Could be space optimized by not using
> > auto_vec<int> but auto_vec<usigned char>?
>
> I think there is not that much memory wasted, but indeed chars will be
> more effective.
> >
> > > +
> > > + for (tree parm = DECL_ARGUMENTS (current_function_decl); parm; parm_index++,
> > > + parm = TREE_CHAIN (parm))
> > > + {
> > > + tree name = ssa_default_def (cfun, parm);
> > > + if (!name)
> > > + continue;
> >
> > looks like the vec might be quite sparse ...
> >
> > > + int flags = analyze_ssa_name_flags (name, &known_flags, 0);
> > > +
> > > + if (flags)
> > > + {
> > > + if (parm_index >= summary->arg_flags.length ())
> > > + summary->arg_flags.safe_grow_cleared (count);
> >
> > you want , true for exact reserve.
> >
> > > + summary->arg_flags[parm_index] = flags;
> >
> > maybe this as well - does it make sense to skip !is_gimple_reg_type
> > params in the counting? Guess it makes lookup too complicated.
> > But we do have testcases with >30000 parameters ...
>
> Well, the index needs to be actual index all call argument.
> As said, i did not see any noticeable modref memory use increase at WPA
> (I watched) since it is at most 1 byte for parameter in case something
> got tracked.
>
> Thanks for review!
>
> Honza
>
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend
prev parent reply other threads:[~2020-11-11 10:09 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-08 12:47 Definition of EAF_NOESCAPE and fnspec strings Jan Hubicka
2020-11-09 7:31 ` Richard Biener
2020-11-09 10:29 ` Jan Hubicka
2020-11-09 12:38 ` Richard Biener
2020-11-09 13:16 ` Detect EAF flags in ipa-modref Jan Hubicka
2020-11-09 23:14 ` Jan Hubicka
2020-11-10 9:17 ` Jan Hubicka
2020-11-10 10:25 ` Jan Hubicka
2020-11-10 10:55 ` Jan Hubicka
2020-11-10 11:04 ` Richard Biener
2020-11-10 12:54 ` Jan Hubicka
2020-11-10 14:31 ` Jan Hubicka
2020-11-13 8:06 ` Richard Biener
2020-11-15 13:25 ` H.J. Lu
2020-11-15 14:13 ` Jan Hubicka
2020-11-15 10:41 ` Andreas Schwab
2020-11-15 11:12 ` Jan Hubicka
2020-11-15 11:25 ` Rainer Orth
2020-11-15 12:33 ` Jan Hubicka
2020-11-15 12:43 ` Rainer Orth
2020-11-15 13:03 ` Jan Hubicka
2020-11-15 16:03 ` Rainer Orth
2020-11-15 16:15 ` Andreas Schwab
2020-11-15 18:07 ` Jan Hubicka
2020-11-16 7:48 ` Richard Biener
2020-11-16 9:26 ` Andreas Schwab
2020-11-16 10:59 ` Jan Hubicka
2020-11-16 12:36 ` Richard Biener
2020-11-16 12:44 ` Jan Hubicka
2020-11-16 19:33 ` Martin Liška
2020-11-16 19:46 ` Jan Hubicka
2020-11-11 10:09 ` Richard Biener [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=nycvar.YFH.7.76.2011111104090.10073@p653.nepu.fhfr.qr \
--to=rguenther@suse.de \
--cc=gcc-patches@gcc.gnu.org \
--cc=gcc@gcc.gnu.org \
--cc=hubicka@ucw.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).