From: luoxhu <luoxhu@linux.ibm.com>
To: Feng Xue OS <fxue@os.amperecomputing.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
Jan Hubicka <hubicka@ucw.cz>,
Martin Jambor <mjambor@suse.cz>
Subject: Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)
Date: Fri, 18 Oct 2019 02:12:00 -0000 [thread overview]
Message-ID: <ca6dc810-560e-47aa-5f99-54289ce74369@linux.ibm.com> (raw)
In-Reply-To: <BYAPR01MB4869F706EB198F95EACD7692F76D0@BYAPR01MB4869.prod.exchangelabs.com>
Hi Feng,
On 2019/10/17 16:23, Feng Xue OS wrote:
> IPA does not allow constant propagation on parameter that is used to control
> function recursion.
>
> recur_fn (i)
> {
> if ( !terminate_recursion (i))
> {
> ...
> recur_fn (i + 1);
> ...
> }
> ...
> }
>
> This patch is composed to enable multi-versioning for self-recursive function,
> and versioning copies is limited by a specified option.
I noticed similar issue when analyzing the SPEC, self-recursive function is
not versioned and posted my observations in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92074.
Generally, this could be implemented well by your patch, while I am
wondering whether it is OK to convert the recursive function to
non-recursive function in a independent pass after ipa-cp and ipa-sra instead
of reuse the ipa-cp framework?
The reason is sometimes the argument is passed-by-reference, and
ipa-sra runs after ipa-cp, so this kind of optimization may not be done in
WPA. What's your idea about this, please? Thanks.
Thanks
Xiong Hu
>
> Feng
> ---
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 045072e02ec..6255a766e4d 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -229,7 +229,9 @@ public:
> inline bool set_contains_variable ();
> bool add_value (valtype newval, cgraph_edge *cs,
> ipcp_value<valtype> *src_val = NULL,
> - int src_idx = 0, HOST_WIDE_INT offset = -1);
> + int src_idx = 0, HOST_WIDE_INT offset = -1,
> + ipcp_value<valtype> **val_pos_p = NULL,
> + bool unlimited = false);
> void print (FILE * f, bool dump_sources, bool dump_benefits);
> };
>
> @@ -1579,22 +1581,37 @@ allocate_and_init_ipcp_value (ipa_polymorphic_call_context source)
> /* Try to add NEWVAL to LAT, potentially creating a new ipcp_value for it. CS,
> SRC_VAL SRC_INDEX and OFFSET are meant for add_source and have the same
> meaning. OFFSET -1 means the source is scalar and not a part of an
> - aggregate. */
> + aggregate. If non-NULL, VAL_POS_P specifies position in value list,
> + after which newly created ipcp_value will be inserted, and it is also
> + used to record address of the added ipcp_value before function returns.
> + UNLIMITED means whether value count should not exceed the limit given
> + by PARAM_IPA_CP_VALUE_LIST_SIZE. */
>
> template <typename valtype>
> bool
> ipcp_lattice<valtype>::add_value (valtype newval, cgraph_edge *cs,
> ipcp_value<valtype> *src_val,
> - int src_idx, HOST_WIDE_INT offset)
> + int src_idx, HOST_WIDE_INT offset,
> + ipcp_value<valtype> **val_pos_p,
> + bool unlimited)
> {
> ipcp_value<valtype> *val;
>
> + if (val_pos_p)
> + {
> + for (val = values; val && val != *val_pos_p; val = val->next);
> + gcc_checking_assert (val);
> + }
> +
> if (bottom)
> return false;
>
> for (val = values; val; val = val->next)
> if (values_equal_for_ipcp_p (val->value, newval))
> {
> + if (val_pos_p)
> + *val_pos_p = val;
> +
> if (ipa_edge_within_scc (cs))
> {
> ipcp_value_source<valtype> *s;
> @@ -1609,7 +1626,8 @@ ipcp_lattice<valtype>::add_value (valtype newval, cgraph_edge *cs,
> return false;
> }
>
> - if (values_count == PARAM_VALUE (PARAM_IPA_CP_VALUE_LIST_SIZE))
> + if (!unlimited
> + && values_count == PARAM_VALUE (PARAM_IPA_CP_VALUE_LIST_SIZE))
> {
> /* We can only free sources, not the values themselves, because sources
> of other values in this SCC might point to them. */
> @@ -1623,6 +1641,9 @@ ipcp_lattice<valtype>::add_value (valtype newval, cgraph_edge *cs,
> }
> }
>
> + if (val_pos_p)
> + *val_pos_p = NULL;
> +
> values = NULL;
> return set_to_bottom ();
> }
> @@ -1630,8 +1651,54 @@ ipcp_lattice<valtype>::add_value (valtype newval, cgraph_edge *cs,
> values_count++;
> val = allocate_and_init_ipcp_value (newval);
> val->add_source (cs, src_val, src_idx, offset);
> - val->next = values;
> - values = val;
> + if (val_pos_p)
> + {
> + val->next = (*val_pos_p)->next;
> + (*val_pos_p)->next = val;
> + *val_pos_p = val;
> + }
> + else
> + {
> + val->next = values;
> + values = val;
> + }
> +
> + return true;
> +}
> +
> +/* Return true, if a ipcp_value VAL is orginated from parameter value of
> + self-feeding recursive function by applying non-passthrough arithmetic
> + transformation. */
> +
> +static bool
> +self_recursively_generated_p (ipcp_value<tree> *val)
> +{
> + class ipa_node_params *info = NULL;
> +
> + for (ipcp_value_source<tree> *src = val->sources; src; src = src->next)
> + {
> + cgraph_edge *cs = src->cs;
> +
> + if (!src->val || cs->caller != cs->callee->function_symbol ()
> + || src->val == val)
> + return false;
> +
> + if (!info)
> + info = IPA_NODE_REF (cs->caller);
> +
> + class ipcp_param_lattices *plats = ipa_get_parm_lattices (info,
> + src->index);
> + ipcp_lattice<tree> *src_lat = src->offset == -1 ? &plats->itself
> + : plats->aggs;
> + ipcp_value<tree> *src_val;
> +
> + for (src_val = src_lat->values; src_val && src_val != val;
> + src_val = src_val->next);
> +
> + if (!src_val)
> + return false;
> + }
> +
> return true;
> }
>
> @@ -1649,20 +1716,72 @@ propagate_vals_across_pass_through (cgraph_edge *cs, ipa_jump_func *jfunc,
> ipcp_value<tree> *src_val;
> bool ret = false;
>
> - /* Do not create new values when propagating within an SCC because if there
> - are arithmetic functions with circular dependencies, there is infinite
> - number of them and we would just make lattices bottom. If this condition
> - is ever relaxed we have to detect self-feeding recursive calls in
> - cgraph_edge_brings_value_p in a smarter way. */
> + /* Due to circular dependencies, propagating within an SCC through arithmetic
> + transformation would create infinite number of values. But for
> + self-feeding recursive function, we could allow propagation in a limited
> + count, and this can enable a simple kind of recursive function versioning.
> + For other scenario, we would just make lattices bottom. */
> if ((ipa_get_jf_pass_through_operation (jfunc) != NOP_EXPR)
> && ipa_edge_within_scc (cs))
> - ret = dest_lat->set_contains_variable ();
> + {
> + if (src_lat != dest_lat)
> + return dest_lat->set_contains_variable ();
> +
> + auto_vec<ipcp_value<tree> *, 8> val_seeds;
> +
> + for (src_val = src_lat->values; src_val; src_val = src_val->next)
> + {
> + /* Now we do not use self-recursively generated value as propagation
> + source, this is absolutely conservative, but could avoid explosion
> + of lattice's value space, especially when one recursive function
> + calls another recursive. */
> + if (self_recursively_generated_p (src_val))
> + {
> + /* If the lattice has already been propagated for the call site,
> + no need to do that again. */
> + for (ipcp_value_source<tree> *s = src_val->sources; s;
> + s = s->next)
> + if (s->cs == cs)
> + return dest_lat->set_contains_variable ();
> + }
> + else
> + val_seeds.safe_push (src_val);
> + }
> +
> + int clone_copies = PARAM_VALUE (PARAM_IPA_CP_MAX_RECURSION_COPIES);
> + int i;
> +
> + /* Recursively generate lattice values with a limited count. */
> + FOR_EACH_VEC_ELT (val_seeds, i, src_val)
> + {
> + for (int j = 1; j < clone_copies; j++)
> + {
> + tree cstval = ipa_get_jf_pass_through_result (jfunc,
> + src_val->value,
> + parm_type);
> + if (!cstval)
> + break;
> +
> + /* Try to place the new lattice value after its source, which
> + can decrease iterations of propagate stage. */
> + ret |= dest_lat->add_value (cstval, cs, src_val, src_idx,
> + -1, &src_val, true);
> + gcc_checking_assert (src_val);
> + }
> + }
> + ret |= dest_lat->set_contains_variable ();
> + }
> else
> for (src_val = src_lat->values; src_val; src_val = src_val->next)
> {
> + /* Now we do not use self-recursively generated value as propagation
> + source, otherwise it is easy to make value space of normal lattice
> + overflow. */
> + if (self_recursively_generated_p (src_val))
> + continue;
> +
> tree cstval = ipa_get_jf_pass_through_result (jfunc, src_val->value,
> parm_type);
> -
> if (cstval)
> ret |= dest_lat->add_value (cstval, cs, src_val, src_idx);
> else
> @@ -3635,6 +3754,9 @@ get_info_about_necessary_edges (ipcp_value<valtype> *val, cgraph_node *dest,
> hot |= cs->maybe_hot_p ();
> if (cs->caller != dest)
> non_self_recursive = true;
> + else if (src->val)
> + gcc_assert (values_equal_for_ipcp_p (src->val->value,
> + val->value));
> }
> cs = get_next_cgraph_edge_clone (cs);
> }
> diff --git a/gcc/params.def b/gcc/params.def
> index 322c37f8b96..3e242e09f01 100644
> --- a/gcc/params.def
> +++ b/gcc/params.def
> @@ -1135,6 +1135,11 @@ DEFPARAM (PARAM_IPA_CP_RECURSION_PENALTY,
> "are evaluated for cloning.",
> 40, 0, 100)
>
> +DEFPARAM (PARAM_IPA_CP_MAX_RECURSION_COPIES,
> + "ipa-cp-max-recursion-copies",
> + "Maximum function versioning copies for a self-recursive function.",
> + 8, 0, 0)
> +
> DEFPARAM (PARAM_IPA_CP_SINGLE_CALL_PENALTY,
> "ipa-cp-single-call-penalty",
> "Percentage penalty functions containing a single call to another "
> diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-clone-2.c b/gcc/testsuite/gcc.dg/ipa/ipa-clone-2.c
> new file mode 100644
> index 00000000000..7c1c01047c6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/ipa/ipa-clone-2.c
> @@ -0,0 +1,47 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fdump-ipa-cp-details -fno-early-inlining" } */
> +
> +int fn();
> +
> +int data[100];
> +
> +int recur_fn (int i)
> +{
> + int j;
> +
> + if (i == 6)
> + {
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + fn();
> + return 10;
> + }
> +
> + data[i] = i;
> +
> + for (j = 0; j < 100; j++)
> + recur_fn (i + 1);
> +
> + return i;
> +}
> +
> +int main ()
> +{
> + int i;
> +
> + for (i = 0; i < 100; i++)
> + recur_fn (1) + recur_fn (-5);
> +
> + return 1;
> +}
> +
> +/* { dg-final { scan-ipa-dump-times "Creating a specialized node of recur_fn/\[0-9\]*\\." 12 "cp" } } */
>
next prev parent reply other threads:[~2019-10-18 2:03 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-17 8:35 Feng Xue OS
2019-10-18 2:12 ` luoxhu [this message]
2019-10-18 5:15 ` Feng Xue OS
2019-10-24 6:17 ` luoxhu
2019-10-24 6:57 ` Feng Xue OS
2019-11-14 13:29 ` Jan Hubicka
2019-11-14 15:16 ` Feng Xue OS
2019-11-14 15:28 ` Jan Hubicka
2019-11-14 16:02 ` Feng Xue OS
2019-11-14 20:50 ` Jan Hubicka
2019-11-15 15:37 ` Feng Xue OS
2019-11-22 5:26 ` Ping: " Feng Xue OS
2019-11-22 11:34 ` Martin Jambor
2019-11-25 14:17 ` Feng Xue OS
2019-11-27 2:07 ` Feng Xue OS
2019-11-27 15:12 ` Jan Hubicka
2019-11-28 3:48 ` Feng Xue OS
2019-12-01 23:20 ` Jeff Law
2019-12-02 7:07 ` Feng Xue OS
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ca6dc810-560e-47aa-5f99-54289ce74369@linux.ibm.com \
--to=luoxhu@linux.ibm.com \
--cc=fxue@os.amperecomputing.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=hubicka@ucw.cz \
--cc=mjambor@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).