From: Kyrill Tkachov <kyrylo.tkachov@arm.com>
To: James Greenhalgh <james.greenhalgh@arm.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>,
Marcus Shawcroft <Marcus.Shawcroft@arm.com>,
Richard Earnshaw <Richard.Earnshaw@arm.com>
Subject: Re: [PATCH][AArch64][9/14] Implement TARGET_CAN_INLINE_P
Date: Thu, 23 Jul 2015 10:27:00 -0000 [thread overview]
Message-ID: <55B0BF30.2030007@arm.com> (raw)
In-Reply-To: <20150721160757.GA14953@arm.com>
[-- Attachment #1: Type: text/plain, Size: 8404 bytes --]
On 21/07/15 17:07, James Greenhalgh wrote:
> On Thu, Jul 16, 2015 at 04:21:02PM +0100, Kyrill Tkachov wrote:
>> Hi all,
>>
>> This patch implements the target-specific inlining rules.
>> The basic philosophy is that we want to definitely reject inlining if the callee's architecture
>> is not a subset, feature-wise, of the caller's.
>>
>> Beyond that, we want to allow inlining if the callee is always_inline.
>> If it's not, we reject inlining if the TargetSave options don't match up
>> in a way that's described in the comments in the patch.
>>
>> Generally, we try to allow as much inlining as possible for the benefit of LTO.
>> However, if the architectural features of the callee are not a subset of the features
>> of the caller, then we must reject inlining. For example, inlining a function with 'simd'
>> into a function without 'simd' is not allowed.
>>
>> Also, inlining a non-strict-align function into a strict-align function is not allowed.
>> These two restrictions apply even when the callee is tagged with always_inline because they
>> can affect the correctness of the program.
>>
>> Beyond that, we reject inlining only if the user has explicitly specified attributes/options for
>> both the caller and the callee and they don't match up.
>>
>> An exception to that are the tuning CPUs. We want to allow inlining even when the tuning CPUs don't match.
>>
>> Bootstrapped and tested on aarch64.
>>
>> Ok for trunk?
> Comments below.
>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 0a6ed70..34cd986 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -8486,6 +8486,113 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
>> return ret;
>> }
>>
>> +/* Helper for aarch64_can_inline_p. In the case where CALLER and CALLEE are
>> + tri-bool options (yes, no, don't care) and the default value is
>> + DEF, determine whether to reject inlining. */
>> +
>> +static bool
>> +aarch64_reject_inlining (int caller, int callee, int dont_care, int def)
>> +{
>> + /* If both caller and callee care about the value then reject inlining
>> + if they don't match up. */
>> + if (caller != dont_care && callee != dont_care && caller != callee)
>> + return true;
>> +
>> + /* If caller doesn't care then make sure that the default agrees
>> + with the callee. */
>> + if (caller == dont_care && callee != dont_care && callee != def)
>> + return true;
> I don't like boolean functions which return the negative of how they
> will be used, so how about flipping the sense of this function to be
> positive and cleaning up the logical cases to be more explicit:
>
> static bool
> aarch64_tribool_values_ok_for_inlining_p (int caller, int callee,
> int dont_care, int def)
> {
> /* If the callee doesn't care, always allow inlining. */
> if (callee == dont_care)
> return true;
>
> /* If the caller doesn't care, always allow inlining. */
> if (caller == dont_care)
> return true;
>
> /* Otherwise, allow inlining if either the callee and caller values
> agree, or if the callee is using the default value.
> return (callee == caller || callee == def)
> }
>
>> +/* Implement TARGET_CAN_INLINE_P. Decide whether it is valid
>> + to inline CALLE into CALLER based on target-specific info.
> s/CALLE/CALLEE/
>
>> + Make sure that the caller and callee have compatible architectural
>> + features. Then go through the other possible target attributes
>> + and. Try not to reject always_inline callees unless they are
> Typo: and... what?
>
>> + incompatible architecturally. */
>> +
>> +static bool
>> +aarch64_can_inline_p (tree caller, tree callee)
>> +{
>> + bool ret = false;
>> + tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
>> + tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
>> +
>> + /* If callee has no option attributes, then it is ok to inline. */
>> + if (!callee_tree)
>> + ret = true;
> Just return true, drop the else and clean up the control flow, particularly
> as the rest of the code below is going to fall through to each condition
> in turn.
>
>> + else
>> + {
>> + struct cl_target_option *caller_opts
>> + = TREE_TARGET_OPTION (caller_tree ? caller_tree
>> + : target_option_default_node);
>> + struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
>> +
>> +
>> + /* Callee's ISA flags should be a subset of the caller's. */
>> + if ((caller_opts->x_aarch64_isa_flags & callee_opts->x_aarch64_isa_flags)
>> + == callee_opts->x_aarch64_isa_flags)
>> + ret = true;
> This is the only place that ret can become true. Why not switch the sense
> of the comparison (!= callee_opts...) and simply return false here.
>
> Then each of the cases below can just return false as they go and the
> fallthrough case can return true.
>
> That seems much more readable to me!
Thanks, I've implemented the suggestions.
Re-bootstrapped and tested on aarch64.
How's this?
Thanks,
Kyrill
2015-07-23 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
* config/aarch64/aarch64.c (aarch64_tribools_ok_for_inlining_p):
New function.
(aarch64_can_inline_p): Likewise.
(TARGET_CAN_INLINE_P): Define.
>
> Thanks,
> James
>
>
>> +
>> + /* Allow non-strict aligned functions inlining into strict
>> + aligned ones. */
>> + if ((TARGET_STRICT_ALIGN_P (caller_opts->x_target_flags)
>> + != TARGET_STRICT_ALIGN_P (callee_opts->x_target_flags))
>> + && !(!TARGET_STRICT_ALIGN_P (callee_opts->x_target_flags)
>> + && TARGET_STRICT_ALIGN_P (caller_opts->x_target_flags)))
>> + ret = false;
>> +
>> + bool always_inline = lookup_attribute ("always_inline",
>> + DECL_ATTRIBUTES (callee));
>> +
>> + /* If the architectural features match up and the callee is always_inline
>> + then the other attributes don't matter. */
>> + if (always_inline)
>> + return ret;
>> +
>> + if (caller_opts->x_aarch64_cmodel_var
>> + != callee_opts->x_aarch64_cmodel_var)
>> + ret = false;
>> +
>> + if (caller_opts->x_aarch64_tls_dialect
>> + != callee_opts->x_aarch64_tls_dialect)
>> + ret = false;
>> +
>> +
>> + /* Honour explicit requests to workaround errata. */
>> + if (aarch64_reject_inlining (caller_opts->x_aarch64_fix_a53_err835769,
>> + callee_opts->x_aarch64_fix_a53_err835769,
>> + 2, TARGET_FIX_ERR_A53_835769_DEFAULT))
>> + ret = false;
>> +
>> + /* If the user explicitly specified -momit-leaf-frame-pointer for the
>> + caller and calle and they don't match up, reject inlining. */
>> + if (aarch64_reject_inlining (caller_opts->x_flag_omit_leaf_frame_pointer,
>> + callee_opts->x_flag_omit_leaf_frame_pointer,
>> + 2, 1))
>> + ret = false;
>> +
>> + /* If the callee has specific tuning overrides, respect them. */
>> + if (callee_opts->x_aarch64_override_tune_string != NULL
>> + && caller_opts->x_aarch64_override_tune_string == NULL)
>> + ret = false;
>> +
>> + /* If the user specified tuning override strings for the
>> + caller and callee and they don't match up, reject inlining.
>> + We just do a string compare here, we don't analyze the meaning
>> + of the string, as it would be too costly for little gain. */
>> + if (callee_opts->x_aarch64_override_tune_string
>> + && caller_opts->x_aarch64_override_tune_string
>> + && (strcmp (callee_opts->x_aarch64_override_tune_string,
>> + caller_opts->x_aarch64_override_tune_string) != 0))
>> + ret = false;
>> + }
>> +
>> + return ret;
>> +}
>> +
>> /* Return true if SYMBOL_REF X binds locally. */
>>
>> static bool
>> @@ -12963,6 +13070,9 @@ aarch64_unspec_may_trap_p (const_rtx x, unsigned flags)
>> #undef TARGET_OPTION_VALID_ATTRIBUTE_P
>> #define TARGET_OPTION_VALID_ATTRIBUTE_P aarch64_option_valid_attribute_p
>>
>> +#undef TARGET_CAN_INLINE_P
>> +#define TARGET_CAN_INLINE_P aarch64_can_inline_p
>> +
>> #undef TARGET_PASS_BY_REFERENCE
>> #define TARGET_PASS_BY_REFERENCE aarch64_pass_by_reference
>>
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: aarch64-attrs-9.patch --]
[-- Type: text/x-patch; name=aarch64-attrs-9.patch, Size: 5021 bytes --]
commit cc49230c0ed153759e6f617c6251d48f9a1814b2
Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Date: Thu May 14 12:00:07 2015 +0100
[AArch64][9/N] Implement TARGET_CAN_INLINE_P
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ba82ca9..ff584fe 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8486,6 +8486,113 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
return ret;
}
+/* Helper for aarch64_can_inline_p. In the case where CALLER and CALLEE are
+ tri-bool options (yes, no, don't care) and the default value is
+ DEF, determine whether to reject inlining. */
+
+static bool
+aarch64_tribools_ok_for_inlining_p (int caller, int callee,
+ int dont_care, int def)
+{
+ /* If the callee doesn't care, always allow inlining. */
+ if (callee == dont_care)
+ return true;
+
+ /* If the caller doesn't care, always allow inlining. */
+ if (caller == dont_care)
+ return true;
+
+ /* Otherwise, allow inlining if either the callee and caller values
+ agree, or if the callee is using the default value. */
+ return (callee == caller || callee == def);
+}
+
+/* Implement TARGET_CAN_INLINE_P. Decide whether it is valid
+ to inline CALLEE into CALLER based on target-specific info.
+ Make sure that the caller and callee have compatible architectural
+ features. Then go through the other possible target attributes
+ and see if they can block inlining. Try not to reject always_inline
+ callees unless they are incompatible architecturally. */
+
+static bool
+aarch64_can_inline_p (tree caller, tree callee)
+{
+ tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
+ tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
+
+ /* If callee has no option attributes, then it is ok to inline. */
+ if (!callee_tree)
+ return true;
+
+ struct cl_target_option *caller_opts
+ = TREE_TARGET_OPTION (caller_tree ? caller_tree
+ : target_option_default_node);
+
+ struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
+
+
+ /* Callee's ISA flags should be a subset of the caller's. */
+ if ((caller_opts->x_aarch64_isa_flags & callee_opts->x_aarch64_isa_flags)
+ != callee_opts->x_aarch64_isa_flags)
+ return false;
+
+ /* Allow non-strict aligned functions inlining into strict
+ aligned ones. */
+ if ((TARGET_STRICT_ALIGN_P (caller_opts->x_target_flags)
+ != TARGET_STRICT_ALIGN_P (callee_opts->x_target_flags))
+ && !(!TARGET_STRICT_ALIGN_P (callee_opts->x_target_flags)
+ && TARGET_STRICT_ALIGN_P (caller_opts->x_target_flags)))
+ return false;
+
+ bool always_inline = lookup_attribute ("always_inline",
+ DECL_ATTRIBUTES (callee));
+
+ /* If the architectural features match up and the callee is always_inline
+ then the other attributes don't matter. */
+ if (always_inline)
+ return true;
+
+ if (caller_opts->x_aarch64_cmodel_var
+ != callee_opts->x_aarch64_cmodel_var)
+ return false;
+
+ if (caller_opts->x_aarch64_tls_dialect
+ != callee_opts->x_aarch64_tls_dialect)
+ return false;
+
+ /* Honour explicit requests to workaround errata. */
+ if (!aarch64_tribools_ok_for_inlining_p (
+ caller_opts->x_aarch64_fix_a53_err835769,
+ callee_opts->x_aarch64_fix_a53_err835769,
+ 2, TARGET_FIX_ERR_A53_835769_DEFAULT))
+ return false;
+
+ /* If the user explicitly specified -momit-leaf-frame-pointer for the
+ caller and calle and they don't match up, reject inlining. */
+ if (!aarch64_tribools_ok_for_inlining_p (
+ caller_opts->x_flag_omit_leaf_frame_pointer,
+ callee_opts->x_flag_omit_leaf_frame_pointer,
+ 2, 1))
+ return false;
+
+ /* If the callee has specific tuning overrides, respect them. */
+ if (callee_opts->x_aarch64_override_tune_string != NULL
+ && caller_opts->x_aarch64_override_tune_string == NULL)
+ return false;
+
+ /* If the user specified tuning override strings for the
+ caller and callee and they don't match up, reject inlining.
+ We just do a string compare here, we don't analyze the meaning
+ of the string, as it would be too costly for little gain. */
+ if (callee_opts->x_aarch64_override_tune_string
+ && caller_opts->x_aarch64_override_tune_string
+ && (strcmp (callee_opts->x_aarch64_override_tune_string,
+ caller_opts->x_aarch64_override_tune_string) != 0))
+ return false;
+
+ return true;
+}
+
/* Return true if SYMBOL_REF X binds locally. */
static bool
@@ -12850,6 +12957,9 @@ aarch64_unspec_may_trap_p (const_rtx x, unsigned flags)
#undef TARGET_CAN_ELIMINATE
#define TARGET_CAN_ELIMINATE aarch64_can_eliminate
+#undef TARGET_CAN_INLINE_P
+#define TARGET_CAN_INLINE_P aarch64_can_inline_p
+
#undef TARGET_CANNOT_FORCE_CONST_MEM
#define TARGET_CANNOT_FORCE_CONST_MEM aarch64_cannot_force_const_mem
next prev parent reply other threads:[~2015-07-23 10:17 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-16 15:21 Kyrill Tkachov
2015-07-21 16:27 ` James Greenhalgh
2015-07-23 10:27 ` Kyrill Tkachov [this message]
2015-08-03 11:04 ` James Greenhalgh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55B0BF30.2030007@arm.com \
--to=kyrylo.tkachov@arm.com \
--cc=Marcus.Shawcroft@arm.com \
--cc=Richard.Earnshaw@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=james.greenhalgh@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).