On 21/07/15 17:07, James Greenhalgh wrote: > On Thu, Jul 16, 2015 at 04:21:02PM +0100, Kyrill Tkachov wrote: >> Hi all, >> >> This patch implements the target-specific inlining rules. >> The basic philosophy is that we want to definitely reject inlining if the callee's architecture >> is not a subset, feature-wise, of the caller's. >> >> Beyond that, we want to allow inlining if the callee is always_inline. >> If it's not, we reject inlining if the TargetSave options don't match up >> in a way that's described in the comments in the patch. >> >> Generally, we try to allow as much inlining as possible for the benefit of LTO. >> However, if the architectural features of the callee are not a subset of the features >> of the caller, then we must reject inlining. For example, inlining a function with 'simd' >> into a function without 'simd' is not allowed. >> >> Also, inlining a non-strict-align function into a strict-align function is not allowed. >> These two restrictions apply even when the callee is tagged with always_inline because they >> can affect the correctness of the program. >> >> Beyond that, we reject inlining only if the user has explicitly specified attributes/options for >> both the caller and the callee and they don't match up. >> >> An exception to that are the tuning CPUs. We want to allow inlining even when the tuning CPUs don't match. >> >> Bootstrapped and tested on aarch64. >> >> Ok for trunk? > Comments below. > >> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >> index 0a6ed70..34cd986 100644 >> --- a/gcc/config/aarch64/aarch64.c >> +++ b/gcc/config/aarch64/aarch64.c >> @@ -8486,6 +8486,113 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int) >> return ret; >> } >> >> +/* Helper for aarch64_can_inline_p. In the case where CALLER and CALLEE are >> + tri-bool options (yes, no, don't care) and the default value is >> + DEF, determine whether to reject inlining. */ >> + >> +static bool >> +aarch64_reject_inlining (int caller, int callee, int dont_care, int def) >> +{ >> + /* If both caller and callee care about the value then reject inlining >> + if they don't match up. */ >> + if (caller != dont_care && callee != dont_care && caller != callee) >> + return true; >> + >> + /* If caller doesn't care then make sure that the default agrees >> + with the callee. */ >> + if (caller == dont_care && callee != dont_care && callee != def) >> + return true; > I don't like boolean functions which return the negative of how they > will be used, so how about flipping the sense of this function to be > positive and cleaning up the logical cases to be more explicit: > > static bool > aarch64_tribool_values_ok_for_inlining_p (int caller, int callee, > int dont_care, int def) > { > /* If the callee doesn't care, always allow inlining. */ > if (callee == dont_care) > return true; > > /* If the caller doesn't care, always allow inlining. */ > if (caller == dont_care) > return true; > > /* Otherwise, allow inlining if either the callee and caller values > agree, or if the callee is using the default value. > return (callee == caller || callee == def) > } > >> +/* Implement TARGET_CAN_INLINE_P. Decide whether it is valid >> + to inline CALLE into CALLER based on target-specific info. > s/CALLE/CALLEE/ > >> + Make sure that the caller and callee have compatible architectural >> + features. Then go through the other possible target attributes >> + and. Try not to reject always_inline callees unless they are > Typo: and... what? > >> + incompatible architecturally. */ >> + >> +static bool >> +aarch64_can_inline_p (tree caller, tree callee) >> +{ >> + bool ret = false; >> + tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller); >> + tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); >> + >> + /* If callee has no option attributes, then it is ok to inline. */ >> + if (!callee_tree) >> + ret = true; > Just return true, drop the else and clean up the control flow, particularly > as the rest of the code below is going to fall through to each condition > in turn. > >> + else >> + { >> + struct cl_target_option *caller_opts >> + = TREE_TARGET_OPTION (caller_tree ? caller_tree >> + : target_option_default_node); >> + struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree); >> + >> + >> + /* Callee's ISA flags should be a subset of the caller's. */ >> + if ((caller_opts->x_aarch64_isa_flags & callee_opts->x_aarch64_isa_flags) >> + == callee_opts->x_aarch64_isa_flags) >> + ret = true; > This is the only place that ret can become true. Why not switch the sense > of the comparison (!= callee_opts...) and simply return false here. > > Then each of the cases below can just return false as they go and the > fallthrough case can return true. > > That seems much more readable to me! Thanks, I've implemented the suggestions. Re-bootstrapped and tested on aarch64. How's this? Thanks, Kyrill 2015-07-23 Kyrylo Tkachov * config/aarch64/aarch64.c (aarch64_tribools_ok_for_inlining_p): New function. (aarch64_can_inline_p): Likewise. (TARGET_CAN_INLINE_P): Define. > > Thanks, > James > > >> + >> + /* Allow non-strict aligned functions inlining into strict >> + aligned ones. */ >> + if ((TARGET_STRICT_ALIGN_P (caller_opts->x_target_flags) >> + != TARGET_STRICT_ALIGN_P (callee_opts->x_target_flags)) >> + && !(!TARGET_STRICT_ALIGN_P (callee_opts->x_target_flags) >> + && TARGET_STRICT_ALIGN_P (caller_opts->x_target_flags))) >> + ret = false; >> + >> + bool always_inline = lookup_attribute ("always_inline", >> + DECL_ATTRIBUTES (callee)); >> + >> + /* If the architectural features match up and the callee is always_inline >> + then the other attributes don't matter. */ >> + if (always_inline) >> + return ret; >> + >> + if (caller_opts->x_aarch64_cmodel_var >> + != callee_opts->x_aarch64_cmodel_var) >> + ret = false; >> + >> + if (caller_opts->x_aarch64_tls_dialect >> + != callee_opts->x_aarch64_tls_dialect) >> + ret = false; >> + >> + >> + /* Honour explicit requests to workaround errata. */ >> + if (aarch64_reject_inlining (caller_opts->x_aarch64_fix_a53_err835769, >> + callee_opts->x_aarch64_fix_a53_err835769, >> + 2, TARGET_FIX_ERR_A53_835769_DEFAULT)) >> + ret = false; >> + >> + /* If the user explicitly specified -momit-leaf-frame-pointer for the >> + caller and calle and they don't match up, reject inlining. */ >> + if (aarch64_reject_inlining (caller_opts->x_flag_omit_leaf_frame_pointer, >> + callee_opts->x_flag_omit_leaf_frame_pointer, >> + 2, 1)) >> + ret = false; >> + >> + /* If the callee has specific tuning overrides, respect them. */ >> + if (callee_opts->x_aarch64_override_tune_string != NULL >> + && caller_opts->x_aarch64_override_tune_string == NULL) >> + ret = false; >> + >> + /* If the user specified tuning override strings for the >> + caller and callee and they don't match up, reject inlining. >> + We just do a string compare here, we don't analyze the meaning >> + of the string, as it would be too costly for little gain. */ >> + if (callee_opts->x_aarch64_override_tune_string >> + && caller_opts->x_aarch64_override_tune_string >> + && (strcmp (callee_opts->x_aarch64_override_tune_string, >> + caller_opts->x_aarch64_override_tune_string) != 0)) >> + ret = false; >> + } >> + >> + return ret; >> +} >> + >> /* Return true if SYMBOL_REF X binds locally. */ >> >> static bool >> @@ -12963,6 +13070,9 @@ aarch64_unspec_may_trap_p (const_rtx x, unsigned flags) >> #undef TARGET_OPTION_VALID_ATTRIBUTE_P >> #define TARGET_OPTION_VALID_ATTRIBUTE_P aarch64_option_valid_attribute_p >> >> +#undef TARGET_CAN_INLINE_P >> +#define TARGET_CAN_INLINE_P aarch64_can_inline_p >> + >> #undef TARGET_PASS_BY_REFERENCE >> #define TARGET_PASS_BY_REFERENCE aarch64_pass_by_reference >>