Hi Segher, on 2021/12/6 下午9:06, Segher Boessenkool wrote: > On Fri, Dec 03, 2021 at 11:46:53AM +0800, Kewen.Lin wrote: >>>> This patch is to fix the inconsistent behaviors for non-LTO mode >>>> and LTO mode. As Martin pointed out, currently the function >>>> rs6000_can_inline_p simply makes it inlinable if callee_tree is >>>> NULL, but it's wrong, we should use the command line options >>>> from target_option_default_node as default. >>> >>> This is not documented. >> >> Yeah, but according to the document for the target attribute [1], >> "Multiple target back ends implement the target attribute to specify >> that a function is to be compiled with different target options than >> specified on the command line. The original target command-line options >> are ignored. ", it seems to say the function without any target >> attribute/pragma will be compiled with target options specified on the >> command line. I think it's a normal expectation for users. > > The whole target_option_default_node is not documented, I meant. > Ah, you meant that. Yeah, it can be improved separately. >> Excepting for the inconsistent behaviors between LTO and non-LTO, >> it can also make the below case different. >> >> // Option: -O2 -mcpu=power8 -mhtm >> >> int foo(int *b) { >> *b += __builtin_ttest(); >> return *b; >> } >> >> __attribute__((target("no-htm"))) int bar(int *a) { >> *a = foo(a); >> return 0; >> } >> >> Without this fix, bar inlines foo but get the error as below: >> >> In function ‘foo’, >> inlined from ‘bar’ at test.c:8:8: >> test.c:3:9: error: ‘__builtin_ttest’ requires the ‘-mhtm’ option >> 3 | *b += __builtin_ttest(); >> | ^~~~~~~~~~~~~~~~~ >> >> Since bar doesn't support foo's required HTM feature. >> >> With this fix, bar doesn't inline foo and the case gets compiled well. > > And if you inline something no-htm into something that allows htm? That > should work fine, be allowed just fine. > Thanks for helping to answer this, Peter! >>>> It also replaces >>>> rs6000_isa_flags with the one from target_option_default_node >>>> when caller_tree is NULL as rs6000_isa_flags could probably >>>> change since initialization. >>> >>> Is this enough then? Are there no other places that use >>> rs6000_isa_flags? Is it correct for us to have that variable at all >>> anymore? Etc. >> >> Good question, I think the answer is yes. I digged into it and found >> the current target option save/restore scheme would keep rs6000_isa_flags >> to be the same as the one saved in target_option_default_node for the context >> of inlining. So I put one assertion as below: >> >> if (!caller_tree) { >> caller_tree = target_option_default_node; >> gcc_assert (rs6000_isa_flags >> == TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags); >> } >> >> And kicked off regression testings on both BE and LE, the result showed >> only one same failure, which seems to be a latent bug. I've filed PR103515 >> for it. > > Ah cool :-) Please send a patch to add that asser then (well, once we > can bootstrap with it ;-) ) > OK. >> This bug also indicates it's better to use target_option_default_node >> rather than rs6000_isa_flags when the caller_tree is NULL. Since >> the target_option_default_node is straightforward and doesn't rely >> on the assumption that rs6000_isa_flags would be kept as the default >> one correctly, it doesn't suffer from bugs like PR103515. > > So we should not ever use rs6000_isa_flags anymore? > If the question is for the scope in function rs6000_can_inline_p, yes. Before this patch, the only use of rs6000_isa_flags is: /* If the caller has option attributes, then use them. Otherwise, use the command line options. */ if (caller_tree) caller_isa = TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags; else caller_isa = rs6000_isa_flags; This patch changes this part to be like (logically): if (caller_tree) caller_isa = TREE_TARGET_OPTION (caller_tree)->x_rs6000_isa_flags; else caller_isa = TREE_TARGET_OPTION (target_option_default_node)->x_rs6000_isa_flags; If the question is for the others out of function rs6000_can_inline_p, no. The rs6000_isa_flags holds the flags for the current context of either top level or some given function decl. As function rs6000_set_current_function shows, we already consider target_option_default_node for the case that there is NULL DECL_FUNCTION_SPECIFIC_TARGET for one decl. >>> The fusion ones are obvious. The other two are not. Please put those >>> two more obviously separate (not somewhere in the sea of fusion >>> options), and some comment wouldn't hurt either. You can make it >>> separate statements even, make it really stand out. >>> >> >> Fixed by splitting it into: fusion_options_mask and optimization_options_mask. >> I'm not happy for the later name, but poor to get a better in mind. Do you >> have any suggestions on the name and words? > > You dont have to split it into variables, as you found out then you get > the usual naming problem. But you can just split it in the code: > > if (important_condition > || another_important_one > /* comment explaining things */ > || bla1 || bla2 || bla3 || bla4 || bla5) > Got it, thanks. Removed those two variables and updated it to: /* Some features can be tolerated for always inlines. */ unsigned HOST_WIDE_INT always_inline_safe_mask /* Fusion option masks. */ = OPTION_MASK_P8_FUSION | OPTION_MASK_P10_FUSION | OPTION_MASK_P8_FUSION_SIGN | OPTION_MASK_P10_FUSION | OPTION_MASK_P10_FUSION_LD_CMPI | OPTION_MASK_P10_FUSION_2LOGICAL | OPTION_MASK_P10_FUSION_LOGADD | OPTION_MASK_P10_FUSION_ADDLOG | OPTION_MASK_P10_FUSION_2ADD /* Like fusion, some option masks which are just for optimization. */ | OPTION_MASK_SAVE_TOC_INDIRECT | OPTION_MASK_PCREL_OPT; >>> Why are there OPTION_MASKs for separate P10 fusion types here, as well as >>> MASK_P10_FUSION? >> >> Mike helped to explain the history, I've updated all of them using OPTION_MASK_ >> to avoid potential confusion. > > That is one thing, sure, but why are both needed? Both the "main" flag, > and the "details" flags. Because they are taken as independent bits in the below checking with bitwise manipulation, like: (caller_isa & callee_isa) == callee_isa) > > (The latter should soon go away btw). > > > Segher > The updated patch is attached, just changing always_inline_safe_mask should be no any functional changes comparing to the previous version. Does it look good to you? BR, Kewen --- gcc/ChangeLog: PR ipa/102059 * config/rs6000/rs6000.c (rs6000_can_inline_p): Adjust with target_option_default_node and consider always_inline_safe flags. gcc/testsuite/ChangeLog: PR ipa/102059 * gcc.target/powerpc/pr102059-1.c: New test. * gcc.target/powerpc/pr102059-2.c: New test. * gcc.target/powerpc/pr102059-3.c: New test. * gcc.target/powerpc/pr102059-4.c: New test. * gcc.target/powerpc/pr102059-5.c: New test. -----