From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16]) by sourceware.org (Postfix) with ESMTPS id A967A3858408 for ; Fri, 1 Jul 2022 09:20:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A967A3858408 Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id A33AF289AEE; Fri, 1 Jul 2022 11:20:57 +0200 (CEST) Date: Fri, 1 Jul 2022 11:20:57 +0200 From: Jan Hubicka To: Richard Biener Cc: "Kewen.Lin" , Jakub Jelinek , GCC Patches , mliska@suse.cz Subject: Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459] Message-ID: References: <75d4b8f9-5ed2-de4c-d606-510a66714c9a@linux.ibm.com> <05835d48-50e9-be2c-2b60-1b42e9c6e5d6@linux.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, GIT_PATCH_0, KAM_SHORT, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Jul 2022 09:21:02 -0000 > On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin wrote: > > > > Hi, > > > > Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html > > > > BR, > > Kewen > > > > on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote: > > > Hi, > > > > > > PR105459 exposes one issue in inline_call handling that when it > > > decides to copy FP flags from callee to caller and rebuild the > > > optimization node for caller fndecl, it's possible that the target > > > option node is also necessary to be rebuilt. Without updating > > > target option node early, it can make nodes share the same target > > > option node wrongly, later when we want to unshare it somewhere > > > (like in target hook) it can get unexpected results, like ICE on > > > uninitialized secondary member of target globals exposed in this PR. > > I think that > > /* Reload global optimization flags. */ > if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) > set_cfun (cfun, true); > > is supposed to do that via ix86_set_current_function which will eventually > re-build the target optimization node exactly for this reason. > > But with LTO we arrive here during WPA time only and there cfun is NULL > (and so is DECL_STRUCT_FUNCTION (to->decl)), so the target doesn't > get the chance to fix things up here. I see this logic was added by Martin in 2017 and it indeed looks bit odd, but I suppose it is intended primarily for early inliner where cfun is non-NULL and we really need to update global state. > > Now, it should be fine to delay this fixup until we set the cfun at LTRANS > time but there we run into > > if (old_tree != new_tree) > { > cl_target_option_restore (&global_options, &global_options_set, > TREE_TARGET_OPTION (new_tree)); > ... > } > else if (flag_unsafe_math_optimizations > != TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations > || (flag_excess_precision > != TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision)) > { > ... FIXUP! ... > > and old_tree != new_tree disables the fixup. > > When we refactor the above to always consider the FP flag change (so apply it > lazily), then this fixes the testcase in the PR as well. Thus something like > the attached. > > Ideally this stuff would be refactored to a target hook that can work without > the set_cfun, also working towards merging the target and optimization node > since they have to be kept in sync ... > > I think your proposed patch makes another variant through the maze to > do something at WPA time but that makes it all even more complicated :/ > > Sorry for the delay btw. > > Folks - any other opinions? Your patch looks reasonable to me... Indeed working on nodes directly would be nicer, but that means bigger surgery in the optimization handling right? Thanks, Honza > > Thanks, > Richard. > > > > Commit r12-3721 makes it get exact fp_expression info and causes > > > more optimization chances then exposes this issue. Commit r11-5855 > > > introduces two target options to shadow flag_excess_precision and > > > flag_unsafe_math_optimizations and shows the need to rebuild target > > > node in inline_call when optimization node changes. > > > > > > As commented in PR105459, I tried to postpone init_function_start > > > in cgraph_node::expand, but abandoned it since I thought it just > > > concealed the issue. And I also tried to adjust the target node > > > when current function switching, but failed since we get the NULL > > > cfun and fndecl in WPA phase. > > > > > > Bootstrapped and regtested on x86_64-redhat-linux, powerpc64-linux-gnu > > > P8 and powerpc64le-linux-gnu P9. > > > > > > Any thoughts? Is it OK for trunk? > > > > > > BR, > > > Kewen > > > ----- > > > > > > PR tree-optimization/105459 > > > > > > gcc/ChangeLog: > > > > > > * ipa-inline-transform.cc (inline_call): Rebuild target option node > > > once optimization node gets rebuilt. > > > > > > gcc/testsuite/ChangeLog: > > > > > > * gcc.dg/lto/pr105459_0.c: New test. > > > --- > > > gcc/ipa-inline-transform.cc | 50 +++++++++++++++++++++++++-- > > > gcc/testsuite/gcc.dg/lto/pr105459_0.c | 35 +++++++++++++++++++ > > > 2 files changed, 83 insertions(+), 2 deletions(-) > > > create mode 100644 gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > > > > diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc > > > index 07288e57c73..edba58377f4 100644 > > > --- a/gcc/ipa-inline-transform.cc > > > +++ b/gcc/ipa-inline-transform.cc > > > @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see > > > #include "ipa-modref.h" > > > #include "symtab-thunks.h" > > > #include "symtab-clones.h" > > > +#include "target.h" > > > > > > int ncalls_inlined; > > > int nfunctions_inlined; > > > @@ -469,8 +470,53 @@ inline_call (struct cgraph_edge *e, bool update_original, > > > } > > > > > > /* Reload global optimization flags. */ > > > - if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) > > > - set_cfun (cfun, true); > > > + if (reload_optimization_node) > > > + { > > > + /* Only need to check and update target option node > > > + when target_option_default_node is not NULL. */ > > > + if (target_option_default_node) > > > + { > > > + /* Save the current context for optimization and target option > > > + node. */ > > > + tree old_optimize > > > + = build_optimization_node (&global_options, &global_options_set); > > > + tree old_target_opt > > > + = build_target_option_node (&global_options, &global_options_set); > > > + > > > + /* Restore optimization with new optimizatin node. */ > > > + tree new_optimize = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl); > > > + if (old_optimize != new_optimize) > > > + cl_optimization_restore (&global_options, &global_options_set, > > > + TREE_OPTIMIZATION (new_optimize)); > > > + > > > + /* Restore target option with the one from caller fndecl. */ > > > + tree cur_target_opt = DECL_FUNCTION_SPECIFIC_TARGET (to->decl); > > > + if (!cur_target_opt) > > > + cur_target_opt = target_option_default_node; > > > + cl_target_option_restore (&global_options, &global_options_set, > > > + TREE_TARGET_OPTION (cur_target_opt)); > > > + > > > + /* Update target option as optimization changes. */ > > > + targetm.target_option.override (); > > > + > > > + /* Rebuild target option node for caller fndecl and replace > > > + with it if the node changes. */ > > > + tree new_target_opt > > > + = build_target_option_node (&global_options, &global_options_set); > > > + if (cur_target_opt != new_target_opt) > > > + DECL_FUNCTION_SPECIFIC_TARGET (to->decl) = new_target_opt; > > > + > > > + /* Restore the context with previous saved nodes. */ > > > + if (old_optimize != new_optimize) > > > + cl_optimization_restore (&global_options, &global_options_set, > > > + TREE_OPTIMIZATION (old_optimize)); > > > + cl_target_option_restore (&global_options, &global_options_set, > > > + TREE_TARGET_OPTION (old_target_opt)); > > > + } > > > + > > > + if (DECL_STRUCT_FUNCTION (to->decl) == cfun) > > > + set_cfun (cfun, true); > > > + } > > > > > > /* If aliases are involved, redirect edge to the actual destination and > > > possibly remove the aliases. */ > > > diff --git a/gcc/testsuite/gcc.dg/lto/pr105459_0.c b/gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > new file mode 100644 > > > index 00000000000..c799e6ef23d > > > --- /dev/null > > > +++ b/gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > @@ -0,0 +1,35 @@ > > > +/* { dg-lto-do link } */ > > > +/* { dg-lto-options { { -flto -O1 } } } */ > > > + > > > +double m; > > > +int n; > > > + > > > +__attribute__ ((optimize ("-funsafe-math-optimizations"))) > > > +void > > > +bar (int x) > > > +{ > > > + n = x; > > > + m = n; > > > +} > > > + > > > +__attribute__ ((flatten)) > > > +void > > > +foo (int x) > > > +{ > > > + bar (x); > > > +} > > > + > > > +void > > > +quux (void) > > > +{ > > > + ++n; > > > +} > > > + > > > +int > > > +main (void) > > > +{ > > > + foo (0); > > > + quux (); > > > + > > > + return 0; > > > +} > > > -- > > > 2.27.0