* [PATCH] inline: Rebuild target option node for caller [PR105459] @ 2022-06-06 6:20 Kewen.Lin 2022-06-23 2:03 ` PING^1 " Kewen.Lin 2022-07-08 11:37 ` Martin Liška 0 siblings, 2 replies; 8+ messages in thread From: Kewen.Lin @ 2022-06-06 6:20 UTC (permalink / raw) To: GCC Patches; +Cc: Richard Biener, Jakub Jelinek, Jan Hubicka, Martin Liška Hi, PR105459 exposes one issue in inline_call handling that when it decides to copy FP flags from callee to caller and rebuild the optimization node for caller fndecl, it's possible that the target option node is also necessary to be rebuilt. Without updating target option node early, it can make nodes share the same target option node wrongly, later when we want to unshare it somewhere (like in target hook) it can get unexpected results, like ICE on uninitialized secondary member of target globals exposed in this PR. Commit r12-3721 makes it get exact fp_expression info and causes more optimization chances then exposes this issue. Commit r11-5855 introduces two target options to shadow flag_excess_precision and flag_unsafe_math_optimizations and shows the need to rebuild target node in inline_call when optimization node changes. As commented in PR105459, I tried to postpone init_function_start in cgraph_node::expand, but abandoned it since I thought it just concealed the issue. And I also tried to adjust the target node when current function switching, but failed since we get the NULL cfun and fndecl in WPA phase. Bootstrapped and regtested on x86_64-redhat-linux, powerpc64-linux-gnu P8 and powerpc64le-linux-gnu P9. Any thoughts? Is it OK for trunk? BR, Kewen ----- PR tree-optimization/105459 gcc/ChangeLog: * ipa-inline-transform.cc (inline_call): Rebuild target option node once optimization node gets rebuilt. gcc/testsuite/ChangeLog: * gcc.dg/lto/pr105459_0.c: New test. --- gcc/ipa-inline-transform.cc | 50 +++++++++++++++++++++++++-- gcc/testsuite/gcc.dg/lto/pr105459_0.c | 35 +++++++++++++++++++ 2 files changed, 83 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/lto/pr105459_0.c diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc index 07288e57c73..edba58377f4 100644 --- a/gcc/ipa-inline-transform.cc +++ b/gcc/ipa-inline-transform.cc @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see #include "ipa-modref.h" #include "symtab-thunks.h" #include "symtab-clones.h" +#include "target.h" int ncalls_inlined; int nfunctions_inlined; @@ -469,8 +470,53 @@ inline_call (struct cgraph_edge *e, bool update_original, } /* Reload global optimization flags. */ - if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) - set_cfun (cfun, true); + if (reload_optimization_node) + { + /* Only need to check and update target option node + when target_option_default_node is not NULL. */ + if (target_option_default_node) + { + /* Save the current context for optimization and target option + node. */ + tree old_optimize + = build_optimization_node (&global_options, &global_options_set); + tree old_target_opt + = build_target_option_node (&global_options, &global_options_set); + + /* Restore optimization with new optimizatin node. */ + tree new_optimize = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl); + if (old_optimize != new_optimize) + cl_optimization_restore (&global_options, &global_options_set, + TREE_OPTIMIZATION (new_optimize)); + + /* Restore target option with the one from caller fndecl. */ + tree cur_target_opt = DECL_FUNCTION_SPECIFIC_TARGET (to->decl); + if (!cur_target_opt) + cur_target_opt = target_option_default_node; + cl_target_option_restore (&global_options, &global_options_set, + TREE_TARGET_OPTION (cur_target_opt)); + + /* Update target option as optimization changes. */ + targetm.target_option.override (); + + /* Rebuild target option node for caller fndecl and replace + with it if the node changes. */ + tree new_target_opt + = build_target_option_node (&global_options, &global_options_set); + if (cur_target_opt != new_target_opt) + DECL_FUNCTION_SPECIFIC_TARGET (to->decl) = new_target_opt; + + /* Restore the context with previous saved nodes. */ + if (old_optimize != new_optimize) + cl_optimization_restore (&global_options, &global_options_set, + TREE_OPTIMIZATION (old_optimize)); + cl_target_option_restore (&global_options, &global_options_set, + TREE_TARGET_OPTION (old_target_opt)); + } + + if (DECL_STRUCT_FUNCTION (to->decl) == cfun) + set_cfun (cfun, true); + } /* If aliases are involved, redirect edge to the actual destination and possibly remove the aliases. */ diff --git a/gcc/testsuite/gcc.dg/lto/pr105459_0.c b/gcc/testsuite/gcc.dg/lto/pr105459_0.c new file mode 100644 index 00000000000..c799e6ef23d --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/pr105459_0.c @@ -0,0 +1,35 @@ +/* { dg-lto-do link } */ +/* { dg-lto-options { { -flto -O1 } } } */ + +double m; +int n; + +__attribute__ ((optimize ("-funsafe-math-optimizations"))) +void +bar (int x) +{ + n = x; + m = n; +} + +__attribute__ ((flatten)) +void +foo (int x) +{ + bar (x); +} + +void +quux (void) +{ + ++n; +} + +int +main (void) +{ + foo (0); + quux (); + + return 0; +} -- 2.27.0 ^ permalink raw reply [flat|nested] 8+ messages in thread
* PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459] 2022-06-06 6:20 [PATCH] inline: Rebuild target option node for caller [PR105459] Kewen.Lin @ 2022-06-23 2:03 ` Kewen.Lin 2022-07-01 8:40 ` Richard Biener 2022-07-08 11:37 ` Martin Liška 1 sibling, 1 reply; 8+ messages in thread From: Kewen.Lin @ 2022-06-23 2:03 UTC (permalink / raw) To: GCC Patches; +Cc: Jakub Jelinek, Jan Hubicka, Richard Biener, Martin Liška Hi, Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html BR, Kewen on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote: > Hi, > > PR105459 exposes one issue in inline_call handling that when it > decides to copy FP flags from callee to caller and rebuild the > optimization node for caller fndecl, it's possible that the target > option node is also necessary to be rebuilt. Without updating > target option node early, it can make nodes share the same target > option node wrongly, later when we want to unshare it somewhere > (like in target hook) it can get unexpected results, like ICE on > uninitialized secondary member of target globals exposed in this PR. > > Commit r12-3721 makes it get exact fp_expression info and causes > more optimization chances then exposes this issue. Commit r11-5855 > introduces two target options to shadow flag_excess_precision and > flag_unsafe_math_optimizations and shows the need to rebuild target > node in inline_call when optimization node changes. > > As commented in PR105459, I tried to postpone init_function_start > in cgraph_node::expand, but abandoned it since I thought it just > concealed the issue. And I also tried to adjust the target node > when current function switching, but failed since we get the NULL > cfun and fndecl in WPA phase. > > Bootstrapped and regtested on x86_64-redhat-linux, powerpc64-linux-gnu > P8 and powerpc64le-linux-gnu P9. > > Any thoughts? Is it OK for trunk? > > BR, > Kewen > ----- > > PR tree-optimization/105459 > > gcc/ChangeLog: > > * ipa-inline-transform.cc (inline_call): Rebuild target option node > once optimization node gets rebuilt. > > gcc/testsuite/ChangeLog: > > * gcc.dg/lto/pr105459_0.c: New test. > --- > gcc/ipa-inline-transform.cc | 50 +++++++++++++++++++++++++-- > gcc/testsuite/gcc.dg/lto/pr105459_0.c | 35 +++++++++++++++++++ > 2 files changed, 83 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/lto/pr105459_0.c > > diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc > index 07288e57c73..edba58377f4 100644 > --- a/gcc/ipa-inline-transform.cc > +++ b/gcc/ipa-inline-transform.cc > @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see > #include "ipa-modref.h" > #include "symtab-thunks.h" > #include "symtab-clones.h" > +#include "target.h" > > int ncalls_inlined; > int nfunctions_inlined; > @@ -469,8 +470,53 @@ inline_call (struct cgraph_edge *e, bool update_original, > } > > /* Reload global optimization flags. */ > - if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) > - set_cfun (cfun, true); > + if (reload_optimization_node) > + { > + /* Only need to check and update target option node > + when target_option_default_node is not NULL. */ > + if (target_option_default_node) > + { > + /* Save the current context for optimization and target option > + node. */ > + tree old_optimize > + = build_optimization_node (&global_options, &global_options_set); > + tree old_target_opt > + = build_target_option_node (&global_options, &global_options_set); > + > + /* Restore optimization with new optimizatin node. */ > + tree new_optimize = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl); > + if (old_optimize != new_optimize) > + cl_optimization_restore (&global_options, &global_options_set, > + TREE_OPTIMIZATION (new_optimize)); > + > + /* Restore target option with the one from caller fndecl. */ > + tree cur_target_opt = DECL_FUNCTION_SPECIFIC_TARGET (to->decl); > + if (!cur_target_opt) > + cur_target_opt = target_option_default_node; > + cl_target_option_restore (&global_options, &global_options_set, > + TREE_TARGET_OPTION (cur_target_opt)); > + > + /* Update target option as optimization changes. */ > + targetm.target_option.override (); > + > + /* Rebuild target option node for caller fndecl and replace > + with it if the node changes. */ > + tree new_target_opt > + = build_target_option_node (&global_options, &global_options_set); > + if (cur_target_opt != new_target_opt) > + DECL_FUNCTION_SPECIFIC_TARGET (to->decl) = new_target_opt; > + > + /* Restore the context with previous saved nodes. */ > + if (old_optimize != new_optimize) > + cl_optimization_restore (&global_options, &global_options_set, > + TREE_OPTIMIZATION (old_optimize)); > + cl_target_option_restore (&global_options, &global_options_set, > + TREE_TARGET_OPTION (old_target_opt)); > + } > + > + if (DECL_STRUCT_FUNCTION (to->decl) == cfun) > + set_cfun (cfun, true); > + } > > /* If aliases are involved, redirect edge to the actual destination and > possibly remove the aliases. */ > diff --git a/gcc/testsuite/gcc.dg/lto/pr105459_0.c b/gcc/testsuite/gcc.dg/lto/pr105459_0.c > new file mode 100644 > index 00000000000..c799e6ef23d > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/lto/pr105459_0.c > @@ -0,0 +1,35 @@ > +/* { dg-lto-do link } */ > +/* { dg-lto-options { { -flto -O1 } } } */ > + > +double m; > +int n; > + > +__attribute__ ((optimize ("-funsafe-math-optimizations"))) > +void > +bar (int x) > +{ > + n = x; > + m = n; > +} > + > +__attribute__ ((flatten)) > +void > +foo (int x) > +{ > + bar (x); > +} > + > +void > +quux (void) > +{ > + ++n; > +} > + > +int > +main (void) > +{ > + foo (0); > + quux (); > + > + return 0; > +} > -- > 2.27.0 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459] 2022-06-23 2:03 ` PING^1 " Kewen.Lin @ 2022-07-01 8:40 ` Richard Biener 2022-07-01 9:20 ` Jan Hubicka 2022-07-01 9:42 ` Kewen.Lin 0 siblings, 2 replies; 8+ messages in thread From: Richard Biener @ 2022-07-01 8:40 UTC (permalink / raw) To: Kewen.Lin; +Cc: GCC Patches, Jakub Jelinek, Jan Hubicka, Martin Liška [-- Attachment #1: Type: text/plain, Size: 7741 bytes --] On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin <linkw@linux.ibm.com> wrote: > > Hi, > > Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html > > BR, > Kewen > > on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote: > > Hi, > > > > PR105459 exposes one issue in inline_call handling that when it > > decides to copy FP flags from callee to caller and rebuild the > > optimization node for caller fndecl, it's possible that the target > > option node is also necessary to be rebuilt. Without updating > > target option node early, it can make nodes share the same target > > option node wrongly, later when we want to unshare it somewhere > > (like in target hook) it can get unexpected results, like ICE on > > uninitialized secondary member of target globals exposed in this PR. I think that /* Reload global optimization flags. */ if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) set_cfun (cfun, true); is supposed to do that via ix86_set_current_function which will eventually re-build the target optimization node exactly for this reason. But with LTO we arrive here during WPA time only and there cfun is NULL (and so is DECL_STRUCT_FUNCTION (to->decl)), so the target doesn't get the chance to fix things up here. Now, it should be fine to delay this fixup until we set the cfun at LTRANS time but there we run into if (old_tree != new_tree) { cl_target_option_restore (&global_options, &global_options_set, TREE_TARGET_OPTION (new_tree)); ... } else if (flag_unsafe_math_optimizations != TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations || (flag_excess_precision != TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision)) { ... FIXUP! ... and old_tree != new_tree disables the fixup. When we refactor the above to always consider the FP flag change (so apply it lazily), then this fixes the testcase in the PR as well. Thus something like the attached. Ideally this stuff would be refactored to a target hook that can work without the set_cfun, also working towards merging the target and optimization node since they have to be kept in sync ... I think your proposed patch makes another variant through the maze to do something at WPA time but that makes it all even more complicated :/ Sorry for the delay btw. Folks - any other opinions? Thanks, Richard. > > Commit r12-3721 makes it get exact fp_expression info and causes > > more optimization chances then exposes this issue. Commit r11-5855 > > introduces two target options to shadow flag_excess_precision and > > flag_unsafe_math_optimizations and shows the need to rebuild target > > node in inline_call when optimization node changes. > > > > As commented in PR105459, I tried to postpone init_function_start > > in cgraph_node::expand, but abandoned it since I thought it just > > concealed the issue. And I also tried to adjust the target node > > when current function switching, but failed since we get the NULL > > cfun and fndecl in WPA phase. > > > > Bootstrapped and regtested on x86_64-redhat-linux, powerpc64-linux-gnu > > P8 and powerpc64le-linux-gnu P9. > > > > Any thoughts? Is it OK for trunk? > > > > BR, > > Kewen > > ----- > > > > PR tree-optimization/105459 > > > > gcc/ChangeLog: > > > > * ipa-inline-transform.cc (inline_call): Rebuild target option node > > once optimization node gets rebuilt. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/lto/pr105459_0.c: New test. > > --- > > gcc/ipa-inline-transform.cc | 50 +++++++++++++++++++++++++-- > > gcc/testsuite/gcc.dg/lto/pr105459_0.c | 35 +++++++++++++++++++ > > 2 files changed, 83 insertions(+), 2 deletions(-) > > create mode 100644 gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > > diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc > > index 07288e57c73..edba58377f4 100644 > > --- a/gcc/ipa-inline-transform.cc > > +++ b/gcc/ipa-inline-transform.cc > > @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see > > #include "ipa-modref.h" > > #include "symtab-thunks.h" > > #include "symtab-clones.h" > > +#include "target.h" > > > > int ncalls_inlined; > > int nfunctions_inlined; > > @@ -469,8 +470,53 @@ inline_call (struct cgraph_edge *e, bool update_original, > > } > > > > /* Reload global optimization flags. */ > > - if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) > > - set_cfun (cfun, true); > > + if (reload_optimization_node) > > + { > > + /* Only need to check and update target option node > > + when target_option_default_node is not NULL. */ > > + if (target_option_default_node) > > + { > > + /* Save the current context for optimization and target option > > + node. */ > > + tree old_optimize > > + = build_optimization_node (&global_options, &global_options_set); > > + tree old_target_opt > > + = build_target_option_node (&global_options, &global_options_set); > > + > > + /* Restore optimization with new optimizatin node. */ > > + tree new_optimize = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl); > > + if (old_optimize != new_optimize) > > + cl_optimization_restore (&global_options, &global_options_set, > > + TREE_OPTIMIZATION (new_optimize)); > > + > > + /* Restore target option with the one from caller fndecl. */ > > + tree cur_target_opt = DECL_FUNCTION_SPECIFIC_TARGET (to->decl); > > + if (!cur_target_opt) > > + cur_target_opt = target_option_default_node; > > + cl_target_option_restore (&global_options, &global_options_set, > > + TREE_TARGET_OPTION (cur_target_opt)); > > + > > + /* Update target option as optimization changes. */ > > + targetm.target_option.override (); > > + > > + /* Rebuild target option node for caller fndecl and replace > > + with it if the node changes. */ > > + tree new_target_opt > > + = build_target_option_node (&global_options, &global_options_set); > > + if (cur_target_opt != new_target_opt) > > + DECL_FUNCTION_SPECIFIC_TARGET (to->decl) = new_target_opt; > > + > > + /* Restore the context with previous saved nodes. */ > > + if (old_optimize != new_optimize) > > + cl_optimization_restore (&global_options, &global_options_set, > > + TREE_OPTIMIZATION (old_optimize)); > > + cl_target_option_restore (&global_options, &global_options_set, > > + TREE_TARGET_OPTION (old_target_opt)); > > + } > > + > > + if (DECL_STRUCT_FUNCTION (to->decl) == cfun) > > + set_cfun (cfun, true); > > + } > > > > /* If aliases are involved, redirect edge to the actual destination and > > possibly remove the aliases. */ > > diff --git a/gcc/testsuite/gcc.dg/lto/pr105459_0.c b/gcc/testsuite/gcc.dg/lto/pr105459_0.c > > new file mode 100644 > > index 00000000000..c799e6ef23d > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/lto/pr105459_0.c > > @@ -0,0 +1,35 @@ > > +/* { dg-lto-do link } */ > > +/* { dg-lto-options { { -flto -O1 } } } */ > > + > > +double m; > > +int n; > > + > > +__attribute__ ((optimize ("-funsafe-math-optimizations"))) > > +void > > +bar (int x) > > +{ > > + n = x; > > + m = n; > > +} > > + > > +__attribute__ ((flatten)) > > +void > > +foo (int x) > > +{ > > + bar (x); > > +} > > + > > +void > > +quux (void) > > +{ > > + ++n; > > +} > > + > > +int > > +main (void) > > +{ > > + foo (0); > > + quux (); > > + > > + return 0; > > +} > > -- > > 2.27.0 [-- Attachment #2: p --] [-- Type: application/octet-stream, Size: 2084 bytes --] diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc index e11f68186f5..acb2291e70f 100644 --- a/gcc/config/i386/i386-options.cc +++ b/gcc/config/i386/i386-options.cc @@ -3232,28 +3232,22 @@ ix86_set_current_function (tree fndecl) if (new_tree == NULL_TREE) new_tree = target_option_default_node; - if (old_tree != new_tree) + bool fp_flag_change + = (flag_unsafe_math_optimizations + != TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations + || (flag_excess_precision + != TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision)); + if (old_tree != new_tree || fp_flag_change) { cl_target_option_restore (&global_options, &global_options_set, TREE_TARGET_OPTION (new_tree)); - if (TREE_TARGET_GLOBALS (new_tree)) - restore_target_globals (TREE_TARGET_GLOBALS (new_tree)); - else if (new_tree == target_option_default_node) - restore_target_globals (&default_target_globals); - else - TREE_TARGET_GLOBALS (new_tree) = save_target_globals_default_opts (); - } - else if (flag_unsafe_math_optimizations - != TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations - || (flag_excess_precision - != TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision)) - { - cl_target_option_restore (&global_options, &global_options_set, - TREE_TARGET_OPTION (new_tree)); - ix86_excess_precision = flag_excess_precision; - ix86_unsafe_math_optimizations = flag_unsafe_math_optimizations; - DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_tree - = build_target_option_node (&global_options, &global_options_set); + if (fp_flag_change) + { + ix86_excess_precision = flag_excess_precision; + ix86_unsafe_math_optimizations = flag_unsafe_math_optimizations; + DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_tree + = build_target_option_node (&global_options, &global_options_set); + } if (TREE_TARGET_GLOBALS (new_tree)) restore_target_globals (TREE_TARGET_GLOBALS (new_tree)); else if (new_tree == target_option_default_node) ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459] 2022-07-01 8:40 ` Richard Biener @ 2022-07-01 9:20 ` Jan Hubicka 2022-07-01 12:09 ` Richard Biener 2022-07-01 9:42 ` Kewen.Lin 1 sibling, 1 reply; 8+ messages in thread From: Jan Hubicka @ 2022-07-01 9:20 UTC (permalink / raw) To: Richard Biener; +Cc: Kewen.Lin, Jakub Jelinek, GCC Patches, mliska > On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin <linkw@linux.ibm.com> wrote: > > > > Hi, > > > > Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html > > > > BR, > > Kewen > > > > on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote: > > > Hi, > > > > > > PR105459 exposes one issue in inline_call handling that when it > > > decides to copy FP flags from callee to caller and rebuild the > > > optimization node for caller fndecl, it's possible that the target > > > option node is also necessary to be rebuilt. Without updating > > > target option node early, it can make nodes share the same target > > > option node wrongly, later when we want to unshare it somewhere > > > (like in target hook) it can get unexpected results, like ICE on > > > uninitialized secondary member of target globals exposed in this PR. > > I think that > > /* Reload global optimization flags. */ > if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) > set_cfun (cfun, true); > > is supposed to do that via ix86_set_current_function which will eventually > re-build the target optimization node exactly for this reason. > > But with LTO we arrive here during WPA time only and there cfun is NULL > (and so is DECL_STRUCT_FUNCTION (to->decl)), so the target doesn't > get the chance to fix things up here. I see this logic was added by Martin in 2017 and it indeed looks bit odd, but I suppose it is intended primarily for early inliner where cfun is non-NULL and we really need to update global state. > > Now, it should be fine to delay this fixup until we set the cfun at LTRANS > time but there we run into > > if (old_tree != new_tree) > { > cl_target_option_restore (&global_options, &global_options_set, > TREE_TARGET_OPTION (new_tree)); > ... > } > else if (flag_unsafe_math_optimizations > != TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations > || (flag_excess_precision > != TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision)) > { > ... FIXUP! ... > > and old_tree != new_tree disables the fixup. > > When we refactor the above to always consider the FP flag change (so apply it > lazily), then this fixes the testcase in the PR as well. Thus something like > the attached. > > Ideally this stuff would be refactored to a target hook that can work without > the set_cfun, also working towards merging the target and optimization node > since they have to be kept in sync ... > > I think your proposed patch makes another variant through the maze to > do something at WPA time but that makes it all even more complicated :/ > > Sorry for the delay btw. > > Folks - any other opinions? Your patch looks reasonable to me... Indeed working on nodes directly would be nicer, but that means bigger surgery in the optimization handling right? Thanks, Honza > > Thanks, > Richard. > > > > Commit r12-3721 makes it get exact fp_expression info and causes > > > more optimization chances then exposes this issue. Commit r11-5855 > > > introduces two target options to shadow flag_excess_precision and > > > flag_unsafe_math_optimizations and shows the need to rebuild target > > > node in inline_call when optimization node changes. > > > > > > As commented in PR105459, I tried to postpone init_function_start > > > in cgraph_node::expand, but abandoned it since I thought it just > > > concealed the issue. And I also tried to adjust the target node > > > when current function switching, but failed since we get the NULL > > > cfun and fndecl in WPA phase. > > > > > > Bootstrapped and regtested on x86_64-redhat-linux, powerpc64-linux-gnu > > > P8 and powerpc64le-linux-gnu P9. > > > > > > Any thoughts? Is it OK for trunk? > > > > > > BR, > > > Kewen > > > ----- > > > > > > PR tree-optimization/105459 > > > > > > gcc/ChangeLog: > > > > > > * ipa-inline-transform.cc (inline_call): Rebuild target option node > > > once optimization node gets rebuilt. > > > > > > gcc/testsuite/ChangeLog: > > > > > > * gcc.dg/lto/pr105459_0.c: New test. > > > --- > > > gcc/ipa-inline-transform.cc | 50 +++++++++++++++++++++++++-- > > > gcc/testsuite/gcc.dg/lto/pr105459_0.c | 35 +++++++++++++++++++ > > > 2 files changed, 83 insertions(+), 2 deletions(-) > > > create mode 100644 gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > > > > diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc > > > index 07288e57c73..edba58377f4 100644 > > > --- a/gcc/ipa-inline-transform.cc > > > +++ b/gcc/ipa-inline-transform.cc > > > @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see > > > #include "ipa-modref.h" > > > #include "symtab-thunks.h" > > > #include "symtab-clones.h" > > > +#include "target.h" > > > > > > int ncalls_inlined; > > > int nfunctions_inlined; > > > @@ -469,8 +470,53 @@ inline_call (struct cgraph_edge *e, bool update_original, > > > } > > > > > > /* Reload global optimization flags. */ > > > - if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) > > > - set_cfun (cfun, true); > > > + if (reload_optimization_node) > > > + { > > > + /* Only need to check and update target option node > > > + when target_option_default_node is not NULL. */ > > > + if (target_option_default_node) > > > + { > > > + /* Save the current context for optimization and target option > > > + node. */ > > > + tree old_optimize > > > + = build_optimization_node (&global_options, &global_options_set); > > > + tree old_target_opt > > > + = build_target_option_node (&global_options, &global_options_set); > > > + > > > + /* Restore optimization with new optimizatin node. */ > > > + tree new_optimize = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl); > > > + if (old_optimize != new_optimize) > > > + cl_optimization_restore (&global_options, &global_options_set, > > > + TREE_OPTIMIZATION (new_optimize)); > > > + > > > + /* Restore target option with the one from caller fndecl. */ > > > + tree cur_target_opt = DECL_FUNCTION_SPECIFIC_TARGET (to->decl); > > > + if (!cur_target_opt) > > > + cur_target_opt = target_option_default_node; > > > + cl_target_option_restore (&global_options, &global_options_set, > > > + TREE_TARGET_OPTION (cur_target_opt)); > > > + > > > + /* Update target option as optimization changes. */ > > > + targetm.target_option.override (); > > > + > > > + /* Rebuild target option node for caller fndecl and replace > > > + with it if the node changes. */ > > > + tree new_target_opt > > > + = build_target_option_node (&global_options, &global_options_set); > > > + if (cur_target_opt != new_target_opt) > > > + DECL_FUNCTION_SPECIFIC_TARGET (to->decl) = new_target_opt; > > > + > > > + /* Restore the context with previous saved nodes. */ > > > + if (old_optimize != new_optimize) > > > + cl_optimization_restore (&global_options, &global_options_set, > > > + TREE_OPTIMIZATION (old_optimize)); > > > + cl_target_option_restore (&global_options, &global_options_set, > > > + TREE_TARGET_OPTION (old_target_opt)); > > > + } > > > + > > > + if (DECL_STRUCT_FUNCTION (to->decl) == cfun) > > > + set_cfun (cfun, true); > > > + } > > > > > > /* If aliases are involved, redirect edge to the actual destination and > > > possibly remove the aliases. */ > > > diff --git a/gcc/testsuite/gcc.dg/lto/pr105459_0.c b/gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > new file mode 100644 > > > index 00000000000..c799e6ef23d > > > --- /dev/null > > > +++ b/gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > @@ -0,0 +1,35 @@ > > > +/* { dg-lto-do link } */ > > > +/* { dg-lto-options { { -flto -O1 } } } */ > > > + > > > +double m; > > > +int n; > > > + > > > +__attribute__ ((optimize ("-funsafe-math-optimizations"))) > > > +void > > > +bar (int x) > > > +{ > > > + n = x; > > > + m = n; > > > +} > > > + > > > +__attribute__ ((flatten)) > > > +void > > > +foo (int x) > > > +{ > > > + bar (x); > > > +} > > > + > > > +void > > > +quux (void) > > > +{ > > > + ++n; > > > +} > > > + > > > +int > > > +main (void) > > > +{ > > > + foo (0); > > > + quux (); > > > + > > > + return 0; > > > +} > > > -- > > > 2.27.0 ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459] 2022-07-01 9:20 ` Jan Hubicka @ 2022-07-01 12:09 ` Richard Biener 0 siblings, 0 replies; 8+ messages in thread From: Richard Biener @ 2022-07-01 12:09 UTC (permalink / raw) To: Jan Hubicka; +Cc: Kewen.Lin, Jakub Jelinek, GCC Patches, Martin Liška On Fri, Jul 1, 2022 at 11:20 AM Jan Hubicka <hubicka@kam.mff.cuni.cz> wrote: > > > On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin <linkw@linux.ibm.com> wrote: > > > > > > Hi, > > > > > > Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html > > > > > > BR, > > > Kewen > > > > > > on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote: > > > > Hi, > > > > > > > > PR105459 exposes one issue in inline_call handling that when it > > > > decides to copy FP flags from callee to caller and rebuild the > > > > optimization node for caller fndecl, it's possible that the target > > > > option node is also necessary to be rebuilt. Without updating > > > > target option node early, it can make nodes share the same target > > > > option node wrongly, later when we want to unshare it somewhere > > > > (like in target hook) it can get unexpected results, like ICE on > > > > uninitialized secondary member of target globals exposed in this PR. > > > > I think that > > > > /* Reload global optimization flags. */ > > if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) > > set_cfun (cfun, true); > > > > is supposed to do that via ix86_set_current_function which will eventually > > re-build the target optimization node exactly for this reason. > > > > But with LTO we arrive here during WPA time only and there cfun is NULL > > (and so is DECL_STRUCT_FUNCTION (to->decl)), so the target doesn't > > get the chance to fix things up here. > > I see this logic was added by Martin in 2017 and it indeed looks bit > odd, but I suppose it is intended primarily for early inliner where cfun > is non-NULL and we really need to update global state. > > > > Now, it should be fine to delay this fixup until we set the cfun at LTRANS > > time but there we run into > > > > if (old_tree != new_tree) > > { > > cl_target_option_restore (&global_options, &global_options_set, > > TREE_TARGET_OPTION (new_tree)); > > ... > > } > > else if (flag_unsafe_math_optimizations > > != TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations > > || (flag_excess_precision > > != TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision)) > > { > > ... FIXUP! ... > > > > and old_tree != new_tree disables the fixup. > > > > When we refactor the above to always consider the FP flag change (so apply it > > lazily), then this fixes the testcase in the PR as well. Thus something like > > the attached. > > > > Ideally this stuff would be refactored to a target hook that can work without > > the set_cfun, also working towards merging the target and optimization node > > since they have to be kept in sync ... > > > > I think your proposed patch makes another variant through the maze to > > do something at WPA time but that makes it all even more complicated :/ > > > > Sorry for the delay btw. > > > > Folks - any other opinions? > > Your patch looks reasonable to me... Indeed working on nodes directly > would be nicer, but that means bigger surgery in the optimization > handling right? Yeah, nothing like I feel doing right now ... I'm going to give the patch full testing now. Richard. > > Thanks, > Honza > > > > Thanks, > > Richard. > > > > > > Commit r12-3721 makes it get exact fp_expression info and causes > > > > more optimization chances then exposes this issue. Commit r11-5855 > > > > introduces two target options to shadow flag_excess_precision and > > > > flag_unsafe_math_optimizations and shows the need to rebuild target > > > > node in inline_call when optimization node changes. > > > > > > > > As commented in PR105459, I tried to postpone init_function_start > > > > in cgraph_node::expand, but abandoned it since I thought it just > > > > concealed the issue. And I also tried to adjust the target node > > > > when current function switching, but failed since we get the NULL > > > > cfun and fndecl in WPA phase. > > > > > > > > Bootstrapped and regtested on x86_64-redhat-linux, powerpc64-linux-gnu > > > > P8 and powerpc64le-linux-gnu P9. > > > > > > > > Any thoughts? Is it OK for trunk? > > > > > > > > BR, > > > > Kewen > > > > ----- > > > > > > > > PR tree-optimization/105459 > > > > > > > > gcc/ChangeLog: > > > > > > > > * ipa-inline-transform.cc (inline_call): Rebuild target option node > > > > once optimization node gets rebuilt. > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > * gcc.dg/lto/pr105459_0.c: New test. > > > > --- > > > > gcc/ipa-inline-transform.cc | 50 +++++++++++++++++++++++++-- > > > > gcc/testsuite/gcc.dg/lto/pr105459_0.c | 35 +++++++++++++++++++ > > > > 2 files changed, 83 insertions(+), 2 deletions(-) > > > > create mode 100644 gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > > > > > > diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc > > > > index 07288e57c73..edba58377f4 100644 > > > > --- a/gcc/ipa-inline-transform.cc > > > > +++ b/gcc/ipa-inline-transform.cc > > > > @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see > > > > #include "ipa-modref.h" > > > > #include "symtab-thunks.h" > > > > #include "symtab-clones.h" > > > > +#include "target.h" > > > > > > > > int ncalls_inlined; > > > > int nfunctions_inlined; > > > > @@ -469,8 +470,53 @@ inline_call (struct cgraph_edge *e, bool update_original, > > > > } > > > > > > > > /* Reload global optimization flags. */ > > > > - if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) > > > > - set_cfun (cfun, true); > > > > + if (reload_optimization_node) > > > > + { > > > > + /* Only need to check and update target option node > > > > + when target_option_default_node is not NULL. */ > > > > + if (target_option_default_node) > > > > + { > > > > + /* Save the current context for optimization and target option > > > > + node. */ > > > > + tree old_optimize > > > > + = build_optimization_node (&global_options, &global_options_set); > > > > + tree old_target_opt > > > > + = build_target_option_node (&global_options, &global_options_set); > > > > + > > > > + /* Restore optimization with new optimizatin node. */ > > > > + tree new_optimize = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl); > > > > + if (old_optimize != new_optimize) > > > > + cl_optimization_restore (&global_options, &global_options_set, > > > > + TREE_OPTIMIZATION (new_optimize)); > > > > + > > > > + /* Restore target option with the one from caller fndecl. */ > > > > + tree cur_target_opt = DECL_FUNCTION_SPECIFIC_TARGET (to->decl); > > > > + if (!cur_target_opt) > > > > + cur_target_opt = target_option_default_node; > > > > + cl_target_option_restore (&global_options, &global_options_set, > > > > + TREE_TARGET_OPTION (cur_target_opt)); > > > > + > > > > + /* Update target option as optimization changes. */ > > > > + targetm.target_option.override (); > > > > + > > > > + /* Rebuild target option node for caller fndecl and replace > > > > + with it if the node changes. */ > > > > + tree new_target_opt > > > > + = build_target_option_node (&global_options, &global_options_set); > > > > + if (cur_target_opt != new_target_opt) > > > > + DECL_FUNCTION_SPECIFIC_TARGET (to->decl) = new_target_opt; > > > > + > > > > + /* Restore the context with previous saved nodes. */ > > > > + if (old_optimize != new_optimize) > > > > + cl_optimization_restore (&global_options, &global_options_set, > > > > + TREE_OPTIMIZATION (old_optimize)); > > > > + cl_target_option_restore (&global_options, &global_options_set, > > > > + TREE_TARGET_OPTION (old_target_opt)); > > > > + } > > > > + > > > > + if (DECL_STRUCT_FUNCTION (to->decl) == cfun) > > > > + set_cfun (cfun, true); > > > > + } > > > > > > > > /* If aliases are involved, redirect edge to the actual destination and > > > > possibly remove the aliases. */ > > > > diff --git a/gcc/testsuite/gcc.dg/lto/pr105459_0.c b/gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > > new file mode 100644 > > > > index 00000000000..c799e6ef23d > > > > --- /dev/null > > > > +++ b/gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > > @@ -0,0 +1,35 @@ > > > > +/* { dg-lto-do link } */ > > > > +/* { dg-lto-options { { -flto -O1 } } } */ > > > > + > > > > +double m; > > > > +int n; > > > > + > > > > +__attribute__ ((optimize ("-funsafe-math-optimizations"))) > > > > +void > > > > +bar (int x) > > > > +{ > > > > + n = x; > > > > + m = n; > > > > +} > > > > + > > > > +__attribute__ ((flatten)) > > > > +void > > > > +foo (int x) > > > > +{ > > > > + bar (x); > > > > +} > > > > + > > > > +void > > > > +quux (void) > > > > +{ > > > > + ++n; > > > > +} > > > > + > > > > +int > > > > +main (void) > > > > +{ > > > > + foo (0); > > > > + quux (); > > > > + > > > > + return 0; > > > > +} > > > > -- > > > > 2.27.0 > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459] 2022-07-01 8:40 ` Richard Biener 2022-07-01 9:20 ` Jan Hubicka @ 2022-07-01 9:42 ` Kewen.Lin 1 sibling, 0 replies; 8+ messages in thread From: Kewen.Lin @ 2022-07-01 9:42 UTC (permalink / raw) To: Richard Biener; +Cc: GCC Patches, Jakub Jelinek, Jan Hubicka, Martin Liška Hi Richi, Thanks for the insightful comments! on 2022/7/1 16:40, Richard Biener wrote: > On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin <linkw@linux.ibm.com> wrote: >> >> Hi, >> >> Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html >> >> BR, >> Kewen >> >> on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote: >>> Hi, >>> >>> PR105459 exposes one issue in inline_call handling that when it >>> decides to copy FP flags from callee to caller and rebuild the >>> optimization node for caller fndecl, it's possible that the target >>> option node is also necessary to be rebuilt. Without updating >>> target option node early, it can make nodes share the same target >>> option node wrongly, later when we want to unshare it somewhere >>> (like in target hook) it can get unexpected results, like ICE on >>> uninitialized secondary member of target globals exposed in this PR. > > I think that > > /* Reload global optimization flags. */ > if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) > set_cfun (cfun, true); > > is supposed to do that via ix86_set_current_function which will eventually > re-build the target optimization node exactly for this reason. > > But with LTO we arrive here during WPA time only and there cfun is NULL > (and so is DECL_STRUCT_FUNCTION (to->decl)), so the target doesn't > get the chance to fix things up here. > Yes, when I read this code, I was thinking if we can do some similar to get the hook to update target node, but as you pointed out, the cfun is NULL here. :( > Now, it should be fine to delay this fixup until we set the cfun at LTRANS > time but there we run into > > if (old_tree != new_tree) > { > cl_target_option_restore (&global_options, &global_options_set, > TREE_TARGET_OPTION (new_tree)); > ... > } > else if (flag_unsafe_math_optimizations > != TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations > || (flag_excess_precision > != TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision)) > { > ... FIXUP! ... > > and old_tree != new_tree disables the fixup. > > When we refactor the above to always consider the FP flag change (so apply it > lazily), then this fixes the testcase in the PR as well. Thus something like > the attached. Good idea! Previously following the code for reload_optimization_node, I thought it's good to update the target node information up to date at the same time, but your proposal with delaying fixup till LTRANS looks better, IIUC WPA passes won't care about this information out of date or not. > > Ideally this stuff would be refactored to a target hook that can work without > the set_cfun, also working towards merging the target and optimization node > since they have to be kept in sync ... > > I think your proposed patch makes another variant through the maze to > do something at WPA time but that makes it all even more complicated :/ > > Sorry for the delay btw. > No problem! Thanks again!! BR, Kewen ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] inline: Rebuild target option node for caller [PR105459] 2022-06-06 6:20 [PATCH] inline: Rebuild target option node for caller [PR105459] Kewen.Lin 2022-06-23 2:03 ` PING^1 " Kewen.Lin @ 2022-07-08 11:37 ` Martin Liška 2022-07-11 3:37 ` Kewen.Lin 1 sibling, 1 reply; 8+ messages in thread From: Martin Liška @ 2022-07-08 11:37 UTC (permalink / raw) To: Kewen.Lin, GCC Patches; +Cc: Richard Biener, Jakub Jelinek, Jan Hubicka On 6/6/22 08:20, Kewen.Lin wrote: > |Hi, PR105459 exposes one issue in inline_call handling that when it decides to copy FP flags from callee to caller and rebuild the optimization node for caller fndecl, it's possible that the target option node is also necessary to be rebuilt. Without updating target option node early, it can make nodes share the same target option node wrongly, later when we want to unshare it somewhere (like in target hook) it can get unexpected results, like ICE on uninitialized secondary member of target globals exposed in this PR.| Hello. I think your patch seems reasonable. As you mentioned we need to keep pair of target and optimization nodes together and the only correct way is by using build_target_option_node for the DECL_FUNCTION_SPECIFIC_TARGET. Thanks for the fix. Martin ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] inline: Rebuild target option node for caller [PR105459] 2022-07-08 11:37 ` Martin Liška @ 2022-07-11 3:37 ` Kewen.Lin 0 siblings, 0 replies; 8+ messages in thread From: Kewen.Lin @ 2022-07-11 3:37 UTC (permalink / raw) To: Martin Liška; +Cc: Richard Biener, Jakub Jelinek, Jan Hubicka, GCC Patches on 2022/7/8 19:37, Martin Liška wrote: > On 6/6/22 08:20, Kewen.Lin wrote: >> |Hi, PR105459 exposes one issue in inline_call handling that when it decides to copy FP flags from callee to caller and rebuild the optimization node for caller fndecl, it's possible that the target option node is also necessary to be rebuilt. Without updating target option node early, it can make nodes share the same target option node wrongly, later when we want to unshare it somewhere (like in target hook) it can get unexpected results, like ICE on uninitialized secondary member of target globals exposed in this PR.| > > > Hello. > > I think your patch seems reasonable. As you mentioned we need to keep > pair of target and optimization nodes together and the only correct way > is by using build_target_option_node for the DECL_FUNCTION_SPECIFIC_TARGET. Hi Martin, Thanks for your comments!! Yeah, the idea of this patch is to make target option node up to date right away, but Richi thought it's too complicated and posted another patch to leave it to be fixed up later in LTRANS, Honza already approved it. I guessed that thread escaped from your mail radar somehow, it started from [1]. [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-July/597595.html BR, Kewen ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2022-07-11 3:37 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-06-06 6:20 [PATCH] inline: Rebuild target option node for caller [PR105459] Kewen.Lin 2022-06-23 2:03 ` PING^1 " Kewen.Lin 2022-07-01 8:40 ` Richard Biener 2022-07-01 9:20 ` Jan Hubicka 2022-07-01 12:09 ` Richard Biener 2022-07-01 9:42 ` Kewen.Lin 2022-07-08 11:37 ` Martin Liška 2022-07-11 3:37 ` Kewen.Lin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).