From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk1-x72f.google.com (mail-qk1-x72f.google.com [IPv6:2607:f8b0:4864:20::72f]) by sourceware.org (Postfix) with ESMTPS id D1D1C383B784 for ; Fri, 1 Jul 2022 12:09:50 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D1D1C383B784 Received: by mail-qk1-x72f.google.com with SMTP id z16so1648213qkj.7 for ; Fri, 01 Jul 2022 05:09:50 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=/Zp+TKvLg8E8R9rn0Mb33/FyV2vtqeRu2LEL3+eJq04=; b=jYx36D3U+SQcVfLY+ifdmolMYbLDqIfMnYnn6666bTx2JYhuLJjjNGhrm4Ghk0Mvks E7U5TbwcDvT1/lrqgiDWLBKo6V/8VeFd2ncMD9yDwV6Rvf4nCL5aUPk8EOtKBUwOo40T h1r4JLYdU7126pe/muJXc83ZCVFgXn6jzBCHu1v4WfLxVV7fi08z72l28PrLeDq4U6T8 xRIYOZ6jd52dX1ZsBbceSD4RAbdr8UyqgeISE4IOBKjrd7G6ESE0La/eGDcDQmPd03RW WNrEklxASk+eeaorGJ3tdAS8TR71JSbhaC8VPH7G7KzXygavS7ALLLQvAh18bHRALm/z NyPQ== X-Gm-Message-State: AJIora8ME3gbCJBfrZQT6mmD4hbVxuiYuN4eq8u1EaJmmxC2dPEiErCt ntu2EK3rKVPm/6KOGRQstk1m1Yj5t0JPYY2txpc= X-Google-Smtp-Source: AGRyM1s+0ZHVmWBbrU9ozDFlEFJJBaL3OSoQxZajYLvfuDScmeC+6KivIEZ2VPu9IEctMQfa2DpEvqG2oTuegC6AhQE= X-Received: by 2002:a05:620a:4103:b0:6b1:43c9:467b with SMTP id j3-20020a05620a410300b006b143c9467bmr9879660qko.4.1656677390046; Fri, 01 Jul 2022 05:09:50 -0700 (PDT) MIME-Version: 1.0 References: <75d4b8f9-5ed2-de4c-d606-510a66714c9a@linux.ibm.com> <05835d48-50e9-be2c-2b60-1b42e9c6e5d6@linux.ibm.com> In-Reply-To: From: Richard Biener Date: Fri, 1 Jul 2022 14:09:39 +0200 Message-ID: Subject: Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459] To: Jan Hubicka Cc: "Kewen.Lin" , Jakub Jelinek , GCC Patches , =?UTF-8?Q?Martin_Li=C5=A1ka?= Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-8.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Jul 2022 12:09:54 -0000 On Fri, Jul 1, 2022 at 11:20 AM Jan Hubicka wrote: > > > On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin wrote: > > > > > > Hi, > > > > > > Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html > > > > > > BR, > > > Kewen > > > > > > on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote: > > > > Hi, > > > > > > > > PR105459 exposes one issue in inline_call handling that when it > > > > decides to copy FP flags from callee to caller and rebuild the > > > > optimization node for caller fndecl, it's possible that the target > > > > option node is also necessary to be rebuilt. Without updating > > > > target option node early, it can make nodes share the same target > > > > option node wrongly, later when we want to unshare it somewhere > > > > (like in target hook) it can get unexpected results, like ICE on > > > > uninitialized secondary member of target globals exposed in this PR. > > > > I think that > > > > /* Reload global optimization flags. */ > > if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) > > set_cfun (cfun, true); > > > > is supposed to do that via ix86_set_current_function which will eventually > > re-build the target optimization node exactly for this reason. > > > > But with LTO we arrive here during WPA time only and there cfun is NULL > > (and so is DECL_STRUCT_FUNCTION (to->decl)), so the target doesn't > > get the chance to fix things up here. > > I see this logic was added by Martin in 2017 and it indeed looks bit > odd, but I suppose it is intended primarily for early inliner where cfun > is non-NULL and we really need to update global state. > > > > Now, it should be fine to delay this fixup until we set the cfun at LTRANS > > time but there we run into > > > > if (old_tree != new_tree) > > { > > cl_target_option_restore (&global_options, &global_options_set, > > TREE_TARGET_OPTION (new_tree)); > > ... > > } > > else if (flag_unsafe_math_optimizations > > != TREE_TARGET_OPTION (new_tree)->x_ix86_unsafe_math_optimizations > > || (flag_excess_precision > > != TREE_TARGET_OPTION (new_tree)->x_ix86_excess_precision)) > > { > > ... FIXUP! ... > > > > and old_tree != new_tree disables the fixup. > > > > When we refactor the above to always consider the FP flag change (so apply it > > lazily), then this fixes the testcase in the PR as well. Thus something like > > the attached. > > > > Ideally this stuff would be refactored to a target hook that can work without > > the set_cfun, also working towards merging the target and optimization node > > since they have to be kept in sync ... > > > > I think your proposed patch makes another variant through the maze to > > do something at WPA time but that makes it all even more complicated :/ > > > > Sorry for the delay btw. > > > > Folks - any other opinions? > > Your patch looks reasonable to me... Indeed working on nodes directly > would be nicer, but that means bigger surgery in the optimization > handling right? Yeah, nothing like I feel doing right now ... I'm going to give the patch full testing now. Richard. > > Thanks, > Honza > > > > Thanks, > > Richard. > > > > > > Commit r12-3721 makes it get exact fp_expression info and causes > > > > more optimization chances then exposes this issue. Commit r11-5855 > > > > introduces two target options to shadow flag_excess_precision and > > > > flag_unsafe_math_optimizations and shows the need to rebuild target > > > > node in inline_call when optimization node changes. > > > > > > > > As commented in PR105459, I tried to postpone init_function_start > > > > in cgraph_node::expand, but abandoned it since I thought it just > > > > concealed the issue. And I also tried to adjust the target node > > > > when current function switching, but failed since we get the NULL > > > > cfun and fndecl in WPA phase. > > > > > > > > Bootstrapped and regtested on x86_64-redhat-linux, powerpc64-linux-gnu > > > > P8 and powerpc64le-linux-gnu P9. > > > > > > > > Any thoughts? Is it OK for trunk? > > > > > > > > BR, > > > > Kewen > > > > ----- > > > > > > > > PR tree-optimization/105459 > > > > > > > > gcc/ChangeLog: > > > > > > > > * ipa-inline-transform.cc (inline_call): Rebuild target option node > > > > once optimization node gets rebuilt. > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > * gcc.dg/lto/pr105459_0.c: New test. > > > > --- > > > > gcc/ipa-inline-transform.cc | 50 +++++++++++++++++++++++++-- > > > > gcc/testsuite/gcc.dg/lto/pr105459_0.c | 35 +++++++++++++++++++ > > > > 2 files changed, 83 insertions(+), 2 deletions(-) > > > > create mode 100644 gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > > > > > > diff --git a/gcc/ipa-inline-transform.cc b/gcc/ipa-inline-transform.cc > > > > index 07288e57c73..edba58377f4 100644 > > > > --- a/gcc/ipa-inline-transform.cc > > > > +++ b/gcc/ipa-inline-transform.cc > > > > @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3. If not see > > > > #include "ipa-modref.h" > > > > #include "symtab-thunks.h" > > > > #include "symtab-clones.h" > > > > +#include "target.h" > > > > > > > > int ncalls_inlined; > > > > int nfunctions_inlined; > > > > @@ -469,8 +470,53 @@ inline_call (struct cgraph_edge *e, bool update_original, > > > > } > > > > > > > > /* Reload global optimization flags. */ > > > > - if (reload_optimization_node && DECL_STRUCT_FUNCTION (to->decl) == cfun) > > > > - set_cfun (cfun, true); > > > > + if (reload_optimization_node) > > > > + { > > > > + /* Only need to check and update target option node > > > > + when target_option_default_node is not NULL. */ > > > > + if (target_option_default_node) > > > > + { > > > > + /* Save the current context for optimization and target option > > > > + node. */ > > > > + tree old_optimize > > > > + = build_optimization_node (&global_options, &global_options_set); > > > > + tree old_target_opt > > > > + = build_target_option_node (&global_options, &global_options_set); > > > > + > > > > + /* Restore optimization with new optimizatin node. */ > > > > + tree new_optimize = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl); > > > > + if (old_optimize != new_optimize) > > > > + cl_optimization_restore (&global_options, &global_options_set, > > > > + TREE_OPTIMIZATION (new_optimize)); > > > > + > > > > + /* Restore target option with the one from caller fndecl. */ > > > > + tree cur_target_opt = DECL_FUNCTION_SPECIFIC_TARGET (to->decl); > > > > + if (!cur_target_opt) > > > > + cur_target_opt = target_option_default_node; > > > > + cl_target_option_restore (&global_options, &global_options_set, > > > > + TREE_TARGET_OPTION (cur_target_opt)); > > > > + > > > > + /* Update target option as optimization changes. */ > > > > + targetm.target_option.override (); > > > > + > > > > + /* Rebuild target option node for caller fndecl and replace > > > > + with it if the node changes. */ > > > > + tree new_target_opt > > > > + = build_target_option_node (&global_options, &global_options_set); > > > > + if (cur_target_opt != new_target_opt) > > > > + DECL_FUNCTION_SPECIFIC_TARGET (to->decl) = new_target_opt; > > > > + > > > > + /* Restore the context with previous saved nodes. */ > > > > + if (old_optimize != new_optimize) > > > > + cl_optimization_restore (&global_options, &global_options_set, > > > > + TREE_OPTIMIZATION (old_optimize)); > > > > + cl_target_option_restore (&global_options, &global_options_set, > > > > + TREE_TARGET_OPTION (old_target_opt)); > > > > + } > > > > + > > > > + if (DECL_STRUCT_FUNCTION (to->decl) == cfun) > > > > + set_cfun (cfun, true); > > > > + } > > > > > > > > /* If aliases are involved, redirect edge to the actual destination and > > > > possibly remove the aliases. */ > > > > diff --git a/gcc/testsuite/gcc.dg/lto/pr105459_0.c b/gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > > new file mode 100644 > > > > index 00000000000..c799e6ef23d > > > > --- /dev/null > > > > +++ b/gcc/testsuite/gcc.dg/lto/pr105459_0.c > > > > @@ -0,0 +1,35 @@ > > > > +/* { dg-lto-do link } */ > > > > +/* { dg-lto-options { { -flto -O1 } } } */ > > > > + > > > > +double m; > > > > +int n; > > > > + > > > > +__attribute__ ((optimize ("-funsafe-math-optimizations"))) > > > > +void > > > > +bar (int x) > > > > +{ > > > > + n = x; > > > > + m = n; > > > > +} > > > > + > > > > +__attribute__ ((flatten)) > > > > +void > > > > +foo (int x) > > > > +{ > > > > + bar (x); > > > > +} > > > > + > > > > +void > > > > +quux (void) > > > > +{ > > > > + ++n; > > > > +} > > > > + > > > > +int > > > > +main (void) > > > > +{ > > > > + foo (0); > > > > + quux (); > > > > + > > > > + return 0; > > > > +} > > > > -- > > > > 2.27.0 > >