From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id F38643857803 for ; Thu, 16 Mar 2023 14:11:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org F38643857803 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id DFE7521A27; Thu, 16 Mar 2023 14:11:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1678975861; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=HOLOPHF37FH90RC14Bwld8lXmn3VklIsLqN1Z7jngg0=; b=fUxLGqKt7e923lj6Aeygddy/BZDLy5uRqcW3YOLo1KQ18Etne9AhkvN8KFwaUn2v80bIxE clsVjcryUPMn3u3XxftO4htxNju3x0+5uA0+H+o6wDdHjwp/xy1P+qG2sXINUqJeh0U1bB oSytpblMWjCOYa2UrSeTN3vO9IV3MSc= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1678975861; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=HOLOPHF37FH90RC14Bwld8lXmn3VklIsLqN1Z7jngg0=; b=zmFeu/Prp2fvNQcV5su7hoHb5J2tm+rIx263rW+uDq9g/cZhofl/mjBmvvf2cJLnPcjz47 jCmRtERwxh3mYaCA== Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id D13432C141; Thu, 16 Mar 2023 14:11:01 +0000 (UTC) Date: Thu, 16 Mar 2023 14:11:01 +0000 (UTC) From: Richard Biener To: Jakub Jelinek cc: Jan Hubicka , gcc-patches@gcc.gnu.org Subject: Re: [PATCH] tree-optimization/106912 - IPA profile and pure/const In-Reply-To: Message-ID: References: User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, 16 Mar 2023, Richard Biener wrote: > On Thu, 16 Mar 2023, Jakub Jelinek wrote: > > > On Thu, Mar 16, 2023 at 12:05:56PM +0000, Richard Biener wrote: > > > On Thu, 16 Mar 2023, Jakub Jelinek wrote: > > > > > > > On Fri, Nov 25, 2022 at 09:26:34PM +0100, Richard Biener via Gcc-patches wrote: > > > > > > We could > > > > > > probably keep tract if any instrumented code was ever inlined into a > > > > > > given function and perhaps just start ignoring attributes set on types? > > > > > > > > > > But ignoring attributes on types makes all indirect calls not > > > > > const/pure annotatable (OK, they are not pure annotatable because of > > > > > the above bug). I really don't see how to conservatively solve this > > > > > issue? > > > > > > > > > Maybe we can ignore all pure/const when the cgraph state is > > > > > in profile-instrumented state? Of course we have multiple "APIs" > > > > > to query that. > > > > > > > > I think that's the way to go. But we'd need to arrange somewhere during IPA > > > > to add vops to all those pure/const calls if -fprofile-generate (direct or > > > > indirect; not sure what exact flags) and after that make sure all our APIs > > > > don't return ECF_CONST, ECF_PURE or ECF_LOOPING_CONST_OR_PURE. > > > > Could be e.g. > > > > if (whatever) > > > > flags &= ~(ECF_CONST | ECF_PURE | ECF_LOOPING_CONST_OR_PURE); > > > > at the end of flags_from_decl_or_type, internal_fn_flags, what else? > > > > Although, perhaps internal_fn_flags don't need to change, because internal > > > > calls don't really have callees. > > > > > > > > Or is just ECF_CONST enough given that ECF_PURE is on fndecls and not > > > > types? > > > > > > > > Or perhaps just start ignoring TYPE_READONLY -> ECF_CONST in > > > > flags_from_decl_or_type if that flag is on? > > > > > > > > Is that rewriting currently what tree_profiling does in the > > > > /* Update call statements and rebuild the cgraph. */ > > > > FOR_EACH_DEFINED_FUNCTION (node) > > > > spot where it calls update_stmt on all call statements? > > > > > > > > If so, could we just set that global? flag instead or in addition to > > > > doing those node->set_const_flag (false, false); calls and > > > > change flags_from_decl_or_type, plus I guess lto1 should set that > > > > flag if it is global on start as well if > > > > !flag_auto_profile > > > > && (flag_branch_probabilities || flag_test_coverage || profile_arc_flag) > > > > ? > > > > > > > > I think just ignoring ECF_CONST from TYPE_READONLY might be best, if we > > > > have external functions like builtins from libc/libm and don't have their > > > > bodies, we can still treat them as const, the only problem is with the > > > > indirect calls where we don't know if we do or don't have a body for > > > > the callees and whether we've instrumented those or not. > > > > > > I think we want something reflected on the IL. Because of LTO I think > > > we cannot ignore externs (or we have to do massaging at stream-in). > > > > > > The following brute-force works for the testcase. I suppose since > > > we leave const/pure set on functions without body in the compile-time > > > TU (and ignore LTO there) we could do the same here. > > > > > > I also wonder if that whole function walk is necessary if not > > > profile_arc_flag || flag_test_coverage ... > > > > > > Honza - does this look OK to you? I'll test guarding the whole > > > outer loop with profile_arc_flag || flag_test_coverage separately. > > > > > > diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc > > > index ea9d7a23443..c8789618f96 100644 > > > --- a/gcc/tree-profile.cc > > > +++ b/gcc/tree-profile.cc > > > @@ -840,9 +840,29 @@ tree_profiling (void) > > > gimple_stmt_iterator gsi; > > > for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > > > { > > > - gimple *stmt = gsi_stmt (gsi); > > > - if (is_gimple_call (stmt)) > > > - update_stmt (stmt); > > > + gcall *call = dyn_cast (gsi_stmt (gsi)); > > > + if (!call) > > > + continue; > > > + > > > + if (profile_arc_flag || flag_test_coverage) > > > + { > > > + /* We do not clear pure/const on decls without body. */ > > > + tree fndecl = gimple_call_fndecl (call); > > > + if (fndecl && !gimple_has_body_p (fndecl)) > > > + continue; > > > + > > > + /* Drop the const attribute from the call type (the pure > > > + attribute is not available on types). */ > > > + tree fntype = gimple_call_fntype (call); > > > + if (fntype && TYPE_READONLY (fntype)) > > > + gimple_call_set_fntype (call, build_qualified_type > > > + (fntype, (TYPE_QUALS > > > (fntype) > > > + & > > > ~TYPE_QUAL_CONST))); > > > + } > > > + > > > + /* Update virtual operands of calls to no longer const/pure > > > + functions. */ > > > + update_stmt (call); > > > } > > > } > > > > I have in the meantime briefly tested following. > > > > But if you want to the above way, then at least the testcase could be > > useful. Though, not sure if the above is all that is needed. Shouldn't > > set_const_flag_1 upon TREE_READONLY (node->decl) = 0; also adjust > > TREE_TYPE on the function to > > tree fntype = TREE_TYPE (node->decl); > > TREE_TYPE (node->decl) > > = build_qualified_type (fntype, > > TYPE_QUALS (fntype) & ~TYPE_QUAL_CONST); > > too (perhaps guarded on TREE_READONLY (fntype))? > > That would be unnecessary for the problem at hand I think. Nothing > should look at this type. > > Let's wait for Honzas opinion. The following is what I profile-bootstrapped and tested on x86_64-unknown-linux-gnu. Richard. >From d438a0d84cafced85c90204cba81de0f60ad0073 Mon Sep 17 00:00:00 2001 From: Richard Biener Date: Thu, 16 Mar 2023 13:51:19 +0100 Subject: [PATCH] tree-optimization/106912 - clear const attribute from fntype To: gcc-patches@gcc.gnu.org The following makes sure that after clearing pure/const from instrumented function declarations we are adjusting call statements fntype as well to handle indirect calls and because gimple_call_flags looks at both decl and fntype. Like the pure/const flag clearing on decls we refrain from touching calls to known functions that do not have a body in the current TU. PR tree-optimization/106912 * tree-profile.cc (tree_profiling): Update stmts only when profiling or testing coverage. Make sure to update calls fntype, stripping 'const' there. * gcc.dg/profile-generate-4.c: New testcase. --- gcc/testsuite/gcc.dg/profile-generate-4.c | 19 ++++++++++++ gcc/tree-profile.cc | 38 +++++++++++++++++------ 2 files changed, 47 insertions(+), 10 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/profile-generate-4.c diff --git a/gcc/testsuite/gcc.dg/profile-generate-4.c b/gcc/testsuite/gcc.dg/profile-generate-4.c new file mode 100644 index 00000000000..c2b999fe4cb --- /dev/null +++ b/gcc/testsuite/gcc.dg/profile-generate-4.c @@ -0,0 +1,19 @@ +/* PR106912 */ +/* { dg-require-profiling "-fprofile-generate" } */ +/* { dg-options "-O2 -fprofile-generate -ftree-vectorize" } */ + +__attribute__ ((__simd__)) +__attribute__ ((__nothrow__ , __leaf__ , __const__, __noinline__)) +double foo (double x); + +void bar(double *f, int n) +{ + int i; + for (i = 0; i < n; i++) + f[i] = foo(f[i]); +} + +double foo(double x) +{ + return x * x / 3.0; +} diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc index ea9d7a23443..7854cd4bc31 100644 --- a/gcc/tree-profile.cc +++ b/gcc/tree-profile.cc @@ -835,16 +835,34 @@ tree_profiling (void) push_cfun (DECL_STRUCT_FUNCTION (node->decl)); - FOR_EACH_BB_FN (bb, cfun) - { - gimple_stmt_iterator gsi; - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) - { - gimple *stmt = gsi_stmt (gsi); - if (is_gimple_call (stmt)) - update_stmt (stmt); - } - } + if (profile_arc_flag || flag_test_coverage) + FOR_EACH_BB_FN (bb, cfun) + { + gimple_stmt_iterator gsi; + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gcall *call = dyn_cast (gsi_stmt (gsi)); + if (!call) + continue; + + /* We do not clear pure/const on decls without body. */ + tree fndecl = gimple_call_fndecl (call); + if (fndecl && !gimple_has_body_p (fndecl)) + continue; + + /* Drop the const attribute from the call type (the pure + attribute is not available on types). */ + tree fntype = gimple_call_fntype (call); + if (fntype && TYPE_READONLY (fntype)) + gimple_call_set_fntype + (call, build_qualified_type (fntype, (TYPE_QUALS (fntype) + & ~TYPE_QUAL_CONST))); + + /* Update virtual operands of calls to no longer const/pure + functions. */ + update_stmt (call); + } + } /* re-merge split blocks. */ cleanup_tree_cfg (); -- 2.35.3