From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [IPv6:2001:67c:2178:6::1d]) by sourceware.org (Postfix) with ESMTPS id 286B33858D35 for ; Thu, 16 Mar 2023 12:22:03 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 286B33858D35 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 8C0601FE03; Thu, 16 Mar 2023 12:22:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1678969321; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=NL+sSWFCBL5fJANDDDLRmlk9weX/ONr28NvyxZ/VsuM=; b=ysu2kgDtHo2m8HFqIeWfrBzIoMk0M6Ca1n9NNwN7IxAIvRJnVMSvaTcv/BCAQesCxfUlmF 9PuM2w67Vvo8QW8KpPB8BM6z9UdkQabKTdSn+Qr+qxEWPo7m7X/qFE42nqZyEOpH5I7Nlk 1uXRqPB++Zog2YeIVMy6z769xHyFxro= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1678969321; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=NL+sSWFCBL5fJANDDDLRmlk9weX/ONr28NvyxZ/VsuM=; b=0NGIAz9FZ8tkSUzPX4sXUGXaZ6XjV3DdTEII48X4T5UVFLoEn6pesagSIL0tXRVIEy/ESg JfKOOVgTbEhfIfCA== Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 5D5FB2C141; Thu, 16 Mar 2023 12:22:01 +0000 (UTC) Date: Thu, 16 Mar 2023 12:22:01 +0000 (UTC) From: Richard Biener To: Jakub Jelinek cc: Jan Hubicka , gcc-patches@gcc.gnu.org Subject: Re: [PATCH] tree-optimization/106912 - IPA profile and pure/const In-Reply-To: Message-ID: References: User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, 16 Mar 2023, Jakub Jelinek wrote: > On Thu, Mar 16, 2023 at 12:05:56PM +0000, Richard Biener wrote: > > On Thu, 16 Mar 2023, Jakub Jelinek wrote: > > > > > On Fri, Nov 25, 2022 at 09:26:34PM +0100, Richard Biener via Gcc-patches wrote: > > > > > We could > > > > > probably keep tract if any instrumented code was ever inlined into a > > > > > given function and perhaps just start ignoring attributes set on types? > > > > > > > > But ignoring attributes on types makes all indirect calls not > > > > const/pure annotatable (OK, they are not pure annotatable because of > > > > the above bug). I really don't see how to conservatively solve this > > > > issue? > > > > > > > Maybe we can ignore all pure/const when the cgraph state is > > > > in profile-instrumented state? Of course we have multiple "APIs" > > > > to query that. > > > > > > I think that's the way to go. But we'd need to arrange somewhere during IPA > > > to add vops to all those pure/const calls if -fprofile-generate (direct or > > > indirect; not sure what exact flags) and after that make sure all our APIs > > > don't return ECF_CONST, ECF_PURE or ECF_LOOPING_CONST_OR_PURE. > > > Could be e.g. > > > if (whatever) > > > flags &= ~(ECF_CONST | ECF_PURE | ECF_LOOPING_CONST_OR_PURE); > > > at the end of flags_from_decl_or_type, internal_fn_flags, what else? > > > Although, perhaps internal_fn_flags don't need to change, because internal > > > calls don't really have callees. > > > > > > Or is just ECF_CONST enough given that ECF_PURE is on fndecls and not > > > types? > > > > > > Or perhaps just start ignoring TYPE_READONLY -> ECF_CONST in > > > flags_from_decl_or_type if that flag is on? > > > > > > Is that rewriting currently what tree_profiling does in the > > > /* Update call statements and rebuild the cgraph. */ > > > FOR_EACH_DEFINED_FUNCTION (node) > > > spot where it calls update_stmt on all call statements? > > > > > > If so, could we just set that global? flag instead or in addition to > > > doing those node->set_const_flag (false, false); calls and > > > change flags_from_decl_or_type, plus I guess lto1 should set that > > > flag if it is global on start as well if > > > !flag_auto_profile > > > && (flag_branch_probabilities || flag_test_coverage || profile_arc_flag) > > > ? > > > > > > I think just ignoring ECF_CONST from TYPE_READONLY might be best, if we > > > have external functions like builtins from libc/libm and don't have their > > > bodies, we can still treat them as const, the only problem is with the > > > indirect calls where we don't know if we do or don't have a body for > > > the callees and whether we've instrumented those or not. > > > > I think we want something reflected on the IL. Because of LTO I think > > we cannot ignore externs (or we have to do massaging at stream-in). > > > > The following brute-force works for the testcase. I suppose since > > we leave const/pure set on functions without body in the compile-time > > TU (and ignore LTO there) we could do the same here. > > > > I also wonder if that whole function walk is necessary if not > > profile_arc_flag || flag_test_coverage ... > > > > Honza - does this look OK to you? I'll test guarding the whole > > outer loop with profile_arc_flag || flag_test_coverage separately. > > > > diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc > > index ea9d7a23443..c8789618f96 100644 > > --- a/gcc/tree-profile.cc > > +++ b/gcc/tree-profile.cc > > @@ -840,9 +840,29 @@ tree_profiling (void) > > gimple_stmt_iterator gsi; > > for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > > { > > - gimple *stmt = gsi_stmt (gsi); > > - if (is_gimple_call (stmt)) > > - update_stmt (stmt); > > + gcall *call = dyn_cast (gsi_stmt (gsi)); > > + if (!call) > > + continue; > > + > > + if (profile_arc_flag || flag_test_coverage) > > + { > > + /* We do not clear pure/const on decls without body. */ > > + tree fndecl = gimple_call_fndecl (call); > > + if (fndecl && !gimple_has_body_p (fndecl)) > > + continue; > > + > > + /* Drop the const attribute from the call type (the pure > > + attribute is not available on types). */ > > + tree fntype = gimple_call_fntype (call); > > + if (fntype && TYPE_READONLY (fntype)) > > + gimple_call_set_fntype (call, build_qualified_type > > + (fntype, (TYPE_QUALS > > (fntype) > > + & > > ~TYPE_QUAL_CONST))); > > + } > > + > > + /* Update virtual operands of calls to no longer const/pure > > + functions. */ > > + update_stmt (call); > > } > > } > > I have in the meantime briefly tested following. > > But if you want to the above way, then at least the testcase could be > useful. Though, not sure if the above is all that is needed. Shouldn't > set_const_flag_1 upon TREE_READONLY (node->decl) = 0; also adjust > TREE_TYPE on the function to > tree fntype = TREE_TYPE (node->decl); > TREE_TYPE (node->decl) > = build_qualified_type (fntype, > TYPE_QUALS (fntype) & ~TYPE_QUAL_CONST); > too (perhaps guarded on TREE_READONLY (fntype))? That would be unnecessary for the problem at hand I think. Nothing should look at this type. Let's wait for Honzas opinion. Richard. > > 2023-03-16 Jakub Jelinek > > PR tree-optimization/106912 > gcc/ > * calls.h (ignore_const_fntype): Declare. > * calls.cc (flags_from_decl_or_type): Ignore TYPE_READONLY > on types if ignore_const_fntype. > * tree-profile.cc: Include calls.h. > (tree_profiling): Set ignore_const_fntype if profile_arc_flag > or flag_test_coverage. > gcc/lto/ > * lto-lang.cc (lto_post_options): Set ignore_const_fntype if > profile_arc_flag or flag_test_coverage and not flag_auto_profile. > gcc/testsuite/ > * gcc.dg/pr106912.c: New test. > > --- gcc/calls.h.jj 2023-01-02 09:32:51.252868185 +0100 > +++ gcc/calls.h 2023-03-16 12:23:51.632460586 +0100 > @@ -134,5 +134,6 @@ extern void maybe_complain_about_tail_ca > > extern rtx rtx_for_static_chain (const_tree, bool); > extern bool cxx17_empty_base_field_p (const_tree); > +extern bool ignore_const_fntype; > > #endif // GCC_CALLS_H > --- gcc/calls.cc.jj 2023-02-21 11:44:48.460464845 +0100 > +++ gcc/calls.cc 2023-03-16 12:27:45.427032110 +0100 > @@ -800,6 +800,13 @@ is_tm_builtin (const_tree fndecl) > return false; > } > > +/* Set if flags_from_decl_or_type should ignore TYPE_READONLY of function > + types. This is used when tree-profile.cc instruments const calls, > + clears TREE_READONLY on FUNCTION_DECLs which have been instrumented, but > + for function types or indirect calls we don't know if the callees have been > + instrumented or not. */ > +bool ignore_const_fntype; > + > /* Detect flags (function attributes) from the function decl or type node. */ > > int > @@ -849,7 +856,7 @@ flags_from_decl_or_type (const_tree exp) > } > else if (TYPE_P (exp)) > { > - if (TYPE_READONLY (exp)) > + if (TYPE_READONLY (exp) && !ignore_const_fntype) > flags |= ECF_CONST; > > if (flag_tm > --- gcc/tree-profile.cc.jj 2023-01-02 09:32:27.923205268 +0100 > +++ gcc/tree-profile.cc 2023-03-16 12:57:29.130873893 +0100 > @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. > #include "alloc-pool.h" > #include "symbol-summary.h" > #include "symtab-thunks.h" > +#include "calls.h" > > static GTY(()) tree gcov_type_node; > static GTY(()) tree tree_interval_profiler_fn; > @@ -819,6 +820,10 @@ tree_profiling (void) > node->set_pure_flag (false, false); > } > > + /* From this point on, ignore TYPE_READONLY on function types. */ > + if (profile_arc_flag || flag_test_coverage) > + ignore_const_fntype = true; > + > /* Update call statements and rebuild the cgraph. */ > FOR_EACH_DEFINED_FUNCTION (node) > { > --- gcc/lto/lto-lang.cc.jj 2023-01-16 11:52:16.165732856 +0100 > +++ gcc/lto/lto-lang.cc 2023-03-16 12:58:00.420414840 +0100 > @@ -37,6 +37,7 @@ along with GCC; see the file COPYING3. > #include "lto-common.h" > #include "stringpool.h" > #include "attribs.h" > +#include "calls.h" > > /* LTO specific dumps. */ > int lto_link_dump_id, decl_merge_dump_id, partition_dump_id; > @@ -938,6 +939,11 @@ lto_post_options (const char **pfilename > if (!flag_merge_constants) > flag_merge_constants = 1; > > + /* If const functions were or might have been instrumented by > + tree-profiler, ignore TYPE_READONLY on function types. */ > + if (!flag_auto_profile && (profile_arc_flag || flag_test_coverage)) > + ignore_const_fntype = true; > + > /* Initialize the compiler back end. */ > return false; > } > --- gcc/testsuite/gcc.dg/pr106912.c.jj 2023-03-16 13:02:17.720639903 +0100 > +++ gcc/testsuite/gcc.dg/pr106912.c 2023-03-16 13:03:47.472323118 +0100 > @@ -0,0 +1,22 @@ > +/* PR tree-optimization/106912 */ > +/* { dg-do compile { target vect_simd_clones } } */ > +/* { dg-require-profiling "-fprofile-generate" } */ > +/* { dg-options "-O2 -ftree-vectorize -fprofile-generate" } */ > +/* { dg-additional-options "-fPIC" { target fpic } } */ > + > +__attribute__ ((__simd__, __nothrow__, __leaf__, __const__)) > +double foo (double x); > + > +void > +bar (double *f, int n) > +{ > + int i; > + for (i = 0; i < n; i++) > + f[i] = foo (f[i]); > +} > + > +double > +foo (double x) > +{ > + return x * x / 3.0; > +} > > > Jakub > > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)