From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 0FB833858D35 for ; Thu, 16 Mar 2023 12:14:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0FB833858D35 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1678968844; h=from:from:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:in-reply-to:in-reply-to: references:references; bh=ie5Tyllp8C+ffAsrAHRCKSyhTO8Pc+Bpx0Phbz2uG2M=; b=N2Vipd3SLmzkXXMJa0qn5vgsDt6IETDozXyzwx2UmcL7U+35+cgF7+2oZzI6dJJpwVs+oP Tqu/N1J6hIXO/W/elaZtwk7/jzCGbErR0O8bqIHu5FRsahwI+1l7IlVNi+IGxdlk+VRNdb 30VHImxs7UhArjNaAhAn0f/YiNOXI2o= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-215-fU1TqVzQMQOmatATuD8smQ-1; Thu, 16 Mar 2023 08:14:03 -0400 X-MC-Unique: fU1TqVzQMQOmatATuD8smQ-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A1922185A790; Thu, 16 Mar 2023 12:14:02 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.39.192.16]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 65BAC492B00; Thu, 16 Mar 2023 12:14:02 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 32GCDx7l2892450 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 16 Mar 2023 13:14:00 +0100 Received: (from jakub@localhost) by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 32GCDwqu2892449; Thu, 16 Mar 2023 13:13:58 +0100 Date: Thu, 16 Mar 2023 13:13:58 +0100 From: Jakub Jelinek To: Richard Biener Cc: Jan Hubicka , gcc-patches@gcc.gnu.org Subject: Re: [PATCH] tree-optimization/106912 - IPA profile and pure/const Message-ID: Reply-To: Jakub Jelinek References: MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-9.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, Mar 16, 2023 at 12:05:56PM +0000, Richard Biener wrote: > On Thu, 16 Mar 2023, Jakub Jelinek wrote: > > > On Fri, Nov 25, 2022 at 09:26:34PM +0100, Richard Biener via Gcc-patches wrote: > > > > We could > > > > probably keep tract if any instrumented code was ever inlined into a > > > > given function and perhaps just start ignoring attributes set on types? > > > > > > But ignoring attributes on types makes all indirect calls not > > > const/pure annotatable (OK, they are not pure annotatable because of > > > the above bug). I really don't see how to conservatively solve this > > > issue? > > > > > Maybe we can ignore all pure/const when the cgraph state is > > > in profile-instrumented state? Of course we have multiple "APIs" > > > to query that. > > > > I think that's the way to go. But we'd need to arrange somewhere during IPA > > to add vops to all those pure/const calls if -fprofile-generate (direct or > > indirect; not sure what exact flags) and after that make sure all our APIs > > don't return ECF_CONST, ECF_PURE or ECF_LOOPING_CONST_OR_PURE. > > Could be e.g. > > if (whatever) > > flags &= ~(ECF_CONST | ECF_PURE | ECF_LOOPING_CONST_OR_PURE); > > at the end of flags_from_decl_or_type, internal_fn_flags, what else? > > Although, perhaps internal_fn_flags don't need to change, because internal > > calls don't really have callees. > > > > Or is just ECF_CONST enough given that ECF_PURE is on fndecls and not > > types? > > > > Or perhaps just start ignoring TYPE_READONLY -> ECF_CONST in > > flags_from_decl_or_type if that flag is on? > > > > Is that rewriting currently what tree_profiling does in the > > /* Update call statements and rebuild the cgraph. */ > > FOR_EACH_DEFINED_FUNCTION (node) > > spot where it calls update_stmt on all call statements? > > > > If so, could we just set that global? flag instead or in addition to > > doing those node->set_const_flag (false, false); calls and > > change flags_from_decl_or_type, plus I guess lto1 should set that > > flag if it is global on start as well if > > !flag_auto_profile > > && (flag_branch_probabilities || flag_test_coverage || profile_arc_flag) > > ? > > > > I think just ignoring ECF_CONST from TYPE_READONLY might be best, if we > > have external functions like builtins from libc/libm and don't have their > > bodies, we can still treat them as const, the only problem is with the > > indirect calls where we don't know if we do or don't have a body for > > the callees and whether we've instrumented those or not. > > I think we want something reflected on the IL. Because of LTO I think > we cannot ignore externs (or we have to do massaging at stream-in). > > The following brute-force works for the testcase. I suppose since > we leave const/pure set on functions without body in the compile-time > TU (and ignore LTO there) we could do the same here. > > I also wonder if that whole function walk is necessary if not > profile_arc_flag || flag_test_coverage ... > > Honza - does this look OK to you? I'll test guarding the whole > outer loop with profile_arc_flag || flag_test_coverage separately. > > diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc > index ea9d7a23443..c8789618f96 100644 > --- a/gcc/tree-profile.cc > +++ b/gcc/tree-profile.cc > @@ -840,9 +840,29 @@ tree_profiling (void) > gimple_stmt_iterator gsi; > for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > { > - gimple *stmt = gsi_stmt (gsi); > - if (is_gimple_call (stmt)) > - update_stmt (stmt); > + gcall *call = dyn_cast (gsi_stmt (gsi)); > + if (!call) > + continue; > + > + if (profile_arc_flag || flag_test_coverage) > + { > + /* We do not clear pure/const on decls without body. */ > + tree fndecl = gimple_call_fndecl (call); > + if (fndecl && !gimple_has_body_p (fndecl)) > + continue; > + > + /* Drop the const attribute from the call type (the pure > + attribute is not available on types). */ > + tree fntype = gimple_call_fntype (call); > + if (fntype && TYPE_READONLY (fntype)) > + gimple_call_set_fntype (call, build_qualified_type > + (fntype, (TYPE_QUALS > (fntype) > + & > ~TYPE_QUAL_CONST))); > + } > + > + /* Update virtual operands of calls to no longer const/pure > + functions. */ > + update_stmt (call); > } > } I have in the meantime briefly tested following. But if you want to the above way, then at least the testcase could be useful. Though, not sure if the above is all that is needed. Shouldn't set_const_flag_1 upon TREE_READONLY (node->decl) = 0; also adjust TREE_TYPE on the function to tree fntype = TREE_TYPE (node->decl); TREE_TYPE (node->decl) = build_qualified_type (fntype, TYPE_QUALS (fntype) & ~TYPE_QUAL_CONST); too (perhaps guarded on TREE_READONLY (fntype))? 2023-03-16 Jakub Jelinek PR tree-optimization/106912 gcc/ * calls.h (ignore_const_fntype): Declare. * calls.cc (flags_from_decl_or_type): Ignore TYPE_READONLY on types if ignore_const_fntype. * tree-profile.cc: Include calls.h. (tree_profiling): Set ignore_const_fntype if profile_arc_flag or flag_test_coverage. gcc/lto/ * lto-lang.cc (lto_post_options): Set ignore_const_fntype if profile_arc_flag or flag_test_coverage and not flag_auto_profile. gcc/testsuite/ * gcc.dg/pr106912.c: New test. --- gcc/calls.h.jj 2023-01-02 09:32:51.252868185 +0100 +++ gcc/calls.h 2023-03-16 12:23:51.632460586 +0100 @@ -134,5 +134,6 @@ extern void maybe_complain_about_tail_ca extern rtx rtx_for_static_chain (const_tree, bool); extern bool cxx17_empty_base_field_p (const_tree); +extern bool ignore_const_fntype; #endif // GCC_CALLS_H --- gcc/calls.cc.jj 2023-02-21 11:44:48.460464845 +0100 +++ gcc/calls.cc 2023-03-16 12:27:45.427032110 +0100 @@ -800,6 +800,13 @@ is_tm_builtin (const_tree fndecl) return false; } +/* Set if flags_from_decl_or_type should ignore TYPE_READONLY of function + types. This is used when tree-profile.cc instruments const calls, + clears TREE_READONLY on FUNCTION_DECLs which have been instrumented, but + for function types or indirect calls we don't know if the callees have been + instrumented or not. */ +bool ignore_const_fntype; + /* Detect flags (function attributes) from the function decl or type node. */ int @@ -849,7 +856,7 @@ flags_from_decl_or_type (const_tree exp) } else if (TYPE_P (exp)) { - if (TYPE_READONLY (exp)) + if (TYPE_READONLY (exp) && !ignore_const_fntype) flags |= ECF_CONST; if (flag_tm --- gcc/tree-profile.cc.jj 2023-01-02 09:32:27.923205268 +0100 +++ gcc/tree-profile.cc 2023-03-16 12:57:29.130873893 +0100 @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. #include "alloc-pool.h" #include "symbol-summary.h" #include "symtab-thunks.h" +#include "calls.h" static GTY(()) tree gcov_type_node; static GTY(()) tree tree_interval_profiler_fn; @@ -819,6 +820,10 @@ tree_profiling (void) node->set_pure_flag (false, false); } + /* From this point on, ignore TYPE_READONLY on function types. */ + if (profile_arc_flag || flag_test_coverage) + ignore_const_fntype = true; + /* Update call statements and rebuild the cgraph. */ FOR_EACH_DEFINED_FUNCTION (node) { --- gcc/lto/lto-lang.cc.jj 2023-01-16 11:52:16.165732856 +0100 +++ gcc/lto/lto-lang.cc 2023-03-16 12:58:00.420414840 +0100 @@ -37,6 +37,7 @@ along with GCC; see the file COPYING3. #include "lto-common.h" #include "stringpool.h" #include "attribs.h" +#include "calls.h" /* LTO specific dumps. */ int lto_link_dump_id, decl_merge_dump_id, partition_dump_id; @@ -938,6 +939,11 @@ lto_post_options (const char **pfilename if (!flag_merge_constants) flag_merge_constants = 1; + /* If const functions were or might have been instrumented by + tree-profiler, ignore TYPE_READONLY on function types. */ + if (!flag_auto_profile && (profile_arc_flag || flag_test_coverage)) + ignore_const_fntype = true; + /* Initialize the compiler back end. */ return false; } --- gcc/testsuite/gcc.dg/pr106912.c.jj 2023-03-16 13:02:17.720639903 +0100 +++ gcc/testsuite/gcc.dg/pr106912.c 2023-03-16 13:03:47.472323118 +0100 @@ -0,0 +1,22 @@ +/* PR tree-optimization/106912 */ +/* { dg-do compile { target vect_simd_clones } } */ +/* { dg-require-profiling "-fprofile-generate" } */ +/* { dg-options "-O2 -ftree-vectorize -fprofile-generate" } */ +/* { dg-additional-options "-fPIC" { target fpic } } */ + +__attribute__ ((__simd__, __nothrow__, __leaf__, __const__)) +double foo (double x); + +void +bar (double *f, int n) +{ + int i; + for (i = 0; i < n; i++) + f[i] = foo (f[i]); +} + +double +foo (double x) +{ + return x * x / 3.0; +} Jakub