From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16]) by sourceware.org (Postfix) with ESMTPS id 0DF583858C50 for ; Wed, 8 Feb 2023 17:10:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0DF583858C50 Authentication-Results: sourceware.org; dmarc=fail (p=none dis=none) header.from=ucw.cz Authentication-Results: sourceware.org; spf=none smtp.mailfrom=kam.mff.cuni.cz Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id 3300A282393; Wed, 8 Feb 2023 18:10:08 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ucw.cz; s=gen1; t=1675876208; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=g/yEPPue8RMCcfg/Cd+qZ6dIrOU1fObZC4HaaKcdWtU=; b=QNr6yZVkaVfVDYQ7mkW3Q29ZBiR6yPzpiBasGYMzUwLGrUEiM7w+iMNslVqYs0tXzhme/Q mKyelQDu+kkoNADoKpq2oOYqa1y77FATwGGp/AsUZIVQHBdDJvkMBgJF6dSVB+ghiMDWDr 2YP/MTW3MLQBnjlQ511oYu5eDHXeS10= Date: Wed, 8 Feb 2023 18:10:08 +0100 From: Jan Hubicka To: Jakub Jelinek Cc: Richard Biener , gcc-patches@gcc.gnu.org Subject: Re: [PATCH] cgraph: Handle simd clones in cgraph_node::set_{const,pure}_flag [PR106433] Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-5.3 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: > Hi! > > The following testcase ICEs, because we determine only in late pure const > pass that bar is const (the content of the function loses a store to a > global var during dse3 and read from it during cddce2) and local-pure-const2 > makes it const. The cgraph ordering is that post IPA (in late IPA simd > clones are created) bar is processed first, then foo as its caller, then > foo.simdclone* and finally bar.simdclone*. Conceptually I think that is the > right ordering which allows for static simd clones to be removed. > > The reason for the ICE is that because bar was marked const, the call to > it lost vops before vectorization, and when we in foo.simdclone* try to > vectorize the call to bar, we replace it with bar.simdclone* which hasn't > been marked const and so needs vops, which we don't add. > > Now, because the simd clones are created from the same IL, just in a loop > with different argument/return value passing, I think generally if the base > function is determined to be const or pure, the simd clones should be too, > unless e.g. the vectorization causes different optimization decisions, but > then still the global memory reads if any shouldn't affect what the function > does and global memory stores shouldn't be reachable at runtime. My understanding of simd clones is bit limited, but I think you are right that they should have the same semantics as their caller. I think const may be one that makes compiler to ICE, but there are many other places where function body is analyzed and all its aliases/thunks and other variants should be updated too. For exmaple set_pure_flag, nothrow, noreturn and analysis done by modref, ipa-refernece etc. I wonder if we want to update them all and hide that in some abstraction? Next stage 1 I can work on inventing iterators for those kind of things as current approach combinindg direct walkters and function wrappers has become bit hard to maintain in cases like this. Honza > > So, the following patch changes set_{const,pure}_flag to mark also simd > clones. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2023-02-07 Jakub Jelinek > > PR tree-optimization/106433 > * cgraph.cc (set_const_flag_1): Recurse on simd clones too. > (cgraph_node::set_pure_flag): Call set_pure_flag_1 on simd clones too. > > * gcc.c-torture/compile/pr106433.c: New test. > > --- gcc/cgraph.cc.jj 2023-02-02 10:54:44.327473492 +0100 > +++ gcc/cgraph.cc 2023-02-06 12:28:22.040593063 +0100 > @@ -2764,6 +2764,9 @@ set_const_flag_1 (cgraph_node *node, boo > if (!set_const || alias->get_availability () > AVAIL_INTERPOSABLE) > set_const_flag_1 (alias, set_const, looping, changed); > } > + for (struct cgraph_node *n = node->simd_clones; n != NULL; > + n = n->simdclone->next_clone) > + set_const_flag_1 (n, set_const, looping, changed); > for (cgraph_edge *e = node->callers; e; e = e->next_caller) > if (e->caller->thunk > && (!set_const || e->caller->get_availability () > AVAIL_INTERPOSABLE)) > @@ -2876,6 +2879,9 @@ cgraph_node::set_pure_flag (bool pure, b > { > struct set_pure_flag_info info = {pure, looping, false}; > call_for_symbol_thunks_and_aliases (set_pure_flag_1, &info, !pure, true); > + for (struct cgraph_node *n = simd_clones; n != NULL; > + n = n->simdclone->next_clone) > + set_pure_flag_1 (n, &info); > return info.changed; > } > > --- gcc/testsuite/gcc.c-torture/compile/pr106433.c.jj 2023-02-06 12:37:26.963748811 +0100 > +++ gcc/testsuite/gcc.c-torture/compile/pr106433.c 2023-02-06 12:37:06.631041918 +0100 > @@ -0,0 +1,24 @@ > +/* PR tree-optimization/106433 */ > + > +int m, *p; > + > +__attribute__ ((simd)) int > +bar (int x) > +{ > + if (x) > + { > + if (m < 1) > + for (m = 0; m < 1; ++m) > + ++x; > + p = &x; > + for (;;) > + ++m; > + } > + return 0; > +} > + > +__attribute__ ((simd)) int > +foo (int x) > +{ > + return bar (x); > +} > > Jakub >