Date: Tue, 7 Feb 2023 09:36:06 +0100
From: Jakub Jelinek
To: Jan Hubicka, Richard Biener
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] cgraph: Handle simd clones in cgraph_node::set_{const,pure}_flag [PR106433]

Hi!

The following testcase ICEs, because we determine only in the late pure
const pass that bar is const (the content of the function loses a store
to a global var during dse3 and a read from it during cddce2) and
local-pure-const2 makes it const.

The cgraph ordering is that post IPA (in late IPA simd clones are
created) bar is processed first, then foo as its caller, then
foo.simdclone* and finally bar.simdclone*. Conceptually I think that is
the right ordering, which allows for static simd clones to be removed.

The reason for the ICE is that because bar was marked const, the call to
it lost vops before vectorization, and when we in foo.simdclone* try to
vectorize the call to bar, we replace it with bar.simdclone*, which hasn't
been marked const and so needs vops, which we don't add.

Now, because the simd clones are created from the same IL, just in a loop
with different argument/return value passing, I think generally if the
base function is determined to be const or pure, the simd clones should be
too, unless e.g.
the vectorization causes different optimization decisions, but then still
the global memory reads if any shouldn't affect what the function does and
global memory stores shouldn't be reachable at runtime.

So, the following patch changes set_{const,pure}_flag to mark also simd
clones.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-02-07  Jakub Jelinek

	PR tree-optimization/106433
	* cgraph.cc (set_const_flag_1): Recurse on simd clones too.
	(cgraph_node::set_pure_flag): Call set_pure_flag_1 on simd clones too.

	* gcc.c-torture/compile/pr106433.c: New test.

--- gcc/cgraph.cc.jj	2023-02-02 10:54:44.327473492 +0100
+++ gcc/cgraph.cc	2023-02-06 12:28:22.040593063 +0100
@@ -2764,6 +2764,9 @@ set_const_flag_1 (cgraph_node *node, boo
 	  if (!set_const || alias->get_availability () > AVAIL_INTERPOSABLE)
 	    set_const_flag_1 (alias, set_const, looping, changed);
 	}
+      for (struct cgraph_node *n = node->simd_clones; n != NULL;
+	   n = n->simdclone->next_clone)
+	set_const_flag_1 (n, set_const, looping, changed);
       for (cgraph_edge *e = node->callers; e; e = e->next_caller)
 	if (e->caller->thunk
 	    && (!set_const || e->caller->get_availability () > AVAIL_INTERPOSABLE))
@@ -2876,6 +2879,9 @@ cgraph_node::set_pure_flag (bool pure, b
 {
   struct set_pure_flag_info info = {pure, looping, false};
   call_for_symbol_thunks_and_aliases (set_pure_flag_1, &info, !pure, true);
+  for (struct cgraph_node *n = simd_clones; n != NULL;
+       n = n->simdclone->next_clone)
+    set_pure_flag_1 (n, &info);
   return info.changed;
 }
 
--- gcc/testsuite/gcc.c-torture/compile/pr106433.c.jj	2023-02-06 12:37:26.963748811 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr106433.c	2023-02-06 12:37:06.631041918 +0100
@@ -0,0 +1,24 @@
+/* PR tree-optimization/106433 */
+
+int m, *p;
+
+__attribute__ ((simd)) int
+bar (int x)
+{
+  if (x)
+    {
+      if (m < 1)
+	for (m = 0; m < 1; ++m)
+	  ++x;
+      p = &x;
+      for (;;)
+	++m;
+    }
+  return 0;
+}
+
+__attribute__ ((simd)) int
+foo (int x)
+{
+  return bar (x);
+}

	Jakub
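
P.S. For readers not familiar with how simd clones hang off their base
function, the list walk added in both hunks can be modelled roughly as
below. This is only a sketch under stated assumptions: toy_node,
toy_simd_clone_info and toy_set_const_flag are hypothetical stand-ins and
not GCC's cgraph_node API, and the model ignores aliases, thunks, the
looping flag and availability checks that the real functions handle.

```cpp
#include <cassert>

// Hypothetical toy types, NOT GCC's cgraph_node / cgraph_simd_clone.
struct toy_node;

struct toy_simd_clone_info
{
  toy_node *next_clone = nullptr;   // next simd clone of the same base function
};

struct toy_node
{
  bool is_const = false;                     // stands in for the const flag
  toy_node *simd_clones = nullptr;           // list head, set on the base function
  toy_simd_clone_info *simdclone = nullptr;  // non-null only on a clone
};

// Mark NODE const and do the same for every simd clone chained off it.
// Returns true if any flag actually changed.
bool
toy_set_const_flag (toy_node *node)
{
  assert (node != nullptr);
  bool changed = !node->is_const;
  node->is_const = true;
  // Same shape as the loop in the patch: start at the base function's
  // simd_clones head and follow simdclone->next_clone.
  for (toy_node *n = node->simd_clones; n != nullptr;
       n = n->simdclone->next_clone)
    changed |= toy_set_const_flag (n);
  return changed;
}
```

In this model, marking a base node with two chained clones const flips all
three flags on the first call and reports no change on a second call. That
is the property the patch relies on: once bar is const, bar.simdclone* is
const too by the time foo.simdclone* is vectorized.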