From: Thomas Schwinge
To: Sandra Loosemore, Jakub Jelinek
CC: Jan Hubicka
Subject: Re: [PATCH v3] Re: OpenMP: Generate SIMD clones for functions with "declare target"
References: <0b64e323-63f9-e4b7-eb7f-83f3b5e3125b@codesourcery.com> <001679b1-814a-c1db-5611-c663f6931d11@codesourcery.com>
Date: Thu, 27 Oct 2022 12:09:33 +0200
Message-ID: <871qqtd1cy.fsf@dem-tschwing-1.ger.mentorg.com>

Hi!

On 2022-10-26T20:27:19-0600, Sandra Loosemore wrote:
> On 10/20/22 08:07, Jakub Jelinek wrote:
>> Thus, IMHO it is exactly the pass_omp_simd_clone pass where you want to
>> implement this auto-simdization discovery, guarded with
>> #ifdef ACCEL_COMPILER and the new option (which means it will be done
>> only for gcn and not on the host right now).
>
> I'm running into a practical difficulty with making this controlled by a
> static #ifdef: namely, testing.
>
> One of my test cases examines the .s output to make sure that the clones
> are emitted as local symbols and not global.  I have not been able to
> find the symbol linkage information in any of the dump files

Hmm, also some of '-fdump-ipa-all-details' doesn't help here?

> and I have
> also not been able to figure out how to get a .s file from the offload
> compiler even outside of the DejaGnu test harness.  (It's possible I am
> just an extreme dummy about the latter problem, but so far none of my
> colleagues here has been able to give me a recipe either.)

Right, currently only 'scan-offload-tree-dump[...]',
'scan-offload-rtl-dump[...]' are implemented; I assume
'scan-offload-assembler[...]' could be added without too much effort.

> On top of that, I worry that this should be tested more broadly than for
> the one target we're presently focusing on (AMD GCN), and we'll get much
> more regular test coverage if it's also enabled for x86_64 target which
> has the necessary compute_vecsize_and_simdlen target hook.
>
> I remember Carlos O'Donnell used to have a favorite mantra, "design for
> test".

Heh, I don't remember him ever saying that to me -- but maybe that's
because this is what I do anyway.  ;-P

> So, maybe generalize the new -fopenmp-target-simd-clone option
> to take a parameter to force clones to be generated on the OpenMP host
> for test purposes?
> The "declare target" directive already has a clause
>
>    device_type(host|nohost|any)
>
> that defaults to "any"; maybe we could use that syntax like
>    -fopenmp-target-simd-clone=any
> and use the intersection of the two sets to determine what to
> auto-generate clones for?

Seems reasonable to me (but I'm missing a lot of context here).

There anyway is a goal (far out) to get rid of compilation-time
'#ifdef ACCEL_COMPILER' etc., and instead make such code dependent on a
command-line flag (or some other state), so that it's possible to use
the same compiler for target (host) as well as offload target
compilation.  (For example, to simulate offloading compilation with
standard x86_64-pc-linux-gnu GCC.)

And/or, where you implement the logic to "make sure that the clones are
emitted as local symbols and not global", do emit some "tag" in the dump
file, and then scan for that?  Random examples that I just remembered:

'gcc/omp-offload.cc:execute_oacc_loop_designation' handling of
'OMP_CLAUSE_NOHOST', and how that's scanned (host-side) in test cases
such as 'libgomp/testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c',
'libgomp/testsuite/libgomp.oacc-fortran/routine-nohost-1.f90'.

'gcc/config/nvptx/nvptx.cc:nvptx_find_sese' doing
'fprintf (dump_file, "SESE regions:"); [...]', and that's scanned in:

    libgomp/testsuite/libgomp.oacc-c-c++-common/nvptx-sese-1.c-/* Match {N->N(.N)+} */
    libgomp/testsuite/libgomp.oacc-c-c++-common/nvptx-sese-1.c:/* { dg-final { scan-offload-rtl-dump "SESE regions:.* \[0-9\]+{\[0-9\]+->\[0-9\]+(\\.\[0-9\]+)+}" "mach" } } */

(You'd be doing this at the 'scan-offload-tree-dump[...]' level, I
suppose.)


Regards
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company; Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München; Commercial register: München, HRB 106955
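
For illustration only, the "intersection of the two sets" idea quoted
above could be modelled roughly as below.  This is a standalone sketch,
not GCC code: the enum, its values, and the function name
'want_auto_simd_clone' are all hypothetical, and it assumes the proposed
'-fopenmp-target-simd-clone=' option would accept the same
host|nohost|any values as the 'device_type' clause.  Each value is
treated as a set of compilers (host, offload), and a clone is
auto-generated only if the compiler currently running is in both sets.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit-set encoding of the device_type values.  These names
   are made up for this sketch and are not GCC identifiers.  */
enum device_set
{
  DEV_HOST   = 1 << 0,                /* device_type(host) */
  DEV_NOHOST = 1 << 1,                /* device_type(nohost) */
  DEV_ANY    = DEV_HOST | DEV_NOHOST  /* device_type(any), the default */
};

/* Return true if the compiler currently running should auto-generate
   SIMD clones for a "declare target" function.

   DECL_SET is the function's device_type clause, OPTION_SET is the
   (assumed) argument of -fopenmp-target-simd-clone=, and
   OFFLOAD_COMPILER says whether this is the offload compiler (true)
   or the host compiler (false).  */
static bool
want_auto_simd_clone (enum device_set decl_set, enum device_set option_set,
                      bool offload_compiler)
{
  /* Both arguments must be non-empty subsets of DEV_ANY.  */
  assert (decl_set != 0 && (decl_set & ~DEV_ANY) == 0);
  assert (option_set != 0 && (option_set & ~DEV_ANY) == 0);

  /* The set containing only the compiler that is running right now.  */
  enum device_set here = offload_compiler ? DEV_NOHOST : DEV_HOST;

  /* Generate clones only if this compiler is in the intersection.  */
  return (decl_set & option_set & here) != 0;
}
```

With this model, a default "declare target" function (DEV_ANY) combined
with '=nohost' gets clones only in the offload compiler, while '=any'
also enables them on the host, which is the configuration that would
give x86_64 test coverage.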