From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Schwinge
To: Julian Brown
CC: Tobias Burnus, Kwok Cheung Yeung, Jakub Jelinek
Subject: Re: [PATCH 1/4] openacc: Middle-end worker-partitioning support
In-Reply-To: <2ef7b2ebaf056858d6484a260c3897f844e2df4a.1614685766.git.julian@codesourcery.com>
References: <2ef7b2ebaf056858d6484a260c3897f844e2df4a.1614685766.git.julian@codesourcery.com>
Date: Wed, 4 Aug 2021 15:56:49 +0200
Message-ID: <87fsvps47y.fsf@euler.schwinge.homeip.net>
List-Id: Gcc-patches mailing list

Hi!

On 2021-03-02T04:20:11-0800, Julian Brown wrote:
> This patch implements worker-partitioning support in the middle end,
> by rewriting gimple.  [...]

Yay!

> This version of the patch [...]
> avoids moving SESE-region finding code out
> of the NVPTX backend

So that's 'struct bb_sese' and following functions.

> since that code isn't used by the middle-end worker
> partitioning neutering/broadcasting implementation yet.

Do I understand correctly that "isn't used [...] yet" means that (a)
"this isn't implemented yet" (on og11 etc.), and doesn't mean (b) "is
missing from this patch submission"?  ... thus from (a) it follows that
we may later also drop these changes from the og11 branch?

Relatedly, a nontrivial amount of data structures/logic/code did get
duplicated from the nvptx back end, and modified slightly or
not-so-slightly (RTL vs. GIMPLE plus certain implementation "details").

We should at least cross reference the two instances, to make sure that
any changes to one are also propagated to the other.  (I'll take care.)

And then, do you (or anyone else, of course) happen to have any clever
idea about how to avoid the duplication, and somehow combine the RTL
vs. GIMPLE implementations?
Given that we nowadays may use C++ -- do you foresee it feasible to have
an abstract base class capturing basically the data structures, logic,
common code, and then RTL-specialized plus GIMPLE-specialized classes
inheriting from that?

For example:

    $ sed -e s%parallel_g%parallel%g < gcc/oacc-neuter-bcast.c > gcc/oacc-neuter-bcast.c_
    $ git diff --no-index --word-diff -b --patience gcc/config/nvptx/nvptx.c gcc/oacc-neuter-bcast.c_
    [...]
    /* Loop structure of the function.  The entire function is described as
       a NULL loop.  */
    @@ -3229,17 +80,21 @@ struct parallel
      basic_block forked_block;
      basic_block join_block;

      [-rtx_insn *forked_insn;-]
    [-  rtx_insn *join_insn;-]{+gimple *forked_stmt;+}
    {+  gimple *join_stmt;+}

      [-rtx_insn *fork_insn;-]
    [-  rtx_insn *joining_insn;-]{+gimple *fork_stmt;+}
    {+  gimple *joining_stmt;+}

      /* Basic blocks in this parallel, but not in child parallels.  The
         FORKED and JOINING blocks are in the partition.  The FORK and JOIN
         blocks are not.  */
      auto_vec<basic_block> blocks;

    {+  tree record_type;+}
    {+  tree sender_decl;+}
    {+  tree receiver_decl;+}

    public:
      parallel (parallel *parent, unsigned mode);
      ~parallel ();
    @@ -3252,8 +107,12 @@ parallel::parallel (parallel *parent_, unsigned mask_)
      :parent (parent_), next (0), inner (0), mask (mask_), inner_mask (0)
    {
      forked_block = join_block = 0;
      [-forked_insn-]{+forked_stmt+} = [-join_insn-]{+join_stmt+} = [-0;-]
    [-  fork_insn-]{+NULL;+}
    {+  fork_stmt+} = [-joining_insn-]{+joining_stmt+} = [-0;-]{+NULL;+}
    {+  record_type = NULL_TREE;+}
    {+  sender_decl = NULL_TREE;+}
    {+  receiver_decl = NULL_TREE;+}

      if (parent)
        {
    @@ -3268,12 +127,54 @@ parallel::~parallel ()
      delete next;
    }
    [...]
    /* Split basic blocks such that each forked and join unspecs are at
       the start of their basic blocks.  Thus afterwards each block will
    @@ -3284,111 +185,168 @@ typedef auto_vec<insn_bb_t> insn_bb_vec_t;
       used when finding partitions.  */

    static void
    [-nvptx_split_blocks (bb_insn_map_t-]{+omp_sese_split_blocks (bb_stmt_map_t+} *map)
    {
      [-insn_bb_vec_t-]{+auto_vec<gimple *>+} worklist;
      basic_block block;
    [-  rtx_insn *insn;-]

      /* Locate all the reorg instructions of interest.  */
      FOR_ALL_BB_FN (block, cfun)
        {
    [-      bool seen_insn = false;-]

          /* Clear visited flag, for use by parallel locator  */
          block->flags &= ~BB_VISITED;

          [-FOR_BB_INSNS (block, insn)-]{+for (gimple_stmt_iterator gsi = gsi_start_bb (block);+}
    {+           !gsi_end_p (gsi);+}
    {+           gsi_next (&gsi))+}
            {
    [...]
    /* Dump this parallel and all its inner parallels.  */

    static void
    [-nvptx_dump_pars-]{+omp_sese_dump_pars+} (parallel *par, unsigned depth)
    {
      fprintf (dump_file, "%u: mask %d {+(%s)+} head=%d, tail=%d\n",
               depth, par->mask, {+mask_name (par->mask),+}
               par->forked_block ? par->forked_block->index : -1,
               par->join_block ? par->join_block->index : -1);
    @@ -3399,10 +357,10 @@ nvptx_dump_pars (parallel *par, unsigned depth)
        fprintf (dump_file, " %d", block->index);
      fprintf (dump_file, "\n");
      if (par->inner)
        [-nvptx_dump_pars-]{+omp_sese_dump_pars+} (par->inner, depth + 1);

      if (par->next)
        [-nvptx_dump_pars-]{+omp_sese_dump_pars+} (par->next, depth);
    }

    /* If BLOCK contains a fork/join marker, process it to create or
    @@ -3410,60 +368,84 @@ nvptx_dump_pars (parallel *par, unsigned depth)
       and then walk successor blocks.   */

    static parallel *
    [-nvptx_find_par (bb_insn_map_t-]{+omp_sese_find_par (bb_stmt_map_t+} *map, parallel *par, basic_block block)
    {
      if (block->flags & BB_VISITED)
        return par;
      block->flags |= BB_VISITED;

      if [-(rtx_insn **endp-]{+(gimple **stmtp+} = map->get (block))
        {
    [...]
    static parallel *
    [-nvptx_discover_pars (bb_insn_map_t-]{+omp_sese_discover_pars (bb_stmt_map_t+} *map)
    {
      basic_block block;
    @@ -3502,3468 +485,1033 @@ nvptx_discover_pars (bb_insn_map_t *map)
      block = ENTRY_BLOCK_PTR_FOR_FN (cfun);
      block->flags &= ~BB_VISITED;

      parallel *par = [-nvptx_find_par-]{+omp_sese_find_par+} (map, 0, block);

      if (dump_file)
        {
          fprintf (dump_file, "\nLoops\n");
          [-nvptx_dump_pars-]{+omp_sese_dump_pars+} (par, 0);
          fprintf (dump_file, "\n");
        }

      return par;
    }

(For brevity, I stripped out the parts where implementation "details"
differ considerably.)


Regards
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 München; limited liability company (Gesellschaft mit beschränkter Haftung); Managing directors: Thomas Heurung, Frank Thürauf; Registered office: München; Commercial register: München, HRB 106955
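
To make the base-class question above a bit more concrete, here is a
minimal sketch of one possible shape.  It is not taken from GCC and not
a proposal for the actual patch: it uses a class template parameterized
on the statement type (a variation on the "abstract base class" idea),
and 'basic_block_def', 'rtx_insn', 'gimple' and 'tree' are hypothetical
stand-ins so that it compiles on its own.  The names 'parallel_base',
'rtl_parallel', 'gimple_parallel' and 'dump_pars' are made up for
illustration, and the parent/child linking in the constructor is an
assumption, since that part is elided in the diff.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for GCC's real types (NOT the GCC definitions).
struct basic_block_def { int index; };
typedef basic_block_def *basic_block;
struct rtx_insn {};          // stand-in for an RTL instruction
struct gimple {};            // stand-in for a GIMPLE statement
typedef void *tree;          // stand-in for GCC's 'tree'

// Representation-independent part of 'struct parallel', parameterized
// on the statement type (rtx_insn for RTL, gimple for GIMPLE).
template <typename Stmt>
struct parallel_base
{
  parallel_base *parent, *next, *inner;
  unsigned mask, inner_mask;
  basic_block forked_block, join_block;
  Stmt *forked_stmt, *join_stmt, *fork_stmt, *joining_stmt;
  std::vector<basic_block> blocks;

  parallel_base (parallel_base *parent_, unsigned mask_)
    : parent (parent_), next (nullptr), inner (nullptr),
      mask (mask_), inner_mask (0),
      forked_block (nullptr), join_block (nullptr),
      forked_stmt (nullptr), join_stmt (nullptr),
      fork_stmt (nullptr), joining_stmt (nullptr)
  {
    // Assumed linking: push the new parallel onto its parent's list.
    if (parent)
      {
        next = parent->inner;
        parent->inner = this;
      }
  }

  // Virtual, so that derived parallels can be deleted via base pointers.
  virtual ~parallel_base ()
  {
    delete inner;
    delete next;
  }
};

// RTL flavour: needs nothing beyond the common part.
typedef parallel_base<rtx_insn> rtl_parallel;

// GIMPLE flavour: adds the three members that only the middle end has.
struct gimple_parallel : parallel_base<gimple>
{
  tree record_type, sender_decl, receiver_decl;

  gimple_parallel (gimple_parallel *parent_, unsigned mask_)
    : parallel_base<gimple> (parent_, mask_),
      record_type (nullptr), sender_decl (nullptr), receiver_decl (nullptr)
  {}
};

// Shared dump logic, written once for both flavours.  Returns the text
// instead of printing to a dump file, to keep the sketch testable.
template <typename Stmt>
std::string
dump_pars (parallel_base<Stmt> *par, unsigned depth)
{
  std::string out
    = std::to_string (depth) + ": mask " + std::to_string (par->mask)
      + " head="
      + std::to_string (par->forked_block ? par->forked_block->index : -1)
      + ", tail="
      + std::to_string (par->join_block ? par->join_block->index : -1)
      + "\n";
  if (par->inner)
    out += dump_pars (par->inner, depth + 1);
  if (par->next)
    out += dump_pars (par->next, depth);
  return out;
}
```

With something like this, the discovery and dump code would be written
once against 'parallel_base<Stmt>', and only the parts that actually
inspect instructions or statements would need per-representation
versions.  Whether that carries over to the real code depends on how
far the implementation "details" diverge.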