From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTP id 7010B385AC09 for ; Mon, 26 Jul 2021 14:29:12 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7010B385AC09 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-512-e6WQJN9SO2aAp3zpASL7Hg-1; Mon, 26 Jul 2021 10:29:08 -0400 X-MC-Unique: e6WQJN9SO2aAp3zpASL7Hg-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D2ECC100A60E; Mon, 26 Jul 2021 14:29:07 +0000 (UTC) Received: from tucnak.zalov.cz (ovpn-112-143.ams2.redhat.com [10.36.112.143]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 45A49100760B; Mon, 26 Jul 2021 14:29:06 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.16.1/8.16.1) with ESMTPS id 16QET4qc2919021 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Mon, 26 Jul 2021 16:29:04 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.16.1/8.16.1/Submit) id 16QET3GS2919020; Mon, 26 Jul 2021 16:29:03 +0200 Date: Mon, 26 Jul 2021 16:29:02 +0200 From: Jakub Jelinek To: Kwok Cheung Yeung Cc: GCC Patches Subject: Re: [WIP, OpenMP] OpenMP metadirectives support Message-ID: <20210726142902.GW2380545@tucnak> Reply-To: Jakub Jelinek References: <8d413974-0068-3a31-6ae5-d36c1be06d06@codesourcery.com> MIME-Version: 1.0 In-Reply-To: <8d413974-0068-3a31-6ae5-d36c1be06d06@codesourcery.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-6.0 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 26 Jul 2021 14:29:14 -0000 On Fri, Jul 09, 2021 at 12:16:15PM +0100, Kwok Cheung Yeung wrote: > This is a WIP implementation of metadirectives as defined in the OpenMP 5.0 > spec. I intend to add support for metadirectives as specified in OpenMP 5.1 > later (where the directive can be selected dynamically at runtime), but am > concentrating on the static part for now. Parsing has only been implemented > in the C frontend so far. I am especially interested in feedback regarding > certain aspects of the implementation before I become too committed to the > current design. Note, there is a partial overlap with the attribute syntax changes, see below. c-family/c-omp.c now has omp_directives table that should be updated for changes like this and then c_omp_categorize_directive that returns some information about the directives given a directive name (though, that name can be one, two or three tokens long, consider e.g. target enter data or cancellation point directives). For metadirective, I think very special case are declarative directives in them, I'd tend to sorry for them at least for now, I'm pretty sure many cases with them are just unimplementable and will need to be restricted in the standard, others can be implemented with lots of effort. Whether it is e.g. metadirective guarding declare target ... end declare target pair that would only conditionally set declare target and instead of a single bit to find out if something is declare target or not we'd until resolved need to compute it for all possibilities, or e.g. conditional declare reduction/declare mapper where the name lookup for reduction or map directives would be dependent on metadirective resolution later on, etc. I'm afraid a total nightmare nobody has really thought about details for it. > 1) When parsing each directive variant, a vector of tokens is constructed > and populated with the tokens for a regular equivalent pragma, along with > the tokens for its clauses and the body. The parser routine for that pragma > type is then called with these tokens, and the entire resulting parse tree > is stored as a sub-tree of the metadirective tree structure. > > This results in the body being parsed and stored once for each directive > variant. I believe this is necessary because the body is parsed differently > if there is a 'for' in the directive (using c_parser_omp_for_loop) compared > to if there is not, plus clauses in the directive (e.g. tile, collapse) can > change how the for loop is parsed. > > As an optimisation, identical body trees could be merged together, but that > can come later. I'm afraid it isn't just an optimization and we need to be as smart as possible. I'm not sure it is possible to parse everything many times, consider e.g. labels in the blocks, nested function definitions, variable definitions, etc. While OpenMP requires that essentially the code must be valid if the metadirective is replaced by any of those mentioned directives which rules quite some weirdo corner cases, nothing prevents e.g. two or more when directives to be standalone directives (which don't have any body and so whatever comes after them should be left parsed for later as normal statement sequence), one or more to be normal constructs that accept a structured block and one or more to be e.g. looping constructs (simd, for, distribute, taskloop or combined versions of those). Even when issues with labels etc. are somehow solved (e.g. for structured blocks we have the restriction that goto, break, continue, or switch into a case/default label, etc. can't be used to enter or exit the structured block which could mean some cases can be handled through renaming seen labels in all but one bodies), most important is to sync on where parsing should continue after the metadirective. I think it would be nice if the metadirective parsing at least made quick analysis on what kind of bodies the directives will want and can use the new c-omp.c infrastructure or if needed extend it (e.g. separate the C_OMP_DIR_CONSTRUCT category into C_OMP_DIR_CONSTRUCT and C_OMP_DIR_LOOPING_CONSTRUCT where the latter would be used for those that expect some omp loop after it). One option would be then to parse the body as the most restricted construct (looping (and determine highest needed collapse and ordered), then construct, then standalone) and be able to adjust what we parsed into what the different constructs need, but another option is the separate parsing of the code after the directive multiple times, but at least in the order of most restricted to least restricted, remember where to stop and don't parse it multiple times at least for directives that need the same thing. > > 2) Selectors in the device set (i.e. kind, isa, arch) resolve differently > depending on whether the program is running on a target or on the host. > Since we don't keep multiple versions of a function for each target on the > host compiler, resolving metadirectives with these selectors needs to be > delayed until after LTO streaming, at which point the host or offload > compiler can make the appropriate decision. How is this different from declare variant? For declare variant, it is true I'm never trying to resolve it already during parsing of the call and that probably should be changed, do a first attempt at that point. Initially I thought it typically will not be possible, but later clarification and strong desire of LLVM/ICC etc. to do everything or almost everything already during parsing suggests that it must be doable at least in some cases. E.g. we have restrictions that requires directive on which some decision could be dependent must appear only lexically before it or not at all, etc. So, similarly to that, metadirective ideally should see if something is impossible already during parsing (dunno if it should mean we wouldn't parse the body in that case, that would mean worse diagnostics), then repeat the checks during gimplification like declare variant is resolved there, then repeat again after IPA. Would be probably best if for metadirectives that resolve to executable directives we represent it by something like a magic IFN that is told everything needed to decide and can share as much code as possible with the declare variant decisions. It is true other compilers implement offloading quite differently from GCC, by repeating all of preprocessing, parsing etc. for the offloading target, so they can decide some metadirective/declare variant decisions earlier than we can. On the other side that approach has also quite some disadvantages, it is much harder to ensure ABI compatibility between the host and offload code if one can use #ifdefs and whatever to change layout of everything in between. For the checks during parsing, we'll need a different way how to track which directives are currently active (or defer anything with construct selectors till gimplification). It is true that resolving that during parsing goes against the goal to parse as many bodies together as possible, so we need to pick one or the other. Parsing what follows for all standalone directives isn't a problem of course, but if the metadirective has one when with for and another with simd, then parsing the loop just once would be a problem if there is metadirective in the body that wants to decide whether it is in for or simd and wants that decision be done during parsing. > One negative of this is that the metadirective Gimple representation lasts > beyond the OMP expand stage, when generally we would expect all OMP > directives to have been expanded to something else. > > 3) In the OpenMP examples (version 5.0.1), section 9.7, the example > metadirective.3.c does not work as expected. > > #pragma omp declare target > void exp_pi_diff(double *d, double my_pi){ > #pragma omp metadirective \ > when( construct={target}: distribute parallel for ) \ > default( parallel for simd) > ... > int main() > { > ... > #pragma omp target teams map(tofrom: d[0:N]) > exp_pi_diff(d,my_pi); > ... > exp_pi_diff(d,my_pi); The spec says in this case that the target construct is added to the construct set because of the function appearing in between omp declare target and omp end declare target, so the above is something that resolves statically to distribute parallel for. It is true that in OpenMP 5.1 the earlier For functions within a declare target block, the target trait is added to the beginning of the set as c 1 for any versions of the function that are generated for target regions so the total size of the set is increased by 1. has been mistakenly replaced with: For device routines, the target trait is added to the beginning of the set as c 1 for any versions of the procedure that are generated for target regions so the total size of the set is increased by 1. by that has been corrected in 5.2: C/C++: For functions that are declared in a code region that is delimited by a declare target directive and its paired end directive, the target trait is added to the beginning of the set as c 1 for any target variants that result from the directive so the total size of the set is increased by one. Fortran: If a declare target directive appears in the specification part of a procedure or in the specification part of a procedure interface body, the target trait is added to the beginning of the set as c 1 for any target variants that result from the directive so the total size of the set is increased by one. So, it is really a static decision that can be decided already during parsing. > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -1505,6 +1505,7 @@ OBJS = \ > omp-general.o \ > omp-low.o \ > omp-oacc-kernels-decompose.o \ > + omp-expand-metadirective.o \ Spaces instead of tab. > @@ -1312,12 +1312,14 @@ static const struct omp_pragma_def omp_pragmas[] = { > { "allocate", PRAGMA_OMP_ALLOCATE }, > { "atomic", PRAGMA_OMP_ATOMIC }, > { "barrier", PRAGMA_OMP_BARRIER }, > + { "begin", PRAGMA_OMP_BEGIN }, > { "cancel", PRAGMA_OMP_CANCEL }, > { "cancellation", PRAGMA_OMP_CANCELLATION_POINT }, > { "critical", PRAGMA_OMP_CRITICAL }, > { "depobj", PRAGMA_OMP_DEPOBJ }, > - { "end", PRAGMA_OMP_END_DECLARE_TARGET }, > + { "end", PRAGMA_OMP_END }, > { "flush", PRAGMA_OMP_FLUSH }, > + { "metadirective", PRAGMA_OMP_METADIRECTIVE }, > { "requires", PRAGMA_OMP_REQUIRES }, > { "section", PRAGMA_OMP_SECTION }, > { "sections", PRAGMA_OMP_SECTIONS }, Please update for this also the omp_directives array. > +enum pragma_kind > +c_pp_lookup_pragma_by_name (const char *name) > +{ > + const int n_omp_pragmas = sizeof (omp_pragmas) / sizeof (*omp_pragmas); > + const int n_omp_pragmas_simd = sizeof (omp_pragmas_simd) > + / sizeof (*omp_pragmas_simd); > + > + void *result = bsearch (name, omp_pragmas, n_omp_pragmas, > + sizeof (*omp_pragmas), > + c_pp_lookup_pragma_by_name_1); > + if (!result) > + result = bsearch (name, omp_pragmas_simd, n_omp_pragmas_simd, > + sizeof (*omp_pragmas_simd), > + c_pp_lookup_pragma_by_name_1); > + > + if (result) > + { > + const struct omp_pragma_def *def > + = (const struct omp_pragma_def *) result; > + > + return (enum pragma_kind) def->id; > + } > + > + return PRAGMA_NONE; > +} I think this should be dropped and c_omp_categorize_directive should be used instead of it. Please add a function comment to show the grammar. See e.g. c_parser_omp_declare. > > +static void > +c_parser_omp_begin (c_parser *parser, bool *if_p) > +{ > + location_t loc = c_parser_peek_token (parser)->location; > + c_parser_consume_pragma(parser); Space before (. > + if (c_parser_peek_token (parser)->type == CPP_NAME) > + { > + const char *p = IDENTIFIER_POINTER (c_parser_peek_token (parser)->value); > + > + if (strcmp (p, "metadirective") == 0) > + { > + char p_name[sizeof "#pragma omp teams distribute parallel for simd"]; > + omp_clause_mask mask (0); > + > + c_parser_consume_token (parser); > + c_parser_omp_metadirective (loc, parser, p_name, mask, NULL, if_p, > + true); metadirective, by not being itself combinable, doesn't need this p_name and mask stuff. That is used only for combined/composite construct when the p_name and mask need to be computed dynamically based on the exact parsing. The begin metadirective vs. metadirective difference is a boolean one that can be either passed as bool, or if the pragma token is passed one could look at its pragma kind. > + return; > + } > + } > + > + error_at (loc, "expected %"); "expected %" ? #pragma omp begin already appeared... > + c_parser_skip_to_pragma_eol (parser); > +} > + > +static void > +c_parser_omp_end (c_parser *parser) Similarly with the function comment. Jakub