From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id C143A385782C for ; Fri, 1 Jul 2022 16:55:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C143A385782C Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-642-7f1yZtJgM_6twJJZB0qKNQ-1; Fri, 01 Jul 2022 12:55:20 -0400 X-MC-Unique: 7f1yZtJgM_6twJJZB0qKNQ-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 39B1B101A589; Fri, 1 Jul 2022 16:55:20 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.39.192.30]) by smtp.corp.redhat.com (Postfix) with ESMTPS id EE1BEC15D42; Fri, 1 Jul 2022 16:55:19 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 261GtHGO1652411 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Fri, 1 Jul 2022 18:55:17 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 261GtGtO1652410; Fri, 1 Jul 2022 18:55:16 +0200 Date: Fri, 1 Jul 2022 18:55:15 +0200 From: Jakub Jelinek To: Tobias Burnus Cc: gcc-patches Subject: Re: [Patch][v5] OpenMP: Move omp requires checks to libgomp Message-ID: Reply-To: Jakub Jelinek References: <7f9c91c1-a479-f94f-ac14-1d6827ce671b@codesourcery.com> <5576fa00-0ddd-8046-17c1-d1cea82bdcf5@codesourcery.com> <77331328-4961-9dab-db58-b5b03daf218c@codesourcery.com> <16ca2aa4-7e73-cf9d-9482-dd59f5b0cdae@codesourcery.com> MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 2.85 on 10.11.54.8 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-3.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Jul 2022 16:55:25 -0000 On Fri, Jul 01, 2022 at 06:31:48PM +0200, Tobias Burnus wrote: > This is done in openmp.cc during parsing. The merging you quoted (in parse.cc) happens > after the whole input file has been parsed and resolved. For your test case, the > following error is shown: > > test.f90:1:15: > > 1 | subroutine foo > | 1 > Error: Program unit at (1) has OpenMP device constructs/routines but does not set !$OMP REQUIRES REVERSE_OFFLOAD but other program units do > test.f90:6:14: > > 6 | subroutine bar > | 1 > Error: Program unit at (1) has OpenMP device constructs/routines but does not set !$OMP REQUIRES UNIFIED_SHARED_MEMORY but other program units do Great. > > @@ -1764,6 +1781,20 @@ input_symtab (void) > > > } > > > } > > > > > > +static void > > > +omp_requires_to_name (char *buf, size_t size, unsigned int requires_mask) > > > +{ > > > + char *end = buf + size, *p = buf; > > > + if (requires_mask & GOMP_REQUIRES_UNIFIED_ADDRESS) > > > + p += snprintf (p, end - p, "unified_address"); > > > + if (requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY) > > > + p += snprintf (p, end - p, "%sunified_shared_memory", > > > + (p == buf ? "" : ", ")); > > > + if (requires_mask & GOMP_REQUIRES_REVERSE_OFFLOAD) > > > + p += snprintf (p, end - p, "%sreverse_offload", > > > + (p == buf ? "" : ", ")); > > So, what does this print if requires_mask is 0 (or just the target used bit > > set but not unified_address, unified_shared_memory nor reverse_offload)? > > Well, that's what libgomp/testsuite/libgomp.c-c++-common/requires-2.c (+ *-2-aux.c) > tests: > > /* { dg-error "OpenMP 'requires' directive with non-identical clauses in multiple compilation units: 'unified_shared_memory' vs. ''" "" { target *-*-* } 0 } */ > > I hope the '' vs. 'unified_shared_memory' is clear - but if you have a better wording. I must be missing how that works. Because the buf in the callers is uninitialized and this function doesn't store there anything if requires_mask == 0. Perhaps you're just lucky and the stack contains '\0' there? > Note that both: > no 'omp requires' > and > 'omp requires' with other clauses (such as the atomic ones or dynamic_allocators) > will lead to 0. Thus, if the wording is changed, it should fit for both cases. Maybe it would be better to simply use different error message for the 0 vs. non-0 case, canonicalized to non-0 vs. 0 order so that it is just 2 messages vs. 3 and wording like "OpenMP 'requires' directive with '....' clauses specified only in some compilation units" note: specified here ... note: but not here ... > > > + if (omp_requires_mask == 0) > > > + { > > > + omp_requires_mask = (omp_requires) val; > > > + requires_decl = tmp_decl; > > > + requires_fn = file_data->file_name; > > And similarly here, if some device construct is seen but requires > > directive isn't, not sure if in this version val would be 0 or something > > with the TARGET_USED bit set. In the latter case, only what is printed > > for no requires or just atomic related requires is a problem, in the former > > case due to the == 0 check mixing of 0 with non-zero would be ignored > > but mixing of non-zero with 0 wouldn't be. > > Here: 0 = "unset" in the sense that either TARGET_USE nor USM/UA/RO was > specified. If any of those is set, we get != 0. Ok. > > For mkoffload, the single results are merged - and TARGET_USE is stripped, > such that it is either 0 or a combination of USM/UA/RO I'd find it clearer if we never stripped that, so that even the library knows. The details will depend on the resolution of #3240. Whether say declare target and no device constructs and device related API calls etc. force it too or not. If not, you could get 0 even if you are actually registering something, just not target regions. If anything that will lead to GOMP_offload_register_ver actually means TARGET_USED, then it isn't necessary. But even if it isn't necessary, e.g. for backwards compatibility with GOMP_VERSION == 1 it will be easier to have that bit in. 0 will then mean older gcc built library or binary. > > > + } > > > + else if (omp_requires_mask != val && !error_emitted) > > > + { > > > + char buf[64], buf2[64]; > > Perhaps cleaner would be to size the buffers as > > sizeof ("unified_address,unified_shared_memory,reverse_offload") > > 64 is more, but just a wild guess and if further clauses are added later, > > it might be too small. > > I concur – except that ',' should be ', '. > (Likewise in libgomp/target.c) Good catch. > > Is > > c.c: > > #pragma omp requires unified_shared_memory > > d.c: > > void baz (void) { > > #pragma omp target > > ; > > } > > ok? > > This one is *already* streamed out as it creates a symbol and entry in > in offload_functions (baz.omp_fn.0). > > The code is rather for '#pragma omp target enter data map(x)' as this > only adds a library call and no symbol. > > > Pedantically reading current standard probably yes, but perhaps again > > something to be discussed. The question is what the requires directive > > in that case would do, nothing at all as there are no device constructs > > etc.? > > Isn't there a device construct – which happens to be empty? In d.c there is. But in c.c there isn't. So, the question is if the directive in c.c is just completely ignored (ok, aside from semantic checking) or if it should mean that if it is specified there, it must be specified elsewhere where device constructs etc. are used too. > > In that case omp_requires_mask & OMP_REQUIRES_TARGET_USED is right. > > But if it should influence the behavior anyway, the restriction should be > > Either all compilation units of a program that contain ... device > > constructs ... should include also requires directive with one of the > > unified_shared_memory, unified_address or reverse_offload clauses. > > In that case the test would be > > omp_requires_mask & (OMP_REQUIRES_TARGET_USED | OMP_REQUIRES_UNIFIED* | OMP_REQUIRES_REV*) > > I think I am lost – don't we effectively test this? We filter out > everything else in output_offload_tables. Thus, in input_offload_tables, > a single '==' will do. (We additionally know that TARGET_USED is set - > as otherwise there wouldn't be a symbol in the offload table.) > > Thus, it is unclear to me what you propose here. We want to get clarification from omp-lang on what is the intent. If the TARGET_USED bit is explicit, we can easily tweak the checks. > > > > +static void > > > +gomp_requires_to_name (char *buf, size_t size, int requires_mask) > > > +{ > > > + char *end = buf + size, *p = buf; > > > + if (requires_mask & GOMP_REQUIRES_UNIFIED_ADDRESS) > > > + p += snprintf (p, end - p, "unified_address"); > > > + if (requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY) > > > + p += snprintf (p, end - p, "%sunified_shared_memory", > > > + (p == buf ? "" : ", ")); > > > + if (requires_mask & GOMP_REQUIRES_REVERSE_OFFLOAD) > > > + p += snprintf (p, end - p, "%sreverse_offload", > > > + (p == buf ? "" : ", ")); > > > +} > > Same question as earlier. > > Same answer, except that in libgomp, this code is effectively only > reachable when omp_requires_mask != 0 as it reaches this code only if > either some additional flag was added (in register_ver) or when devices > were available, but those do not support a flag. I don't understand. Won't z.c: int v; void foo (void) { v++; } #pragma omp declare target enter (v, foo) void bar (void) { #pragma omp target foo (); } have omp_requires_mask == 0 (if TARGET_USED isn't explicit) but will GOMP_offload_register_var? > > Then under the lock, you can do the merging. > > But, IMHO the runtime library should repeat what is done in the offloading > > lto1, diagnose if there are differences between the masks in between > > different TUs, here at runtime on the program/shared library level. > > And IMHO the error you emit above is unnecessary, because (at least > > hopefully) the num_devices computation / device initialization should > > only happen on behalf of some device construct or device related OpenMP API > > routine, so at that point the shared library or program that does that > > should have its own mask and if something is dlopened later, it should > > either have compatible mask (nothing is diagnosed) or incompatible, but then > > it should be diagnosed like any other incompatibilities. > > OK – I will diagnose it always. > > Question: If it is not the same, should there just be a message to > stderr (gomp_error) or should libgomp abort (gomp_fatal)? I'd say gomp_fatal. It is an error rather than warning in lto1 too... > > I thought I've mentioned earlier it would be nice to rename the > > get_num_devices plugin hook because its API has changed, so that > > if one mixes old plugin with new libgomp or vice versa it doesn't > > break silently. > > As discussed off list, gomp_load_plugin_for_device calls if > (device->version_func () != GOMP_VERSION) and we did bump the GOMP_VERSION. Yeah, sorry for that. Jakub