From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by sourceware.org (Postfix) with ESMTPS id 876113834F29 for ; Thu, 9 Jun 2022 12:46:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 876113834F29 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com X-IronPort-AV: E=Sophos;i="5.91,287,1647331200"; d="scan'208";a="76892008" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa3.mentor.iphmx.com with ESMTP; 09 Jun 2022 04:46:41 -0800 IronPort-SDR: KwAymlskOMS4fgbWg/VqTUE0TN2YVT6RtES7hxN8XtzsV7t07UgqzGXHtGu3Q1sVyhZzCBrrrw CR4Fn10DPF5huR0dbIbG+Sl6DJDkbI3Y7RFQ5csGVm+qAlAdiJP5GmZcNw5uQPPR8ptGITCltc HeO4f7R9CFT7/+gcJOxXZEeBs4k5f+/237oDf4TOL6zybpzf1nzpiVoylNNExsTaVyyxgCr5x+ alo5ZNWVP2AUYzCEONxRLDeK3Uh1lCqlnyYZqje92BKY1K31u/QNT3/CslTT2O99CI8Q2yIiEs k/k= Message-ID: <7f9c91c1-a479-f94f-ac14-1d6827ce671b@codesourcery.com> Date: Thu, 9 Jun 2022 14:46:34 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0 Subject: Re: [Patch] OpenMP: Move omp requires checks to libgomp Content-Language: en-US To: Jakub Jelinek CC: gcc-patches References: <07fec82a-41cf-fdc5-6307-c068dd95ef1a@mentor.com> From: Tobias Burnus In-Reply-To: Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: quoted-printable X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: SVR-IES-MBX-07.mgc.mentorg.com (139.181.222.7) To svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) X-Spam-Status: No, score=-6.0 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, NICE_REPLY_A, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Jun 2022 12:46:46 -0000 On 09.06.22 13:40, Jakub Jelinek via Gcc-patches wrote: > On Wed, Jun 08, 2022 at 05:56:02AM +0200, Tobias Burnus wrote: >> + && lookup_attribute ("omp declare target", >> + DECL_ATTRIBUTES (current_function_decl))) >> + omp_requires_mask >> + =3D (enum omp_requires) (omp_requires_mask | OMP_REQUIRES_TARG= ET_USED); > I must admit it is unclear what the > "must appear lexically before any device constructs or device routines." > restriction actually means for device routines. > Is that lexically before definition of such device routines, or even thei= r > declarations? I have similar issues =E2=80=93 also for Fortran (and C++) module use. Henc= e, I had filled https://github.com/OpenMP/spec/issues/3240 (not publicly accessible); I added your issues to the list. > The above patch snippet is I believe for function definitions that were > arked as declare target before the definition somehow (another decl for > it merged with the new one or in between the begin/end). And is true > even for device_type (host), to rule that out you'd need to check for > "omp declare target host" attribute not being present. > I'm not against the above snippet perhaps adjusted for device_type(host), > but IMHO we want clarifications from omp-lang How to proceed for now? And does 'omp_is_initial_device()' on the host a device function or not? It can be hard-coded to 'true' ... > [...] > target update is also a device construct and the above snippet hasn't bee= n > added for it, ditto for interop which we don't implement yet. > But, my preference would be instead of adding these snippets to > c_parser_omp_target_{data,enter_data,exit_data,update} etc. move it from > c_parser_omp_target to c_parser_omp_all_clauses: > if (flag_openmp > && (mask & (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEVICE)) !=3D 0= ) > omp_requires_mask > =3D (enum omp_requires) (omp_requires_mask | OMP_REQUIRES_TARGET_U= SED); > (somewhere at the start of the function), because the definition of devic= e > constructs is exactly like that: > "device construct An OpenMP construct that accepts the device clause.= " Makes sense. [C++ cases] > Ditto. > For Fortran, is the above mostly not needed because requires need to be i= n > the specification part and device constructs are executable and appear in > the part after it? Do we allow requires in BLOCK's specification part? We don't allow it in BLOCK =E2=80=93 but there are issues related to USE-in= g modules, cf. OpenMP issue. >> --- a/gcc/gimplify.cc >> +++ b/gcc/gimplify.cc >> @@ -3644,6 +3644,9 @@ gimplify_call_expr (tree *expr_p, gimple_seq *pre_= p, bool want_value) >> + if (fndecl && flag_openmp && omp_runtime_api_call (fndecl, true)) >> + omp_requires_mask >> + =3D (enum omp_requires) (omp_requires_mask | OMP_REQUIRES_TARGET_= USED); > I'm sure device APIs were discussed, but I must be blind and I can't find= it > in either 5.0, 5.1 or 5.2. All I see is device constructs or device rout= ines > in those places where I'd also look for device related OpenMP runtime > library APIs. Though, if some routine calls omp_get_num_devices (), > certainly the library at that point needs to know > reverse_offload/unified_shared_memory/etc. requires because that determin= es > how many devices it has. So, what have I missed (aka on which place in t= he > standard the above snippet is based on)? It is based on your review comments from last year ("Something I miss in the patch is that for the device API calls") plus what requires some device initialization. But otherwise, I also did not see it. In terms of parsing, it makes no difference =E2=80=93 contrary to 'unified_shared_memory', where the parser could decide not to add implicit mapping, the compiler part is not affected by API calls. I cannot really make up my mind whether it should be required in this case or not. Maybe, it is not needed. > + const char *requires_section =3D ".gnu.gomp_requires"; >> + tree maskvar =3D build_decl (UNKNOWN_LOCATION, VAR_DECL, >> + get_identifier (".gomp_requires_mask")= , >> + unsigned_type_node); >> + SET_DECL_ALIGN (maskvar, TYPE_ALIGN (unsigned_type_node)); > Don't we want also DECL_USER_ALIGN (maskvar) =3D 1; so that > we never try to increase its alignment? Probably yes. > Is it an allocated section, or should it be better non-allocated and then > dealt with by mkoffload? > > Shouldn't the vars in that section be const, so that it is a read-only > section? > > Is unsigned_type_node what we want (say wouldn't be just unsigned_char_no= de > be enough, currently we just need 3 bits). Probably -that would be 8 bits, leaving 5 spare. I have not checked what Andrew et al. do with the pinned-memory support by -f, but that will likely use only 1 to 3 bits, if any. > Also, wonder if for HAVE_GAS_SHF_MERGE && flag_merge_constants > we shouldn't try to make that section mergeable. If it goes away during > linking and is replaced by something, then it doesn't matter, but otherwi= se, > as we don't record which TU had what flags, all we care about is that > there were some TUs which used device construct/routines (and device APIs= ?) > and used bitmask 7, other TUs that used bitmask 3 and others that used > bitmask 4. (maybe =E2=80=93 I am not sure about this, either.) > @@ -442,6 +463,14 @@ omp_finish_file (void) > } > else > { > +#ifndef ACCEL_COMPILER > + if (flag_openmp > + && (omp_requires_mask & OMP_REQUIRES_TARGET_USED) > + && (omp_requires_mask & (OMP_REQUIRES_UNIFIED_ADDRESS > + | OMP_REQUIRES_UNIFIED_SHARED_MEMORY > + | OMP_REQUIRES_REVERSE_OFFLOAD))) > + sorry ("OpenMP device offloading is not supported for this target")= ; > +#endif > I don't understand this snippet. Without named sections on the host, > I bet we simply don't support offloading at all, > the record_offload_symbol target hook is only non-trivially defined > for nvptx and nvptx isn't typical host for OpenMP offloading, > because we don't remember it anywhere. I thought that would address your: "This probably needs to sorry if the target doesn't support named sections. We probably don't support LTO in that case either though." > I wonder if we shouldn't rename it when we change the arguments, > so that if one mixes an older plugin with newer libgomp or vice versa > this is easily caught. Ok. >> +/* Start/end of .gnu.gomp.requires section of program, defined in > Isn't it .gnu.gomp_requires ? Yes. >> + crtoffloadbegin/end.o. */ >> +__attribute__((weak)) >> +extern const unsigned int __requires_mask_table[]; >> +__attribute__((weak)) >> +extern const unsigned int __requires_mask_table_end[]; > I must say it is unclear to me how this works. > It will only find one such array, say in the executable, > or if the executable doesn't have it, in one of the shared libraries. > > I think we want some solution that will work with OpenMP code > at least in the executable and some shared libraries it is linked against > (obviously another case is when a library with certain #pragma omp requir= es > is dlopened after we've finalized the number of devices, bet the options > in that case are either warn or fatal error). There is the general problem that GCC does not work well with having device routines in a shared library; it works fine if the device part is only in the library =E2=80=93 but calling from an .exe program's target region a declare-target function in a library won't work. Thus, we may need to find a more general solution for this. > The choices could be e.g. make __requires_mask_table{,_end} .hidden > and in the crtoffloadbegin.o (or end) unconditionally call some new libgo= mp > routine to register the table, but the disadvantage of that is that we co= uld > have many of those register calls even when there is nothing to register > (sure, the .ctor in crtoffloadbegin.o (or end) could compare the 2 addres= ses > and not call anything if the table is empty but there would be still code > executed during initialization of the library). > > Yet another possibility is linker plugin case. We already use it for the > case where we actually have some offloading LTO bytecode, transform it in= to > a data section and register with GOMP_offload_register*. > So, if we could e.g. at the same time also process those requires arrays, > diagnose at link time if multiple TUs with that set disagree on the mask > value and in the target data provide that mask to the library, that would > be much nicer. > And the masks either could be gathered from .gnu.gomp_requires or it can = be > somehow encoded in the offloading LTO or its associated data. > What is important though is that it will work even if we actually don't h= ave > any "omp declare target" functions or variables in the TU or the whole > executable or library, just the requires mask. But that can be dealt wit= h > e.g. by forcing the LTO sections even for that case or so. Tobias ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstra=C3=9Fe 201= , 80634 M=C3=BCnchen; Gesellschaft mit beschr=C3=A4nkter Haftung; Gesch=C3= =A4ftsf=C3=BChrer: Thomas Heurung, Frank Th=C3=BCrauf; Sitz der Gesellschaf= t: M=C3=BCnchen; Registergericht M=C3=BCnchen, HRB 106955