From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa1.mentor.iphmx.com (esa1.mentor.iphmx.com [68.232.129.153]) by sourceware.org (Postfix) with ESMTPS id 093873858412 for ; Tue, 17 Oct 2023 13:12:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 093873858412 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 093873858412 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=68.232.129.153 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697548375; cv=none; b=ioII/rOzyF8fZJFO4Vi6rmXNtWI5MfHiiDL6jS4VMd7hPhrdE4GE3hlBh+OoqmWvRY+T1WMxIQ6MXcyfSSdPmVp4h2nuU3pOVzOwoP61loICqZGSH525v6lfT4JXyEUrs7Ez1lOGbifsgUjfqCNne3StxHqqbLpOTR05hVPy95M= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697548375; c=relaxed/simple; bh=dIjC7rrx+TAAiESzSTfeFdaC3rtQRSEf1lqv1fC22lQ=; h=Message-ID:Date:MIME-Version:Subject:To:From; b=aWKI07PtuA72mZ0HibhKkSzI+c7Lo6p5Uj3WzTbWfOYWbCBjr9/KvuSwrWThDc/lJyCsrpLex5HtLIvk50xXcDwJgWtFhvjHDALP3IB5FwSngLQWgw76UltSQ3p9tivmG/UtTgNrKKwNXN5cvaWneLSdTBdRS2GiAZuA53iajHk= ARC-Authentication-Results: i=1; server2.sourceware.org X-CSE-ConnectionGUID: O3a5h5kiRBCdNiBdDxcNGw== X-CSE-MsgGUID: dwVuM3icRvqgYTcv4nDayA== X-IronPort-AV: E=Sophos;i="6.03,232,1694764800"; d="scan'208";a="22445723" Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa1.mentor.iphmx.com with ESMTP; 17 Oct 2023 05:12:51 -0800 IronPort-SDR: q/7IjEyu3akrCTcMCfCisFd5w82vPIKlgNmNWkKx0Kjy/b428WwOtKySXZzDkNzciZgg6oxLEO LENJ7dtIs1Ao6kIGxx3CW25aWvDUeAAt4/rRl3MUiiVx5Vjrx71UHE+d5LJ938Ho3zzboe/3Xh 25Y5Wm2JEW+G4cIUs8+HmEUS7QJAWLtwCjQ/vsOkl7dLjOiAtfpsyKo7fegdzufVD/D5YFBKLr zuDxpU1vSzS525hn2rufNB2LWE9vFT4cO5HDzytAtvHt3gGYsPoxwLrHDDTZC/y71J97JaWRqp +xE= Message-ID: Date: Tue, 17 Oct 2023 15:12:46 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH] openmp: Add support for the 'indirect' clause in C/C++ Content-Language: en-US To: Kwok Cheung Yeung , gcc-patches CC: Jakub Jelinek References: From: Tobias Burnus In-Reply-To: Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: quoted-printable X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-14.mgc.mentorg.com (139.181.222.14) To svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) X-Spam-Status: No, score=-5.3 required=5.0 tests=BAYES_00,HEADER_FROM_DIFFERENT_DOMAINS,KAM_DMARC_STATUS,KAM_SHORT,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Kwok, hi Jakub, hi all, some first comments based on both playing around and reading the patch - and some generic comments to any patch reader. In general, the patch looks good. I just observe: * There is an issue with [[omp::decl(...)]]' * () - there is a C/C++ inconsistency in what is expected; it possibly affects more such conditions * Missed optimization for the host? * Bunch of minor comments On 08.10.23 15:13, Kwok Cheung Yeung wrote: > This patch adds support for the 'indirect' clause in the 'declare > target' directive in C/C++ (Fortran to follow) and adds the necessary > infrastructure to support indirect calls in target regions. This allows > one to pass in pointers to functions that have been declared as indirect > from the host to the target, then invoked via the passed-in pointer on > the target device. > [...] > The C++ support is currently limited to normal indirect calls - virtual > calls on objects do not currently work. I believe the main issue is that > the vtables are not currently copied across to the target. I have added > some handling for OBJ_TYPE_REF to prevent the compiler from ICEing when > it encounters a virtual call, but without the vtable this cannot work > properly. Side remark: Fortran polymorphic variables are similar. For them also a vtable needs to be copied. (For vtables, see also comment to 'libgomp.texi' far below.) * * * C++11 (and C23) attribute do not seem to be properly handled: [[omp::decl (declare target,indirect(1))]] int foo(void) { return 5; } [[omp::decl (declare target indirect)]] int bar(void) { return 8; } [[omp::directive (begin declare target,indirect)]]; int baz(void) { return 11; } [[omp::directive (end declare target)]]; While I get for the last one ("baz"): __attribute__((omp declare target, omp declare target block, omp declare ta= rget indirect)) the first two (foo and bar) do not have any attribute; if I remove the "ind= irect", I do get "__attribute__((omp declare target))". Hence, the omp::decl suppor= t seems to partially work. NOTE: C23 omp:: attribute support is still WIP and not yet in mainline. Recent draft: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633007= .html The following works - but there is not a testcase for either syntax: int bar(void) { return 8; } [[omp::directive(declare target to(bar) , indirect(1))]]; int baz(void) { return 11; } [[omp::directive ( declare target indirect enter(baz))]]; int bar(void) { return 8; } #pragma omp declare target to(bar) , indirect(1) int baz(void) { return 11; } #pragma omp declare target indirect enter(baz) (There is one for #pragma + 'to' in gomp/declare-target-indirect-2.c, howev= er.) Side remark: OpenMP 5.2 renamed 'to' to 'enter' (with deprecated alias 'to)= ; hence, I also use 'enter' above. The current testcases for indiredt use 'en= ter'. (Not that it should make a difference as the to/enter do work.) The following seems to work fine, but I think we do not have a testcase for it ('bar' has no indirect, foo and baz have it): #pragma omp begin declare target indirect(1) int foo(void) { return 5; } #pragma omp begin declare target indirect(0) int bar(void) { return 8; } int baz(void) { return 11; } #pragma omp declare target indirect enter(baz) #pragma omp end declare target #pragma omp end declare target * * * Possibly affecting other logical flags as well, but I do notice that gcc but not g++ accepts the following: #pragma omp begin declare target indirect("abs") #pragma omp begin declare target indirect(5.5) g++ shows: error: expected constant integer expression OpenMP requires 'constant boolean' expr (OpenMP 5.1) or 'expression of logical type','constant' (OpenMP 5.2), where for the latter = it has: "The OpenMP *logical type* supports logical variables and expressions in an= y base language. "[C / C++] Any OpenMP logical expression is a scalar expression. This docum= ent uses true as a generic term for a non-zero integer value and false as a generic term for= an integer value of zero." I am not quite sure what to expect here; in terms of C++, conv.bool surely = permits those for those pvalues "Boolean conversions". For C, I don't find the wor= ding in the standard but 'if("abc")' and 'if (5.5)' is accepted. * * * I notice that the {__builtin_,}GOMP_target_map_indirect_ptr call is inserte= d quite late, i.e. in omp-offload.cc. A dump and also looking at the *.s fil= es shows that the __builtin_GOMP_target_map_indirect_ptr / call GOMP_target_map_indirec= t_ptr do not only show up for the device but also for the host-fallback code. I think the latter is not required as a host pointer can be directly execut= ed on the host - and device -> host pointer like in omp target device(ancestor:1) do not need to be supported. Namely the current glossary (here git version but OpenMP 5.2 is very simila= r); note the "other than the host device": "indirect device invocation - An indirect call to the device version of a procedure on a device other than the host device, through a function pointe= r (C/C++), a pointer to a member function (C++) or a procedure pointer (Fortr= an) that refers to the host version of the procedure. Can't we use #ifdef ACCEL_COMPILER to optimize the host fallback? That way, we can also avoid generating the splay-tree on the host cf. LIBGOMP_OFFLOADED_ONLY. * * * #pragma omp begin declare target indirect(1) device_type(host) is accepted but it violates: OpenMP 5.1: "Restrictions to the declare target directive are as follows:" "If an indirect clause is present and invoked-by-fptr evaluates to true the= n the only permitted device_type clause is device_type(any)" [215:1-2] In OpenMP 5.2 that's in "7.8.3 indirect Clause" itself. * * * OpenMP permits pointers to member functions. Can you also a test for those? I bet it simply works but we should still test those. (For vtables, see also comment below.) class Foo { public: int f(int x); }; typedef int (Foo::*FooFptr)(int x); ... int my_call(Foo &foo) { FooFptr fn_ptr =3D &Foo::f; ... return std::invoke(fn_ptr, foo, 42); } * * * Side remarks for patch readers: Besides the existing offload functions/variables tables, a new indirect-functions table is created; functions there do aren't added to the function table (in the compiler-generated code). While the code obviously affects the performance, it does not hamper optimizations such as (modified libgomp.c-c++-common/declare-target-indirec= t-1.c): #pragma omp target map (from: x) { int (*foo_ptr) (void) =3D &foo; int (*bar_ptr) (void) =3D &bar; int (*baz_ptr) (void) =3D &baz; x =3D (*foo_ptr) () + (*bar_ptr) () + (*baz_ptr) (); } which still gives with -O1: _8 =3D *.omp_data_i_7(D).x; *_8 =3D 24; There was some discussion at OpenMP spec level to avoid the overhead via new assumptions which tell that 'indirect' can or cannot be encountered. That was a more resent side discussion of the generic questions in my Issue #3540. I think in general there aren't that many function pointers around and if the extra function call happens late enough (which seems to be the case, see above), it shouldn't be a real problem in most code, also because the indirect-reverse-lookup table is short and hopefully there won't be too many repeated lookups with either function pointers or C++ classes/Fortran polymorphic calls. * * * > --- a/gcc/omp-offload.cc +++ b/gcc/omp-offload.cc } + if > (omp_redirect_indirect_calls + && gimple_call_fndecl (stmt) =3D=3D > NULL_TREE) + { + gcall *orig_call =3D dyn_cast (stmt); + tree > call_fn =3D gimple_call_fn (stmt); + tree map_ptr_fn + =3D > builtin_decl_explicit (BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR); This line is too long. Maybe use a 'enum built_in_function' temporary? > --- a/libgomp/libgomp.texi +++ b/libgomp/libgomp.texi @@ -304,7 +304,7 > @@ The OpenMP 4.5 specification is fully supported. @item Iterators in > @code{target update} motion clauses and @code{map} clauses @tab N @tab > @item Indirect calls to the device version of a procedure or function > in - @code{target} regions @tab N @tab + @code{target} regions @tab P > @tab Only C and C++ I think we need a new entry to handle the virtual part. However, it looks as if that's a new OpenMP 5.2 feature. Can you add an entry under "Other new OpenMP 5.2 features2? At least I cannot find any existing entry and I only see in OpenMP 5.2: "Invoking a virtual member function of an object on a device other than the device on which the object was constructed results in unspeci=EF=AC=81e= d behavior, unless the object is accessible and was constructed on the host device." [OpenMP 5.2, 287:10-12] in "Restrictions to the target constr= uct". > --- a/libgomp/target.c +++ b/libgomp/target.c @@ -2256,11 +2256,14 @@ > gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned > version, void **host_funcs_end =3D ((void ***) host_table)[1]; void > **host_var_table =3D ((void ***) host_table)[2]; void **host_vars_end =3D > ((void ***) host_table)[3]; + void **host_ind_func_table =3D ((void ***) > host_table)[4]; + void **host_ind_funcs_end =3D ((void ***) host_table)[5= ]; This code assumes that all calls have now 6 arguments. But that's not true = for old code. It seems as if you have to bump the version number and only access th= ose values when the version number is sufficiently large. Thanks, Tobias ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstra=C3=9Fe 201= , 80634 M=C3=BCnchen; Gesellschaft mit beschr=C3=A4nkter Haftung; Gesch=C3= =A4ftsf=C3=BChrer: Thomas Heurung, Frank Th=C3=BCrauf; Sitz der Gesellschaf= t: M=C3=BCnchen; Registergericht M=C3=BCnchen, HRB 106955