From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa4.mentor.iphmx.com (esa4.mentor.iphmx.com [68.232.137.252]) by sourceware.org (Postfix) with ESMTPS id 21B933858422 for ; Thu, 9 Nov 2023 16:00:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 21B933858422 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 21B933858422 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=68.232.137.252 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699545625; cv=none; b=PlVVn2aoQCCSwsGAh2KqTiyhmx/EA8GdkaMPrV5hPP5/kNag48MY8/EFdhfjKkIgG6wnVHElcV2i9AqWjjNG+7assRObnMTrnrYySfkkfUbUTlcwjEKAUhydsxoIPFF+vUA11feoVQzUdIm0dh9WgWnXjsf9AQTVl89xkv2R9VI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1699545625; c=relaxed/simple; bh=tiL0MyrEyWdATTXQDv4bDC05X4XnuuUEKvPQEI7r74Y=; h=Message-ID:Date:MIME-Version:Subject:To:From; b=JK2o0y/dAcrkwNEIBexVVWdlgXdRLhZl15Gv+Z0Vf0UNJoqGndkLI3VtjKOlBf7NKlIQI1qsKQngBxSDaSnn78JrdaN4hF4NYG/vT2wXrexAmVW+ImMw9r0vvzOZcWacEd0tzb3+5qpVkMi9bsEGwPJsTwxemdN9KfoGIe/Lgyo= ARC-Authentication-Results: i=1; server2.sourceware.org X-CSE-ConnectionGUID: cHUA60SdSMC7bdxBv5ZJeQ== X-CSE-MsgGUID: vGcDsJvZRguSWxhPGf9CAg== X-IronPort-AV: E=Sophos;i="6.03,289,1694764800"; d="scan'208";a="22355788" Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa4.mentor.iphmx.com with ESMTP; 09 Nov 2023 08:00:16 -0800 IronPort-SDR: WI+qXt5punCMU2xEWbmcA42xWvQ2ge2fFucz/kwNAQrx1zVTQjpxkb4voZCk2C7cZRS/M9cGQH Pv3H6+ZsytG9PKXzWAnjPFZCGsTbg1aTOr0HfY1UPVlu/hbeBoo+kNbMO6rQPAEkjHiOAM18Y9 d0WQmdIJ7A7pUgAivdPWTG1LgCcLq3iv4k7Ae1zze0KJPgWySV1tsHlSOGP2izzgBR0TIj5Uge jpseMrhXP+WQRAF3nRgEZCoo+AxAKpnBEGjQhbR6FOEzAOSEdNOz1rBfdd9NhY05xa8sEyybEO B2o= Message-ID: Date: Thu, 9 Nov 2023 17:00:11 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH] openmp: Add support for the 'indirect' clause in C/C++ To: Thomas Schwinge , Kwok Cheung Yeung CC: , Jakub Jelinek References: <37f412ee-58e7-4bde-a763-591268e8f8f4@codesourcery.com> <87wmurru61.fsf@euler.schwinge.homeip.net> Content-Language: en-US From: Tobias Burnus In-Reply-To: <87wmurru61.fsf@euler.schwinge.homeip.net> Content-Type: text/plain; charset="UTF-8"; format=flowed Content-Transfer-Encoding: quoted-printable X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) To svr-ies-mbx-12.mgc.mentorg.com (139.181.222.12) X-Spam-Status: No, score=-5.3 required=5.0 tests=BAYES_00,HEADER_FROM_DIFFERENT_DOMAINS,KAM_DMARC_STATUS,SPF_HELO_PASS,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Thomas, hi Kwok, (Skipping some valid (review) comments and bare remarks.) On 09.11.23 13:24, Thomas Schwinge wrote: > Also, assuming that the order of appearance of 'IND_FUNC_MAP' does matter > as it does for 'FUNC_MAP', ... https://github.com/MentorEmbedded/nvptx-to= ols/pull/29 ... It should matter. Thus, we should indeed update nvptx-tools for this! For hello-world it probably does not show up that easily as there are only very few such tagged functions. But especially once it gets used for C++ virtual functions, the number of function will be that large that the ordering issue is likely to occur in the real world. (I shouldn't have missed this =E2=80=93 given that I debugging and reported= the original issue.) [...] > Maybe, though, we should generally have separate tags for offloading use? > Possibly aliasing (in value) the LTO ones -- but maybe actually not, to > improve "type safety". I shall look into that, later. Regarding LTO, my long-term plans is to have the variables visible to the c= ompiler, i.e. writing indeed something like: __offload_vars[10] =3D [&A, &my_var, ... ]; and then set __offload_vars's node to force_output. The IPA can then see th= at 'A's address is used (such that '&A' does not disappear) but it can still do optimizatio= ns which are currently ruled out because we do set 'force_output'. Currently, we set force_output to all such nodes, but that prevents several= optimizations which could be done - we just don't want that the variable disappears. (There is = a PR about the missed optimization.) >> --- a/gcc/tree-core.h >> +++ b/gcc/tree-core.h >> @@ -350,6 +350,9 @@ enum omp_clause_code { >> /* OpenMP clause: doacross ({source,sink}:vec). */ >> OMP_CLAUSE_DOACROSS, >> >> + /* OpenMP clause: indirect [(constant-integer-expression)]. */ >> + OMP_CLAUSE_INDIRECT, >> + >> /* Internal structure to hold OpenACC cache directive's variable-lis= t. >> #pragma acc cache (variable-list). */ >> OMP_CLAUSE__CACHE_, > In this position here, isn't 'OMP_CLAUSE_INDIRECT' applicable to the > 'OMP_CLAUSE_RANGE_CHECK' in 'gcc/tree.h:OMP_CLAUSE_SIZE' and > 'gcc/tree.h:OMP_CLAUSE_DECL': > > #define OMP_CLAUSE_SIZE(NODE) = \ > OMP_CLAUSE_OPERAND (OMP_CLAUSE_RANGE_CHECK (OMP_CLAUSE_CHECK (NODE= ), \ > OMP_CLAUSE_FROM, = \ > OMP_CLAUSE__CACHE_), 1) > > #define OMP_CLAUSE_DECL(NODE) = \ > OMP_CLAUSE_OPERAND (OMP_CLAUSE_RANGE_CHECK (OMP_CLAUSE_CHECK (NODE= ), \ > OMP_CLAUSE_PRIVATE, = \ > OMP_CLAUSE__SCANTEMP_), 0) We may need to check whether the range check is correct for the other item or whether some others sneaked in as well. But I concur, the OMP_CLAUSE_INDIRECT indeed looks misplaced. (BTW: OMP_CLAUSE_INDIRECT is only used intermittendly in the C/C++ FEs and not in the ME as it is soon turned into an attribute string.) > I would've assumed handling for 'OMP_CLAUSE_INDIRECT' to also be > necessary in the following places: > > - 'gcc/c-family/c-omp.cc:c_omp_split_clauses' "split_clauses" applies only to combined composite constructs like 'target'+'parallel' +'for' + 'simd' where clauses have to be added to the right constituent clause(s). Declarative directives cannot be combined. > - 'gcc/cp/pt.cc:tsubst_omp_clauses', > - 'gcc/gimplify.cc:gimplify_scan_omp_clauses', > 'gcc/gimplify.cc:gimplify_adjust_omp_clauses' > - 'gcc/omp-low.cc:scan_sharing_clauses' (twice) > - 'gcc/tree-nested.cc:convert_nonlocal_omp_clauses', > 'gcc/tree-nested.cc:convert_local_omp_clauses' > - 'gcc/tree-pretty-print.cc:dump_omp_clause' Most of those seem to relate to executable directives =E2=80=93 and not to declarative ones, where we attach DECL_ATTRIBUTES to a decl and process them. For functions, the pretty printer prints the attributes. Here, we use "omp declare target indirect" as attribute. We use noclone,noinline attributes for 'declare target', thus, there should be no issue on this side and regarding tsubst_omp_clauses, as the clause is either present or not (either bare or with a parse-time constant logical), there is not much post processing needed. Thus, I bet that there is nothing to do for those. > Please verify, and add handling as well as test cases as necessary, or, > as applicable, put 'case OMP_CLAUSE_INDIRECT:' next to > 'default: gcc_unreachable ();' etc., if indeed that clause is not > expected there. What's the point of having it next to default if it is gcc_unreachable? I mean there are several others which shouldn't be needed like OMP_CLAUSE_DEVICE_TYPE which also does not show up at gcc/cp/pt.cc. > In this file here: > >> +++ b/libgomp/config/accel/target-indirect.c >> >> [...] >> +volatile void **GOMP_INDIRECT_ADDR_MAP =3D NULL; >> >> [...] >> +build_indirect_map (void) >> +{ >> + size_t num_ind_funcs =3D 0; >> + volatile void **map_entry; >> [...] >> + for (map_entry =3D GOMP_INDIRECT_ADDR_MAP; *map_entry; >> + map_entry +=3D 2, num_ind_funcs++); >> [...] >> + map_entry =3D GOMP_INDIRECT_ADDR_MAP; >> + >> + for (int i =3D 0; i < num_ind_funcs; i++, array++) >> + { >> + indirect_splay_tree_key k =3D &array->key; >> + k->host_addr =3D (uint64_t) *map_entry++; >> + k->target_addr =3D (uint64_t) *map_entry++; >> [...] >> +} >> [...] >> +#else >> [...] >> +void * >> +GOMP_target_map_indirect_ptr (void *ptr) >> [...] >> + for (volatile void **map_entry =3D GOMP_INDIRECT_ADDR_MAP; *map_entry= ; >> + map_entry +=3D 2) >> + if (*map_entry =3D=3D ptr) >> + return (void *) *(map_entry + 1); >> + >> + return ptr; >> +} >> + >> +#endif > ..., I'm curious why certain variables are declared 'volatile'? Is that > really the right approach for whatever exactly the (concurrency?) > requirements here are? The variable GOMP_INDIRECT_ADDR_MAP itself is set non-concurrently via GOMP= _OFFLOAD_load_image. When the kernel is run, it should not be touched. Thus, I concur that 'volatile' should not be needed at all. >> --- a/libgomp/config/gcn/team.c >> +++ b/libgomp/config/gcn/team.c >> @@ -30,6 +30,7 @@ >> +extern void build_indirect_map (void); > Why not generally have a prototype for this (new > 'libgomp/config/accel/target-indirect.h', or maybe just > 'libgomp/libgomp.h'?)? > >> @@ -45,6 +46,9 @@ gomp_gcn_enter_kernel (void) >> { >> int threadid =3D __builtin_gcn_dim_pos (1); >> > Shouldn't this: > >> + /* Initialize indirect function support. */ >> + build_indirect_map (); >> + > ... be called inside here: > >> if (threadid =3D=3D 0) >> { > ..., so that it's only executed by one thread? (concur) > Also, for my understanding: why is 'build_indirect_map' done at kernel > invocation time (here) instead of at image load time? The splay_tree is generated on the device itself - and we currently do not start a kernel during GOMP_OFFLOAD_load_image. We could, the question is whether it makes sense. (Generating the splay_tree on the host for the device is a hassle and error prone as it needs to use device pointers at the end.) >> +++ b/libgomp/testsuite/libgomp.c-c++-common/declare-target-indirect-2.c >> + switch (i % 3) >> + { >> + case 0: fn_ptr[i] =3D &foo; >> + case 1: fn_ptr[i] =3D &bar; >> + case 2: fn_ptr[i] =3D &baz; >> + } > [...]/libgomp.c-c++-common/declare-target-indirect-2.c:20:27: warnin= g: this statement may fall through [-Wimplicit-fallthrough=3D] Indeed a 'break;' would be good. Tobias ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstra=C3=9Fe 201= , 80634 M=C3=BCnchen; Gesellschaft mit beschr=C3=A4nkter Haftung; Gesch=C3= =A4ftsf=C3=BChrer: Thomas Heurung, Frank Th=C3=BCrauf; Sitz der Gesellschaf= t: M=C3=BCnchen; Registergericht M=C3=BCnchen, HRB 106955