From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id B20773858D37 for ; Mon, 10 Oct 2022 10:38:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org B20773858D37 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1665398336; h=from:from:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:in-reply-to:in-reply-to: references:references; bh=LPgrv+565ORJw++ZTgI0/7NZG6oj/mN8oEpWta79AWM=; b=FhhNlCeiSZVDbzbIjx5Lqxqo+6Lpn4MQZLfreLY4ywyWs9QEGf+aQossVmz0HmaJoz8MLA V85jF09P/MK9+MKuc2BELXNUqVAW9YCP3Wbl4XHjtOsb5czLr/fSwZnFgndBPgKcVZT1s/ nGDHt9H3vsgwhoEwnuVzKTWLWJCBeWg= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-561-gF1DD6hMOyaDtQSqdxeGHA-1; Mon, 10 Oct 2022 06:38:52 -0400 X-MC-Unique: gF1DD6hMOyaDtQSqdxeGHA-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 2362A1019C89; Mon, 10 Oct 2022 10:38:52 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.39.192.55]) by smtp.corp.redhat.com (Postfix) with ESMTPS id D37E2C2C8D5; Mon, 10 Oct 2022 10:38:51 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 29AAcmcd3174263 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Mon, 10 Oct 2022 12:38:49 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 29AAclTr3174262; Mon, 10 Oct 2022 12:38:47 +0200 Date: Mon, 10 Oct 2022 12:38:47 +0200 From: Jakub Jelinek To: Julian Brown Cc: gcc-patches@gcc.gnu.org, fortran@gcc.gnu.org, Tobias Burnus Subject: Re: [PATCH v4 4/4] OpenMP/OpenACC: Unordered/non-constant component offset struct mapping Message-ID: Reply-To: Jakub Jelinek References: <3ff03cb463d35ffe96b1271a146f24899b2cb573.1665351785.git.julian@codesourcery.com> MIME-Version: 1.0 In-Reply-To: <3ff03cb463d35ffe96b1271a146f24899b2cb573.1665351785.git.julian@codesourcery.com> X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-3.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Sun, Oct 09, 2022 at 02:51:37PM -0700, Julian Brown wrote: > This patch adds support for non-constant component offsets in "map" > clauses for OpenMP (and the equivalants for OpenACC), which are not able > to be sorted into order at compile time. Normally struct accesses in > such clauses are gathered together and sorted into increasing address > order after a "GOMP_MAP_STRUCT" node: if we have variable indices, > that is no longer possible. > > This patch adds support for such mappings by adding a new variant of > GOMP_MAP_STRUCT that does not require the list of nodes following to > be in sorted order. This passes right down to the runtime: the list is > sorted in libgomp according to the dynamic values of the offsets after > the newly-introduced GOMP_MAP_STRUCT_UNORD node. > > This mostly affects arrays of structs indexed by variables in C and C++, > but can also affect derived-type arrays with constant indexes when they > have an array descriptor. I don't understand why this is needed. We have a restriction that ought to make all such cases invalid: "If multiple list items are explicitly mapped on the same construct and have the same containing array or have base pointers that share original storage, and if any of the list items do not have corresponding list items that are present in the device data environment prior to a task encountering the construct, then the list items must refer to the same array elements of either the containing array or the implicit array of the base pointers." So, all those #pragma omp target map(t->a[i].p, t->a[j].p) etc. cases are invalid unless i == j, so IMNSHO one doesn't need to worry about unordered cases. > +#if defined(_GNU_SOURCE) || defined(__GNUC__) > +static int > +compare_addr_r (const void *a, const void *b, void *data) > +{ > + void **hostaddrs = (void **) data; > + int ai = *(int *) a, bi = *(int *) b; > + if (hostaddrs[ai] < hostaddrs[bi]) > + return -1; > + else if (hostaddrs[ai] > hostaddrs[bi]) > + return 1; > + return 0; > +} > +#endif Note, not all glibcs have qsort_r and _GNU_SOURCE macro being defined doesn't imply glibc. You'd need something like _GLIBC_PREREQ macro and require 2.8 or later. > + > static inline __attribute__((always_inline)) struct target_mem_desc * > gomp_map_vars_internal (struct gomp_device_descr *devicep, > struct goacc_asyncqueue *aq, size_t mapnum, > @@ -968,6 +982,17 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep, > tgt->device_descr = devicep; > tgt->prev = NULL; > struct gomp_coalesce_buf cbuf, *cbufp = NULL; > + size_t hostaddr_idx; > + > +#if !defined(_GNU_SOURCE) && defined(__GNUC__) > + /* If we don't have _GNU_SOURCE (thus no qsort_r), but we are compiling with > + GCC (and why wouldn't we be?), we can use this nested function for > + regular qsort. */ > + int compare_addr (const void *a, const void *b) > + { > + return compare_addr_r (a, b, (void *) &hostaddrs[hostaddr_idx]); > + } > +#endif Please don't use nested functions in libgomp. > + int *order = NULL; > + if ((kind & typemask) == GOMP_MAP_STRUCT_UNORD) > + { > + order = (int *) gomp_alloca (sizeof (int) * sizes[i]); > + for (int j = 0; j < sizes[i]; j++) > + order[j] = j; > +#ifdef _GNU_SOURCE > + qsort_r (order, sizes[i], sizeof (int), &compare_addr_r, > + &hostaddrs[i + 1]); > +#elif defined(__GNUC__) > + hostaddr_idx = i + 1; > + qsort (order, sizes[i], sizeof (int), &compare_addr); > +#else > +#error no threadsafe qsort > +#endif This is too ugly. If this is really needed (see above) and you need fallback for missing qsort_t, better sort array of tuples containing the order number and some data pointer needed for the comparison routine. Jakub