public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [RFC] Offloading Support in libgomp
       [not found]                         ` <20130913131556.GD1817@tucnak.redhat.com>
@ 2013-09-13 16:19                           ` Jakub Jelinek
  2013-09-13 16:22                             ` Marek Polacek
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Jelinek @ 2013-09-13 16:19 UTC (permalink / raw)
  To: Michael V. Zolotukhin
  Cc: Kirill Yukhin, Richard Henderson, gcc-patches, triegel

On Fri, Sep 13, 2013 at 03:15:56PM +0200, Jakub Jelinek wrote:
> On Fri, Sep 13, 2013 at 05:11:09PM +0400, Michael V. Zolotukhin wrote:
> > > FYI, I'm attaching a WIP patch with the splay tree stuff.
> > Thanks, I'll take a look.  By the way, isn't it better to move splay-tree
> > implementation to a separate file?
> 
> As it is just a few routines, heavily modified from include/splay-tree.h
> (e.g. the data structures contain all the target.c specific stuff), and will be
> used just in target.c, I think it is fine to keep it in target.c.

Anyway, here is an updated patch that moves the splay stuff into
splay-tree.h and cleans up a bunch of other things.

Will commit once the http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01044.html
issue is resolved.

2013-09-13  Jakub Jelinek  <jakub@redhat.com>

	* ipa-prop.c (ipa_compute_jump_functions_for_edge): Return early
	for internal calls.

--- gcc/ipa-prop.c.jj	2013-09-13 16:48:54.000000000 +0200
+++ gcc/ipa-prop.c	2013-09-13 17:28:28.086058903 +0200
@@ -1551,6 +1551,8 @@ ipa_compute_jump_functions_for_edge (str
     return;
   vec_safe_grow_cleared (args->jump_functions, arg_num);
 
+  if (gimple_call_internal_p (call))
+    return;
   if (ipa_func_spec_opts_forbid_analysis_p (cs->caller))
     return;
 


	Jakub

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Offloading Support in libgomp
  2013-09-13 16:19                           ` [RFC] Offloading Support in libgomp Jakub Jelinek
@ 2013-09-13 16:22                             ` Marek Polacek
  2013-09-15  9:12                               ` Jakub Jelinek
  0 siblings, 1 reply; 12+ messages in thread
From: Marek Polacek @ 2013-09-13 16:22 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Michael V. Zolotukhin, Kirill Yukhin, Richard Henderson,
	gcc-patches, triegel

On Fri, Sep 13, 2013 at 05:35:27PM +0200, Jakub Jelinek wrote:
> On Fri, Sep 13, 2013 at 03:15:56PM +0200, Jakub Jelinek wrote:
> > On Fri, Sep 13, 2013 at 05:11:09PM +0400, Michael V. Zolotukhin wrote:
> > > > FYI, I'm attaching a WIP patch with the splay tree stuff.
> > > Thanks, I'll take a look.  By the way, isn't it better to move splay-tree
> > > implementation to a separate file?
> > 
> > As it is just a few routines, heavily modified from include/splay-tree.h
> > (e.g. the data structures contain all the target.c specific stuff), and will be
> > used just in target.c, I think it is fine to keep it in target.c.
> 
> Anyway, here is an updated patch that moves the splay stuff into
> splay-tree.h and cleans up a bunch of other things.
> 
> Will commit once the http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01044.html
> issue is resolved.
> 
> 2013-09-13  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* ipa-prop.c (ipa_compute_jump_functions_for_edge): Return early
> 	for internal calls.

Seems like a wrong patch is attached.

	Marek

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Offloading Support in libgomp
  2013-09-13 16:22                             ` Marek Polacek
@ 2013-09-15  9:12                               ` Jakub Jelinek
  2013-09-15 12:51                                 ` Michael V. Zolotukhin
  2019-12-07 14:23                                 ` Thomas Schwinge
  0 siblings, 2 replies; 12+ messages in thread
From: Jakub Jelinek @ 2013-09-15  9:12 UTC (permalink / raw)
  To: Marek Polacek
  Cc: Michael V. Zolotukhin, Kirill Yukhin, Richard Henderson,
	gcc-patches, triegel

On Fri, Sep 13, 2013 at 05:41:03PM +0200, Marek Polacek wrote:
> On Fri, Sep 13, 2013 at 05:35:27PM +0200, Jakub Jelinek wrote:
> > On Fri, Sep 13, 2013 at 03:15:56PM +0200, Jakub Jelinek wrote:
> > > On Fri, Sep 13, 2013 at 05:11:09PM +0400, Michael V. Zolotukhin wrote:
> > > > > FYI, I'm attaching a WIP patch with the splay tree stuff.
> > > > Thanks, I'll take a look.  By the way, isn't it better to move splay-tree
> > > > implementation to a separate file?
> > > 
> > > As it is just a few routines, heavily modified from include/splay-tree.h
> > > (e.g. the data structures contain all the target.c specific stuff), and will be
> > > used just in target.c, I think it is fine to keep it in target.c.
> > 
> > Anyway, here is an updated patch that moves the splay stuff into
> > splay-tree.h and cleans up a bunch of other things.
> > 
> > Will commit once the http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01044.html
> > issue is resolved.
> > 
> > 2013-09-13  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	* ipa-prop.c (ipa_compute_jump_functions_for_edge): Return early
> > 	for internal calls.
> 
> Seems like a wrong patch is attached.

You're right, here is the right one.

2013-09-13  Jakub Jelinek  <jakub@redhat.com>

	* splay-tree.h: New file.
	* target.c: Include stdbool.h.
	(splay_tree_node, splay_tree, splay_tree_key): New typedefs.
	(struct target_mem_desc, struct splay_tree_key_s): New structures.
	(splay_compare): New inline function.
	(gomp_get_num_devices): New function.
	(resolve_device): Use default_device_var ICV.  Add temporarily
	magic testing device number 257.
	(dev_splay_tree, dev_env_lock): New variables.
	(gomp_map_vars_existing, gomp_map_vars, gomp_unmap_tgt,
	gomp_unmap_vars, gomp_update): New functions.
	(GOMP_target, GOMP_target_data, GOMP_target_end_data,
	GOMP_target_update): Add support for magic testing device number 257.
	* libgomp.h (struct target_mem_desc): Forward declare.
	(struct gomp_task_icv): Add default_device_var and target_data.
	(gomp_get_num_devices): New prototype.
	* env.c (gomp_global_icv): Add default_device_var initializer.
	(parse_int): New function.
	(handle_omp_display_env): Print OMP_DEFAULT_DEVICE.
	(initialize_env): Initialize default_device_var.
	(omp_set_default_device): Set default_device_var ICV.
	(omp_get_default_device): Query default_device_var ICV.
	(omp_get_num_devices): Call gomp_get_num_devices.
	(omp_get_num_teams, omp_get_team_num, omp_is_initial_device): Add
	comments.

--- libgomp/splay-tree.h.jj	2013-09-13 16:32:48.381973559 +0200
+++ libgomp/splay-tree.h	2013-09-13 16:41:38.059701560 +0200
@@ -0,0 +1,232 @@
+/* A splay-tree datatype.
+   Copyright 1998-2013
+   Free Software Foundation, Inc.
+   Contributed by Mark Mitchell (mark@markmitchell.com).
+
+   This file is part of the GNU OpenMP Library (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* The splay tree code copied from include/splay-tree.h and adjusted,
+   so that all the data lives directly in splay_tree_node_s structure
+   and no extra allocations are needed.
+
+   Files including this header should before including it add:
+typedef struct splay_tree_node_s *splay_tree_node;
+typedef struct splay_tree_s *splay_tree;
+typedef struct splay_tree_key_s *splay_tree_key;
+   define splay_tree_key_s structure, and define
+   splay_compare inline function.  */
+
+/* For an easily readable description of splay-trees, see:
+
+     Lewis, Harry R. and Denenberg, Larry.  Data Structures and Their
+     Algorithms.  Harper-Collins, Inc.  1991.
+
+   The major feature of splay trees is that all basic tree operations
+   are amortized O(log n) time for a tree with n nodes.  */
+
+/* The nodes in the splay tree.  */
+struct splay_tree_node_s {
+  struct splay_tree_key_s key;
+  /* The left and right children, respectively.  */
+  splay_tree_node left;
+  splay_tree_node right;
+};
+
+/* The splay tree.  */
+struct splay_tree_s {
+  splay_tree_node root;
+};
+
+/* Rotate the edge joining the left child N with its parent P.  PP is the
+   grandparents' pointer to P.  */
+
+static inline void
+rotate_left (splay_tree_node *pp, splay_tree_node p, splay_tree_node n)
+{
+  splay_tree_node tmp;
+  tmp = n->right;
+  n->right = p;
+  p->left = tmp;
+  *pp = n;
+}
+
+/* Rotate the edge joining the right child N with its parent P.  PP is the
+   grandparents' pointer to P.  */
+
+static inline void
+rotate_right (splay_tree_node *pp, splay_tree_node p, splay_tree_node n)
+{
+  splay_tree_node tmp;
+  tmp = n->left;
+  n->left = p;
+  p->right = tmp;
+  *pp = n;
+}
+
+/* Bottom up splay of KEY.  */
+
+static void
+splay_tree_splay (splay_tree sp, splay_tree_key key)
+{
+  if (sp->root == NULL)
+    return;
+
+  do {
+    int cmp1, cmp2;
+    splay_tree_node n, c;
+
+    n = sp->root;
+    cmp1 = splay_compare (key, &n->key);
+
+    /* Found.  */
+    if (cmp1 == 0)
+      return;
+
+    /* Left or right?  If no child, then we're done.  */
+    if (cmp1 < 0)
+      c = n->left;
+    else
+      c = n->right;
+    if (!c)
+      return;
+
+    /* Next one left or right?  If found or no child, we're done
+       after one rotation.  */
+    cmp2 = splay_compare (key, &c->key);
+    if (cmp2 == 0
+	|| (cmp2 < 0 && !c->left)
+	|| (cmp2 > 0 && !c->right))
+      {
+	if (cmp1 < 0)
+	  rotate_left (&sp->root, n, c);
+	else
+	  rotate_right (&sp->root, n, c);
+	return;
+      }
+
+    /* Now we have the four cases of double-rotation.  */
+    if (cmp1 < 0 && cmp2 < 0)
+      {
+	rotate_left (&n->left, c, c->left);
+	rotate_left (&sp->root, n, n->left);
+      }
+    else if (cmp1 > 0 && cmp2 > 0)
+      {
+	rotate_right (&n->right, c, c->right);
+	rotate_right (&sp->root, n, n->right);
+      }
+    else if (cmp1 < 0 && cmp2 > 0)
+      {
+	rotate_right (&n->left, c, c->right);
+	rotate_left (&sp->root, n, n->left);
+      }
+    else if (cmp1 > 0 && cmp2 < 0)
+      {
+	rotate_left (&n->right, c, c->left);
+	rotate_right (&sp->root, n, n->right);
+      }
+  } while (1);
+}
+
+/* Insert a new NODE into SP.  The NODE shouldn't exist in the tree.  */
+
+static void
+splay_tree_insert (splay_tree sp, splay_tree_node node)
+{
+  int comparison = 0;
+
+  splay_tree_splay (sp, &node->key);
+
+  if (sp->root)
+    comparison = splay_compare (&sp->root->key, &node->key);
+
+  if (sp->root && comparison == 0)
+    abort ();
+  else
+    {
+      /* Insert it at the root.  */
+      if (sp->root == NULL)
+	node->left = node->right = NULL;
+      else if (comparison < 0)
+	{
+	  node->left = sp->root;
+	  node->right = node->left->right;
+	  node->left->right = NULL;
+	}
+      else
+	{
+	  node->right = sp->root;
+	  node->left = node->right->left;
+	  node->right->left = NULL;
+	}
+
+      sp->root = node;
+    }
+}
+
+/* Remove node with KEY from SP.  It is not an error if it did not exist.  */
+
+static void
+splay_tree_remove (splay_tree sp, splay_tree_key key)
+{
+  splay_tree_splay (sp, key);
+
+  if (sp->root && splay_compare (&sp->root->key, key) == 0)
+    {
+      splay_tree_node left, right;
+
+      left = sp->root->left;
+      right = sp->root->right;
+
+      /* One of the children is now the root.  Doesn't matter much
+	 which, so long as we preserve the properties of the tree.  */
+      if (left)
+	{
+	  sp->root = left;
+
+	  /* If there was a right child as well, hang it off the
+	     right-most leaf of the left child.  */
+	  if (right)
+	    {
+	      while (left->right)
+		left = left->right;
+	      left->right = right;
+	    }
+	}
+      else
+	sp->root = right;
+    }
+}
+
+/* Lookup KEY in SP, returning NODE if present, and NULL
+   otherwise.  */
+
+static splay_tree_key
+splay_tree_lookup (splay_tree sp, splay_tree_key key)
+{
+  splay_tree_splay (sp, key);
+
+  if (sp->root && splay_compare (&sp->root->key, key) == 0)
+    return &sp->root->key;
+  else
+    return NULL;
+}
--- libgomp/target.c.jj	2013-09-09 17:41:02.290429613 +0200
+++ libgomp/target.c	2013-09-13 16:41:24.514770386 +0200
@@ -26,15 +26,383 @@
    creation and termination.  */
 
 #include "libgomp.h"
+#include <stdbool.h>
 #include <stdlib.h>
 #include <string.h>
 
+/* Forward declaration for a node in the tree.  */
+typedef struct splay_tree_node_s *splay_tree_node;
+typedef struct splay_tree_s *splay_tree;
+typedef struct splay_tree_key_s *splay_tree_key;
+
+struct target_mem_desc {
+  /* Reference count.  */
+  uintptr_t refcount;
+  /* All the splay nodes allocated together.  */
+  splay_tree_node array;
+  /* Start of the target region.  */
+  uintptr_t tgt_start;
+  /* End of the targer region.  */
+  uintptr_t tgt_end;
+  /* Handle to free.  */
+  void *to_free;
+  /* Previous target_mem_desc.  */
+  struct target_mem_desc *prev;
+  /* Number of items in following list.  */
+  size_t list_count;
+  /* List of splay keys to remove (or decrease refcount)
+     at the end of region.  */
+  splay_tree_key list[];
+};
+
+struct splay_tree_key_s {
+  /* Address of the host object.  */
+  uintptr_t host_start;
+  /* Address immediately after the host object.  */
+  uintptr_t host_end;
+  /* Descriptor of the target memory.  */
+  struct target_mem_desc *tgt;
+  /* Offset from tgt->tgt_start to the start of the target object.  */
+  uintptr_t tgt_offset;
+  /* Reference count.  */
+  uintptr_t refcount;
+  /* True if data should be copied from device to host at the end.  */
+  bool copy_from;
+};
+
+/* The comparison function.  */
+
+static int
+splay_compare (splay_tree_key x, splay_tree_key y)
+{
+  if (x->host_start == x->host_end
+      && y->host_start == y->host_end)
+    return 0;
+  if (x->host_end <= y->host_start)
+    return -1;
+  if (x->host_start >= y->host_end)
+    return 1;
+  return 0;
+}
+
+#include "splay-tree.h"
+
+attribute_hidden int
+gomp_get_num_devices (void)
+{
+  /* FIXME: Scan supported accelerators when called the first time.  */
+  return 0;
+}
+
 static int
 resolve_device (int device)
 {
+  if (device == -1)
+    {
+      struct gomp_task_icv *icv = gomp_icv (false);
+      device = icv->default_device_var;
+    }
+  /* FIXME: Temporary hack for testing non-shared address spaces on host.  */
+  if (device == 257)
+    return 257;
+  if (device >= gomp_get_num_devices ())
+    return -1;
   return -1;
 }
 
+/* These variables would be per-accelerator (which doesn't have shared address
+   space.  */
+static struct splay_tree_s dev_splay_tree;
+static gomp_mutex_t dev_env_lock;
+
+/* Handle the case where splay_tree_lookup found oldn for newn.
+   Helper function of gomp_map_vars.  */
+
+static inline void
+gomp_map_vars_existing (splay_tree_key oldn, splay_tree_key newn,
+			unsigned char kind)
+{
+  if (oldn->host_start > newn->host_start
+      || oldn->host_end < newn->host_end)
+    gomp_fatal ("Trying to map into device [%p..%p) object when"
+		"[%p..%p) is already mapped",
+		(void *) newn->host_start, (void *) newn->host_end,
+		(void *) oldn->host_start, (void *) oldn->host_end);
+  if (((kind & 7) == 2 || (kind & 7) == 3)
+      && !oldn->copy_from
+      && oldn->host_start == newn->host_start
+      && oldn->host_end == newn->host_end)
+    oldn->copy_from = true;
+  oldn->refcount++;
+}
+
+static struct target_mem_desc *
+gomp_map_vars (size_t mapnum, void **hostaddrs, size_t *sizes,
+	       unsigned char *kinds, bool is_target)
+{
+  size_t i, tgt_align, tgt_size, not_found_cnt = 0;
+  struct splay_tree_key_s cur_node;
+  struct target_mem_desc *tgt
+    = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
+  tgt->list_count = mapnum;
+  tgt->refcount = 1;
+
+  if (mapnum == 0)
+    return tgt;
+
+  tgt_align = sizeof (void *);
+  tgt_size = 0;
+  if (is_target)
+    {
+      size_t align = 4 * sizeof (void *);
+      tgt_align = align;
+      tgt_size = mapnum * sizeof (void *);
+    }
+
+  gomp_mutex_lock (&dev_env_lock);
+  for (i = 0; i < mapnum; i++)
+    {
+      cur_node.host_start = (uintptr_t) hostaddrs[i];
+      if ((kinds[i] & 7) != 4)
+	cur_node.host_end = cur_node.host_start + sizes[i];
+      else
+	cur_node.host_end = cur_node.host_start + sizeof (void *);
+      splay_tree_key n = splay_tree_lookup (&dev_splay_tree, &cur_node);
+      if (n)
+	{
+	  tgt->list[i] = n;
+	  gomp_map_vars_existing (n, &cur_node, kinds[i]);
+	}
+      else
+	{
+	  size_t align = (size_t) 1 << (kinds[i] >> 3);
+	  tgt->list[i] = NULL;
+	  not_found_cnt++;
+	  if (tgt_align < align)
+	    tgt_align = align;
+	  tgt_size = (tgt_size + align - 1) & ~(align - 1);
+	  tgt_size += cur_node.host_end - cur_node.host_start;
+	}
+    }
+
+  if (not_found_cnt || is_target)
+    {
+      /* FIXME: This would be accelerator memory allocation, not
+	 host, and should allocate tgt_align aligned tgt_size block
+	 of memory.  */
+      tgt->to_free = gomp_malloc (tgt_size + tgt_align - 1);
+      tgt->tgt_start = (uintptr_t) tgt->to_free;
+      tgt->tgt_start = (tgt->tgt_start + tgt_align - 1) & ~(tgt_align - 1);
+      tgt->tgt_end = tgt->tgt_start + tgt_size;
+    }
+
+  tgt_size = 0;
+  if (is_target)
+    tgt_size = mapnum * sizeof (void *);
+
+  if (not_found_cnt)
+    {
+      tgt->array = gomp_malloc (not_found_cnt * sizeof (*tgt->array));
+      splay_tree_node array = tgt->array;
+
+      for (i = 0; i < mapnum; i++)
+	if (tgt->list[i] == NULL)
+	  {
+	    splay_tree_key k = &array->key;
+	    k->host_start = (uintptr_t) hostaddrs[i];
+	    if ((kinds[i] & 7) != 4)
+	      k->host_end = k->host_start + sizes[i];
+	    else
+	      k->host_end = k->host_start + sizeof (void *);
+	    splay_tree_key n
+	      = splay_tree_lookup (&dev_splay_tree, k);
+	    if (n)
+	      {
+		tgt->list[i] = n;
+		gomp_map_vars_existing (n, k, kinds[i]);
+	      }
+	    else
+	      {
+		size_t align = (size_t) 1 << (kinds[i] >> 3);
+		tgt->list[i] = k;
+		tgt_size = (tgt_size + align - 1) & ~(align - 1);
+		k->tgt = tgt;
+		k->tgt_offset = tgt_size;
+		tgt_size += k->host_end - k->host_start;
+		if ((kinds[i] & 7) == 2 || (kinds[i] & 7) == 3)
+		  k->copy_from = true;
+		k->refcount = 1;
+		tgt->refcount++;
+		array->left = NULL;
+		array->right = NULL;
+		splay_tree_insert (&dev_splay_tree, array);
+		switch (kinds[i] & 7)
+		  {
+		  case 0: /* ALLOC */
+		  case 2: /* FROM */
+		    break;
+		  case 1: /* TO */
+		  case 3: /* TOFROM */
+		    /* FIXME: This is supposed to be copy from host to device
+		       memory.  Perhaps add some smarts, like if copying
+		       several adjacent fields from host to target, use some
+		       host buffer to avoid sending each var individually.  */
+		    memcpy ((void *) (tgt->tgt_start + k->tgt_offset),
+			    (void *) k->host_start,
+			    k->host_end - k->host_start);
+		    break;
+		  case 4: /* POINTER */
+		    cur_node.host_start
+		      = (uintptr_t) *(void **) k->host_start;
+		    /* Add bias to the pointer value.  */
+		    cur_node.host_start += sizes[i];
+		    cur_node.host_end = cur_node.host_start + 1;
+		    n = splay_tree_lookup (&dev_splay_tree, &cur_node);
+		    if (n == NULL)
+		      {
+			/* Could be possibly zero size array section.  */
+			cur_node.host_end--;
+			n = splay_tree_lookup (&dev_splay_tree, &cur_node);
+			if (n == NULL)
+			  {
+			    cur_node.host_start--;
+			    n = splay_tree_lookup (&dev_splay_tree, &cur_node);
+			    cur_node.host_start++;
+			  }
+		      }
+		    if (n == NULL)
+		      gomp_fatal ("Pointer target of array section "
+				  "wasn't mapped");
+		    cur_node.host_start -= n->host_start;
+		    cur_node.tgt_offset = n->tgt->tgt_start + n->tgt_offset
+					  + cur_node.host_start;
+		    /* At this point tgt_offset is target address of the
+		       array section.  Now subtract bias to get what we want
+		       to initialize the pointer with.  */
+		    cur_node.tgt_offset -= sizes[i];
+		    /* FIXME: host to device copy, see above FIXME comment.  */
+		    memcpy ((void *) (tgt->tgt_start + k->tgt_offset),
+			    (void *) &cur_node.tgt_offset,
+			    sizeof (void *));
+		    break;
+		  }
+		array++;
+	      }
+	  }
+    }
+  if (is_target)
+    {
+      for (i = 0; i < mapnum; i++)
+	{
+	  cur_node.tgt_offset = tgt->list[i]->tgt->tgt_start
+				+ tgt->list[i]->tgt_offset;
+	  /* FIXME: host to device copy, see above FIXME comment.  */
+	  memcpy ((void *) (tgt->tgt_start + i * sizeof (void *)),
+		  (void *) &cur_node.tgt_offset,
+		  sizeof (void *));
+	}
+    }
+
+  gomp_mutex_unlock (&dev_env_lock);
+  return tgt;
+}
+
+static void
+gomp_unmap_tgt (struct target_mem_desc *tgt)
+{
+  /* FIXME: Deallocate on target the tgt->tgt_start .. tgt->tgt_end
+     region.  */
+  if (tgt->tgt_end)
+    free (tgt->to_free);
+
+  free (tgt->array);
+  free (tgt);
+}
+
+static void
+gomp_unmap_vars (struct target_mem_desc *tgt)
+{
+  if (tgt->list_count == 0)
+    {
+      free (tgt);
+      return;
+    }
+
+  size_t i;
+  gomp_mutex_lock (&dev_env_lock);
+  for (i = 0; i < tgt->list_count; i++)
+    if (tgt->list[i]->refcount > 1)
+      tgt->list[i]->refcount--;
+    else
+      {
+	splay_tree_key k = tgt->list[i];
+	if (k->copy_from)
+	  /* FIXME: device to host copy.  */
+	  memcpy ((void *) k->host_start,
+		  (void *) (k->tgt->tgt_start + k->tgt_offset),
+		  k->host_end - k->host_start);
+	splay_tree_remove (&dev_splay_tree, k);
+	if (k->tgt->refcount > 1)
+	  k->tgt->refcount--;
+	else
+	  gomp_unmap_tgt (k->tgt);
+      }
+
+  if (tgt->refcount > 1)
+    tgt->refcount--;
+  else
+    gomp_unmap_tgt (tgt);
+  gomp_mutex_unlock (&dev_env_lock);
+}
+
+static void
+gomp_update (size_t mapnum, void **hostaddrs, size_t *sizes,
+	     unsigned char *kinds)
+{
+  size_t i;
+  struct splay_tree_key_s cur_node;
+
+  if (mapnum == 0)
+    return;
+
+  gomp_mutex_lock (&dev_env_lock);
+  for (i = 0; i < mapnum; i++)
+    if (sizes[i])
+      {
+	cur_node.host_start = (uintptr_t) hostaddrs[i];
+	cur_node.host_end = cur_node.host_start + sizes[i];
+	splay_tree_key n = splay_tree_lookup (&dev_splay_tree, &cur_node);
+	if (n)
+	  {
+	    if (n->host_start > cur_node.host_start
+		|| n->host_end < cur_node.host_end)
+	      gomp_fatal ("Trying to update [%p..%p) object when"
+			  "only [%p..%p) is mapped",
+			  (void *) cur_node.host_start,
+			  (void *) cur_node.host_end,
+			  (void *) n->host_start,
+			  (void *) n->host_end);
+	    if ((kinds[i] & 7) == 1)
+	      /* FIXME: host to device copy.  */
+	      memcpy ((void *) (n->tgt->tgt_start + n->tgt_offset
+				+ cur_node.host_start - n->host_start),
+		      (void *) cur_node.host_start,
+		      cur_node.host_end - cur_node.host_start);
+	    else if ((kinds[i] & 7) == 2)
+	      /* FIXME: device to host copy.  */
+	      memcpy ((void *) cur_node.host_start,
+		      (void *) (n->tgt->tgt_start + n->tgt_offset
+				+ cur_node.host_start - n->host_start),
+		      cur_node.host_end - cur_node.host_start);
+	  }
+	else
+	  gomp_fatal ("Trying to update [%p..%p) object that is not mapped",
+		      (void *) cur_node.host_start,
+		      (void *) cur_node.host_end);
+      }
+  gomp_mutex_unlock (&dev_env_lock);
+}
+
 /* Called when encountering a target directive.  If DEVICE
    is -1, it means use device-var ICV.  If it is -2 (or any other value
    larger than last available hw device, use host fallback.
@@ -49,32 +417,77 @@ GOMP_target (int device, void (*fn) (voi
 	     size_t mapnum, void **hostaddrs, size_t *sizes,
 	     unsigned char *kinds)
 {
-  if (resolve_device (device) == -1)
+  device = resolve_device (device);
+  if (device == -1)
     {
+      /* Host fallback.  */
       fn (hostaddrs);
       return;
     }
+  if (device == 257)
+    {
+      struct target_mem_desc *tgt
+	= gomp_map_vars (mapnum, hostaddrs, sizes, kinds, true);
+      fn ((void *) tgt->tgt_start);
+      gomp_unmap_vars (tgt);
+    }
 }
 
 void
 GOMP_target_data (int device, size_t mapnum, void **hostaddrs, size_t *sizes,
 		  unsigned char *kinds)
 {
-  if (resolve_device (device) == -1)
-    return;
+  device = resolve_device (device);
+  if (device == -1)
+    {
+      /* Host fallback.  */
+      struct gomp_task_icv *icv = gomp_icv (false);
+      if (icv->target_data)
+	{
+	  /* Even when doing a host fallback, if there are any active
+	     #pragma omp target data constructs, need to remember the
+	     new #pragma omp target data, otherwise GOMP_target_end_data
+	     would get out of sync.  */
+	  struct target_mem_desc *tgt
+	    = gomp_map_vars (0, NULL, NULL, NULL, false);
+	  tgt->prev = icv->target_data;
+	  icv->target_data = tgt;
+	}
+      return;
+    }
+
+  if (device == 257)
+    {
+      struct target_mem_desc *tgt
+	= gomp_map_vars (mapnum, hostaddrs, sizes, kinds, false);
+      struct gomp_task_icv *icv = gomp_icv (true);
+      tgt->prev = icv->target_data;
+      icv->target_data = tgt;
+    }
 }
 
 void
 GOMP_target_end_data (void)
 {
+  struct gomp_task_icv *icv = gomp_icv (false);
+  if (icv->target_data)
+    {
+      struct target_mem_desc *tgt = icv->target_data;
+      icv->target_data = tgt->prev;
+      gomp_unmap_vars (tgt);
+    }
 }
 
 void
 GOMP_target_update (int device, size_t mapnum, void **hostaddrs, size_t *sizes,
 		    unsigned char *kinds)
 {
-  if (resolve_device (device) == -1)
+  device = resolve_device (device);
+  if (device == -1)
     return;
+
+  if (device == 257)
+    gomp_update (mapnum, hostaddrs, sizes, kinds);
 }
 
 void
--- libgomp/libgomp.h.jj	2013-09-09 17:41:02.388429108 +0200
+++ libgomp/libgomp.h	2013-09-13 12:19:13.489052710 +0200
@@ -214,18 +214,23 @@ struct gomp_team_state
   unsigned long static_trip;
 };
 
-/* These are the OpenMP 3.0 Internal Control Variables described in
+struct target_mem_desc;
+
+/* These are the OpenMP 4.0 Internal Control Variables described in
    section 2.3.1.  Those described as having one copy per task are
    stored within the structure; those described as having one copy
    for the whole program are (naturally) global variables.  */
-
+   
 struct gomp_task_icv
 {
   unsigned long nthreads_var;
   enum gomp_schedule_type run_sched_var;
   int run_sched_modifier;
+  int default_device_var;
   bool dyn_var;
   bool nest_var;
+  /* Internal ICV.  */
+  struct target_mem_desc *target_data;
 };
 
 extern struct gomp_task_icv gomp_global_icv;
@@ -496,6 +501,10 @@ extern void gomp_team_start (void (*) (v
 			     struct gomp_team *);
 extern void gomp_team_end (void);
 
+/* target.c */
+
+extern int gomp_get_num_devices (void);
+
 /* work.c */
 
 extern void gomp_init_work_share (struct gomp_work_share *, bool, unsigned);
--- libgomp/env.c.jj	2013-09-09 17:41:02.335429381 +0200
+++ libgomp/env.c	2013-09-12 17:39:42.435446713 +0200
@@ -56,6 +56,7 @@ struct gomp_task_icv gomp_global_icv = {
   .nthreads_var = 1,
   .run_sched_var = GFS_DYNAMIC,
   .run_sched_modifier = 1,
+  .default_device_var = 0,
   .dyn_var = false,
   .nest_var = false
 };
@@ -188,6 +189,24 @@ parse_unsigned_long (const char *name, u
   return false;
 }
 
+/* Parse a positive int environment variable.  Return true if one was
+   present and it was successfully parsed.  */
+
+static bool
+parse_int (const char *name, int *pvalue, bool allow_zero)
+{
+  unsigned long value;
+  if (!parse_unsigned_long (name, &value, allow_zero))
+    return false;
+  if (value > INT_MAX)
+    {
+      gomp_error ("Invalid value for environment variable %s", name);
+      return false;
+    }
+  *pvalue = (int) value;
+  return true;
+}
+
 /* Parse an unsigned long list environment variable.  Return true if one was
    present and it was successfully parsed.  */
 
@@ -658,8 +677,9 @@ handle_omp_display_env (bool proc_bind,
 
 /* FIXME: Unimplemented OpenMP 4.0 environment variables.
   fprintf (stderr, "  OMP_PLACES = ''\n");
-  fprintf (stderr, "  OMP_CANCELLATION = ''\n");
-  fprintf (stderr, "  OMP_DEFAULT_DEVICE = ''\n"); */
+  fprintf (stderr, "  OMP_CANCELLATION = ''\n"); */
+  fprintf (stderr, "  OMP_DEFAULT_DEVICE = '%d'\n",
+	   gomp_global_icv.default_device_var);
 
   if (verbose)
     {
@@ -699,6 +719,7 @@ initialize_env (void)
   parse_boolean ("OMP_DYNAMIC", &gomp_global_icv.dyn_var);
   parse_boolean ("OMP_NESTED", &gomp_global_icv.nest_var);
   parse_boolean ("OMP_PROC_BIND", &bind_var);
+  parse_int ("OMP_DEFAULT_DEVICE", &gomp_global_icv.default_device_var, true);
   parse_unsigned_long ("OMP_MAX_ACTIVE_LEVELS", &gomp_max_active_levels_var,
 		       true);
   parse_unsigned_long ("OMP_THREAD_LIMIT", &gomp_thread_limit_var, false);
@@ -881,36 +902,41 @@ omp_get_proc_bind (void)
 void
 omp_set_default_device (int device_num)
 {
-  (void) device_num;
+  struct gomp_task_icv *icv = gomp_icv (true);
+  icv->default_device_var = device_num >= 0 ? device_num : 0;
 }
 
 int
 omp_get_default_device (void)
 {
-  return 0;
+  struct gomp_task_icv *icv = gomp_icv (false);
+  return icv->default_device_var;
 }
 
 int
 omp_get_num_devices (void)
 {
-  return 0;
+  return gomp_get_num_devices ();
 }
 
 int
 omp_get_num_teams (void)
 {
+  /* Hardcoded to 1 on host, MIC, HSAIL?  Maybe variable on PTX.  */
   return 1;
 }
 
 int
 omp_get_team_num (void)
 {
+  /* Hardcoded to 0 on host, MIC, HSAIL?  Maybe variable on PTX.  */
   return 0;
 }
 
 int
 omp_is_initial_device (void)
 {
+  /* Hardcoded to 1 on host, should be 0 on MIC, HSAIL, PTX.  */
   return 1;
 }
 


	Jakub

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Offloading Support in libgomp
  2013-09-15  9:12                               ` Jakub Jelinek
@ 2013-09-15 12:51                                 ` Michael V. Zolotukhin
  2013-09-15 18:15                                   ` Michael V. Zolotukhin
  2019-12-07 14:23                                 ` Thomas Schwinge
  1 sibling, 1 reply; 12+ messages in thread
From: Michael V. Zolotukhin @ 2013-09-15 12:51 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Marek Polacek, Kirill Yukhin, Richard Henderson, gcc-patches, triegel

Hi Jakub,
This patch looks ok for me in general, but I am a bit worried about using
splay-trees.  Couldn't we end up with their worst case linear performance
instead desired log?

Imagine the following scenario:
  #pragma parallel ... // to produce N-threads
  {
  #  pragma target map (i1, i2, ...iK)
    {
      // some code to offload using i1, i2, ... iK
    }
  }
Libgomp will start N-1 new threads, and all of them would want to look up
mappings for i1,i2,...iK in the splay tree.  The first one wouldn't find
anything and would map and insert all the values to the tree.  But the following
ones would look-up these addresses in the exactly same order, which will lead to
totally unbalanced tree.

Am I missing anything or is it a real problem?

Thanks, Michael
> 	Jakub

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Offloading Support in libgomp
  2013-09-15 12:51                                 ` Michael V. Zolotukhin
@ 2013-09-15 18:15                                   ` Michael V. Zolotukhin
  2013-09-16  7:17                                     ` Jakub Jelinek
  0 siblings, 1 reply; 12+ messages in thread
From: Michael V. Zolotukhin @ 2013-09-15 18:15 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Marek Polacek, Kirill Yukhin, Richard Henderson, gcc-patches, triegel

> Libgomp will start N-1 new threads, and all of them would want to look up
> mappings for i1,i2,...iK in the splay tree.  The first one wouldn't find
> anything and would map and insert all the values to the tree.  But the following
> ones would look-up these addresses in the exactly same order, which will lead to
> totally unbalanced tree.
> 
> Am I missing anything or is it a real problem?
On second thought, this access order doesn't necessarily mean accessing in
ascending/descending keys order, so there is no problem here.

Thanks, Michael

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Offloading Support in libgomp
  2013-09-15 18:15                                   ` Michael V. Zolotukhin
@ 2013-09-16  7:17                                     ` Jakub Jelinek
  2013-09-16  7:18                                       ` Michael V. Zolotukhin
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Jelinek @ 2013-09-16  7:17 UTC (permalink / raw)
  To: Michael V. Zolotukhin
  Cc: Marek Polacek, Kirill Yukhin, Richard Henderson, gcc-patches, triegel

On Sun, Sep 15, 2013 at 07:41:24PM +0400, Michael V. Zolotukhin wrote:
> > Libgomp will start N-1 new threads, and all of them would want to look up
> > mappings for i1,i2,...iK in the splay tree.  The first one wouldn't find
> > anything and would map and insert all the values to the tree.  But the following
> > ones would look-up these addresses in the exactly same order, which will lead to
> > totally unbalanced tree.
> > 
> > Am I missing anything or is it a real problem?
> On second thought, this access order doesn't necessarily mean accessing in
> ascending/descending keys order, so there is no problem here.

Yes, splay tree can get totally unbalanced and you can have a linear lookup
time, the O(log n) lookup time is amortized.  But, if you e.g. really do
lookup sorted keys (which is not given, the compiler puts vars into the
clauses based on the user order or in the order references to those vars are
discovered, plus for array sections pointer kinds which usually have
different addresses go immediately after the data ones), you really can have
one O(n) lookup if you've looked e.g. the highest address last time and now
you're looking up the lowest and the tree is totally unbalanced, but then
won't the following lookups be all O(1), because the keys you are looking up
will be always immediately in the right child?

Anyway, if the splay trees ever cause issues in real-world, it is not hard
to just replace them by something else (R-B trees, AVL trees or similar).

	Jakub

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Offloading Support in libgomp
  2013-09-16  7:17                                     ` Jakub Jelinek
@ 2013-09-16  7:18                                       ` Michael V. Zolotukhin
  2013-09-16  8:07                                         ` Jakub Jelinek
  0 siblings, 1 reply; 12+ messages in thread
From: Michael V. Zolotukhin @ 2013-09-16  7:18 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Marek Polacek, Kirill Yukhin, Richard Henderson, gcc-patches, triegel

> Yes, splay tree can get totally unbalanced and you can have a linear lookup
> time, the O(log n) lookup time is amortized.  But, if you e.g. really do
> lookup sorted keys (which is not given, the compiler puts vars into the
> clauses based on the user order or in the order references to those vars are
> discovered, plus for array sections pointer kinds which usually have
> different addresses go immediately after the data ones), you really can have
> one O(n) lookup if you've looked e.g. the highest address last time and now
> you're looking up the lowest and the tree is totally unbalanced, but then
> won't the following lookups be all O(1), because the keys you are looking up
> will be always immediately in the right child?
If the first time the lookup was in increasing keys order, and then we are
looking up in decreasing keys order, then yes, there is no problem and at the
beginning the element we are looking for would be very close to root, so it
would be fast (at the end I guess there would be still O(log N)).  The problem
would be if the order of the 2nd lookup is the same as the order of the 1st
lookup.

> Anyway, if the splay trees ever cause issues in real-world, it is not hard
> to just replace them by something else (R-B trees, AVL trees or similar).
Yes, agreed.

Michael
> 	Jakub

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Offloading Support in libgomp
  2013-09-16  7:18                                       ` Michael V. Zolotukhin
@ 2013-09-16  8:07                                         ` Jakub Jelinek
  2013-09-16  8:19                                           ` Michael V. Zolotukhin
  0 siblings, 1 reply; 12+ messages in thread
From: Jakub Jelinek @ 2013-09-16  8:07 UTC (permalink / raw)
  To: Michael V. Zolotukhin
  Cc: Marek Polacek, Kirill Yukhin, Richard Henderson, gcc-patches, triegel

On Mon, Sep 16, 2013 at 11:15:16AM +0400, Michael V. Zolotukhin wrote:
> > Yes, splay tree can get totally unbalanced and you can have a linear lookup
> > time, the O(log n) lookup time is amortized.  But, if you e.g. really do
> > lookup sorted keys (which is not given, the compiler puts vars into the
> > clauses based on the user order or in the order references to those vars are
> > discovered, plus for array sections pointer kinds which usually have
> > different addresses go immediately after the data ones), you really can have
> > one O(n) lookup if you've looked e.g. the highest address last time and now
> > you're looking up the lowest and the tree is totally unbalanced, but then
> > won't the following lookups be all O(1), because the keys you are looking up
> > will be always immediately in the right child?
> If the first time the lookup was in increasing keys order, and then we are
> looking up in decreasing keys order, then yes, there is no problem and at the
> beginning the element we are looking for would be very close to root, so it
> would be fast (at the end I guess there would be still O(log N)).  The problem
> would be if the order of the 2nd lookup is the same as the order of the 1st
> lookup.

No.  If you insert 1 to 100 into a splay tree in ascending order (that will
give you a totally unbalanced tree), and then lookup 1 to 100 in the
ascending order again, then the lookup of 1 will be expensive (100
comparisons), but then for each following lookup it
will cost just 2 comparisons, so for the 100 lookups you'll need 298
comparisons, i.e. ~ 3 comparisons per lookup on average (rather than the 6-7
lookups you'd get for balanced AVL tree or similar).  Splay trees
actually behave very nicely if the lookups are done in sorted orders or
if you usually look up similar addresses in sequence (which is quite likely,
usually the splay tree will contain addresses of #pragma omp declare target
vars (and selected functions) and typically lookups for #pragma omp target
will be usually for function local variables which will have similar
addresses), and if what you lookup is completely random, then you wouldn't
end up with an unbalanced tree.

	Jakub

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Offloading Support in libgomp
  2013-09-16  8:07                                         ` Jakub Jelinek
@ 2013-09-16  8:19                                           ` Michael V. Zolotukhin
  0 siblings, 0 replies; 12+ messages in thread
From: Michael V. Zolotukhin @ 2013-09-16  8:19 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Marek Polacek, Kirill Yukhin, Richard Henderson, gcc-patches, triegel

> No.  If you insert 1 to 100 into a splay tree in ascending order (that will
> give you a totally unbalanced tree), and then lookup 1 to 100 in the
> ascending order again, then the lookup of 1 will be expensive (100
> comparisons), but then for each following lookup it
> will cost just 2 comparisons, so for the 100 lookups you'll need 298
> comparisons, i.e. ~ 3 comparisons per lookup on average (rather than the 6-7
> lookups you'd get for balanced AVL tree or similar).  Splay trees
> actually behave very nicely if the lookups are done in sorted orders or
> if you usually look up similar addresses in sequence (which is quite likely,
> usually the splay tree will contain addresses of #pragma omp declare target
> vars (and selected functions) and typically lookups for #pragma omp target
> will be usually for function local variables which will have similar
> addresses), and if what you lookup is completely random, then you wouldn't
> end up with an unbalanced tree.
Maybe you are right, so splay trees might be the best choice here indeed.

Michael
> 	Jakub

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Offloading Support in libgomp
       [not found]                   ` <20130913112930.GC30181@msticlxl57.ims.intel.com>
       [not found]                     ` <20130913123614.GB1817@tucnak.redhat.com>
@ 2014-12-22 18:33                     ` Thomas Schwinge
  1 sibling, 0 replies; 12+ messages in thread
From: Thomas Schwinge @ 2014-12-22 18:33 UTC (permalink / raw)
  To: Jakub Jelinek, gcc-patches; +Cc: Michael V. Zolotukhin, Kirill Yukhin

[-- Attachment #1: Type: text/plain, Size: 3661 bytes --]

Hi!

On Fri, 13 Sep 2013 15:29:30 +0400, "Michael V. Zolotukhin" <michael.v.zolotukhin@gmail.com> wrote:
> Here is the first patch for adding plugins support in libgomp - could you please
> take a look at it?

> --- a/libgomp/target.c
> +++ b/libgomp/target.c

> +/* This structure describes accelerator device.
> +   It contains name of the corresponding libgomp plugin, function handlers for
> +   interaction with the device, ID-number of the device, and information about
> +   mapped memory.  */
> +struct gomp_device_descr
> +{
> +  /* This is the ID number of device.  It could be specified in DEVICE-clause of
> +     TARGET construct.  */
> +  int id;

Committed to gomp-4_0-branch in r219028:

commit dac5fae7f58dab3c3f271d2676d20aa653a51e9a
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Mon Dec 22 18:24:03 2014 +0000

    libgomp: Remove id member from struct gomp_device_descr.
    
    This ID field is not actually used anywhere, and it is redundant, because all
    devices are kept in a devices array, so the IDth device is simply devices[ID].
    
    	libgomp/
    	* libgomp.h (struct gomp_device_descr): Remove id member.  Update
    	all users.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@219028 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp | 3 +++
 libgomp/libgomp.h      | 4 ----
 libgomp/oacc-host.c    | 1 -
 libgomp/target.c       | 3 ---
 4 files changed, 3 insertions(+), 8 deletions(-)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index 3439797..eabf737 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,5 +1,8 @@
 2014-12-22  Thomas Schwinge  <thomas@codesourcery.com>
 
+	* libgomp.h (struct gomp_device_descr): Remove id member.  Update
+	all users.
+
 	* libgomp.h (gomp_init_tables): Remove const qualifier from struct
 	gomp_device_descr.  Change all users.
 	* oacc-int.h (base_dev, goacc_register): Likewise.
diff --git libgomp/libgomp.h libgomp/libgomp.h
index 440bfce..b6d216b 100644
--- libgomp/libgomp.h
+++ libgomp/libgomp.h
@@ -736,10 +736,6 @@ struct gomp_device_descr
   /* Capabilities of device (supports OpenACC, OpenMP).  */
   unsigned int capabilities;
 
-  /* This is the ID number of device.  It could be specified in DEVICE-clause of
-     TARGET construct.  */
-  int id;
-
   /* This is the ID number of device among devices of the same type.  */
   int target_id;
 
diff --git libgomp/oacc-host.c libgomp/oacc-host.c
index 3b90259..2a82517 100644
--- libgomp/oacc-host.c
+++ libgomp/oacc-host.c
@@ -38,7 +38,6 @@ static struct gomp_device_descr host_dispatch =
     .capabilities = (GOMP_OFFLOAD_CAP_OPENACC_200
 		     | GOMP_OFFLOAD_CAP_NATIVE_EXEC
 		     | GOMP_OFFLOAD_CAP_SHARED_MEM),
-    .id = 0,
 
     .is_initialized = false,
     .offload_regions_registered = false,
diff --git libgomp/target.c libgomp/target.c
index d823045..226b95b 100644
--- libgomp/target.c
+++ libgomp/target.c
@@ -1108,7 +1108,6 @@ gomp_target_init (void)
 		current_device.openacc.target_data = NULL;
 		for (i = 0; i < new_num_devices; i++)
 		  {
-		    current_device.id = num_devices + 1;
 		    current_device.target_id = i;
 		    devices[num_devices] = current_device;
 		    gomp_mutex_init (&devices[num_devices].mem_map.lock);
@@ -1136,9 +1135,7 @@ gomp_target_init (void)
 		{
 		  struct gomp_device_descr device_tmp = devices[d];
 		  devices[d] = devices[i];
-		  devices[d].id = d + 1;
 		  devices[i] = device_tmp;
-		  devices[i].id = i + 1;
 
 		  break;
 		}


Grüße,
 Thomas

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Offloading Support in libgomp
  2013-09-15  9:12                               ` Jakub Jelinek
  2013-09-15 12:51                                 ` Michael V. Zolotukhin
@ 2019-12-07 14:23                                 ` Thomas Schwinge
  2019-12-18 17:16                                   ` Thomas Schwinge
  1 sibling, 1 reply; 12+ messages in thread
From: Thomas Schwinge @ 2019-12-07 14:23 UTC (permalink / raw)
  To: Jakub Jelinek, Julian Brown, gcc-patches


[-- Attachment #1.1: Type: text/plain, Size: 2276 bytes --]

Hi!

This is from very early days of libgomp offloading support:

On 2013-09-14T21:29:56+0200, Jakub Jelinek <jakub@redhat.com> wrote:
> --- libgomp/target.c.jj	2013-09-09 17:41:02.290429613 +0200
> +++ libgomp/target.c	2013-09-13 16:41:24.514770386 +0200

> +static void
> +gomp_unmap_tgt (struct target_mem_desc *tgt)
> +{
> +  /* FIXME: Deallocate on target the tgt->tgt_start .. tgt->tgt_end
> +     region.  */
> +  if (tgt->tgt_end)
> +    free (tgt->to_free);
> +
> +  free (tgt->array);
> +  free (tgt);
> +}
> +
> +static void
> +gomp_unmap_vars (struct target_mem_desc *tgt)
> +{
> +  if (tgt->list_count == 0)
> +    {
> +      free (tgt);
> +      return;
> +    }
> +
> +  size_t i;
> +  gomp_mutex_lock (&dev_env_lock);
> +  for (i = 0; i < tgt->list_count; i++)
> +    if (tgt->list[i]->refcount > 1)
> +      tgt->list[i]->refcount--;
> +    else
> +      {
> +	splay_tree_key k = tgt->list[i];
> +	if (k->copy_from)
> +	  /* FIXME: device to host copy.  */
> +	  memcpy ((void *) k->host_start,
> +		  (void *) (k->tgt->tgt_start + k->tgt_offset),
> +		  k->host_end - k->host_start);
> +	splay_tree_remove (&dev_splay_tree, k);
> +	if (k->tgt->refcount > 1)
> +	  k->tgt->refcount--;
> +	else
> +	  gomp_unmap_tgt (k->tgt);
> +      }
> +
> +  if (tgt->refcount > 1)
> +    tgt->refcount--;
> +  else
> +    gomp_unmap_tgt (tgt);
> +  gomp_mutex_unlock (&dev_env_lock);
> +}

(These days, the code is structured a little bit differently.)

I was debugging an OpenACC memory mapping issue that lead to host-side
memory corruption, and asked our dear friend Valgrind for help, which
quickly pointed me to the (current revision) of the code cited above.  I
fixed the things on the OpenACC side, but also propose the attached patch
adding a safeguard to "Assert in
'libgomp/target.c:gomp_unmap_vars_internal' that we're not unmapping
'tgt' while it's still in use": the following 'tgt->list_count'
iterations as well as the following 'gomp_unref_tgt' would read 'free'd
memory.  OK to commit?  If approving this patch, please respond with
"Reviewed-by: NAME <EMAIL>" so that your effort will be recorded in the
commit log, see <https://gcc.gnu.org/wiki/Reviewed-by>.


Grüße
 Thomas



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.2: 0001-Assert-in-libgomp-target.c-gomp_unmap_vars_int.trunk.patch --]
[-- Type: text/x-diff, Size: 1196 bytes --]

From eed754b5d9545a605ed930742df5c733927fad04 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 6 Dec 2019 19:24:26 +0100
Subject: [PATCH] Assert in 'libgomp/target.c:gomp_unmap_vars_internal' that
 we're not unmapping 'tgt' while it's still in use

	libgomp/
	* target.c (gomp_unmap_vars_internal): Add a safeguard to
	'gomp_remove_var'.
---
 libgomp/target.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/libgomp/target.c b/libgomp/target.c
index 84d6daa76ca..c24d7b1b1aa 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1193,7 +1193,15 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom,
 				      + tgt->list[i].offset),
 			    tgt->list[i].length);
       if (do_unmap)
-	gomp_remove_var (devicep, k);
+	{
+	  struct target_mem_desc *k_tgt = k->tgt;
+	  bool is_tgt_unmapped = gomp_remove_var (devicep, k);
+	  /* It would be bad if TGT got unmapped while we're still iterating
+	     over its LIST_COUNT, and also expect to use it in the following
+	     code.  */
+	  assert (!is_tgt_unmapped
+		  || k_tgt != tgt);
+	}
     }
 
   if (aq)
-- 
2.17.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 658 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Offloading Support in libgomp
  2019-12-07 14:23                                 ` Thomas Schwinge
@ 2019-12-18 17:16                                   ` Thomas Schwinge
  0 siblings, 0 replies; 12+ messages in thread
From: Thomas Schwinge @ 2019-12-18 17:16 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Julian Brown


[-- Attachment #1.1: Type: text/plain, Size: 297 bytes --]

Hi!

On 2019-12-07T15:22:33+0100, I wrote:
> [...] propose the attached patch
> adding a safeguard [...]

See attached "Assert in 'libgomp/target.c:gomp_unmap_vars_internal' that
we're not unmapping 'tgt' while it's still in use"; committed to trunk in
r279534.


Grüße
 Thomas



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1.2: 0001-Assert-in-libgomp-target.c-gomp_unmap_vars_int.trunk.patch --]
[-- Type: text/x-diff, Size: 1741 bytes --]

From 7c82035afd9b018956fca3f670b2564ec6f0f7ca Mon Sep 17 00:00:00 2001
From: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date: Wed, 18 Dec 2019 17:01:33 +0000
Subject: [PATCH] Assert in 'libgomp/target.c:gomp_unmap_vars_internal' that
 we're not unmapping 'tgt' while it's still in use

	libgomp/
	* target.c (gomp_unmap_vars_internal): Add a safeguard to
	'gomp_remove_var'.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279534 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog |  3 +++
 libgomp/target.c  | 10 +++++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 472519c7e3e..541a2c7610c 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,5 +1,8 @@
 2019-12-18  Thomas Schwinge  <thomas@codesourcery.com>
 
+	* target.c (gomp_unmap_vars_internal): Add a safeguard to
+	'gomp_remove_var'.
+
 	* target.c (gomp_to_device_kind_p): Handle 'GOMP_MAP_FORCE_FROM'
 	like 'GOMP_MAP_FROM'.
 
diff --git a/libgomp/target.c b/libgomp/target.c
index a3cdb34bd51..67cd80a3c35 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1225,7 +1225,15 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom,
 				      + tgt->list[i].offset),
 			    tgt->list[i].length);
       if (do_unmap)
-	gomp_remove_var (devicep, k);
+	{
+	  struct target_mem_desc *k_tgt = k->tgt;
+	  bool is_tgt_unmapped = gomp_remove_var (devicep, k);
+	  /* It would be bad if TGT got unmapped while we're still iterating
+	     over its LIST_COUNT, and also expect to use it in the following
+	     code.  */
+	  assert (!is_tgt_unmapped
+		  || k_tgt != tgt);
+	}
     }
 
   if (aq)
-- 
2.17.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 658 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-12-18 17:15 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20130826141117.GF21876@tucnak.zalov.cz>
     [not found] ` <20130827112609.GA4093@msticlxl57.ims.intel.com>
     [not found]   ` <20130827113956.GH21876@tucnak.zalov.cz>
     [not found]     ` <20130827115538.GB4093@msticlxl57.ims.intel.com>
     [not found]       ` <20130828093428.GO21876@tucnak.zalov.cz>
     [not found]         ` <20130910150126.GA2059@msticlxl57.ims.intel.com>
     [not found]           ` <20130910151453.GB1817@tucnak.redhat.com>
     [not found]             ` <20130910153053.GB2059@msticlxl57.ims.intel.com>
     [not found]               ` <20130910153624.GD1817@tucnak.redhat.com>
     [not found]                 ` <20130910153810.GC2059@msticlxl57.ims.intel.com>
     [not found]                   ` <20130913112930.GC30181@msticlxl57.ims.intel.com>
     [not found]                     ` <20130913123614.GB1817@tucnak.redhat.com>
     [not found]                       ` <20130913131109.GD30181@msticlxl57.ims.intel.com>
     [not found]                         ` <20130913131556.GD1817@tucnak.redhat.com>
2013-09-13 16:19                           ` [RFC] Offloading Support in libgomp Jakub Jelinek
2013-09-13 16:22                             ` Marek Polacek
2013-09-15  9:12                               ` Jakub Jelinek
2013-09-15 12:51                                 ` Michael V. Zolotukhin
2013-09-15 18:15                                   ` Michael V. Zolotukhin
2013-09-16  7:17                                     ` Jakub Jelinek
2013-09-16  7:18                                       ` Michael V. Zolotukhin
2013-09-16  8:07                                         ` Jakub Jelinek
2013-09-16  8:19                                           ` Michael V. Zolotukhin
2019-12-07 14:23                                 ` Thomas Schwinge
2019-12-18 17:16                                   ` Thomas Schwinge
2014-12-22 18:33                     ` Thomas Schwinge

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).