public inbox for gcc-patches@gcc.gnu.org
* [gomp4] Support multi-dimensional pointer based arrays in OpenACC data clauses
@ 2017-01-10  8:27 Chung-Lin Tang
  2018-10-16 12:56 ` [PATCH, OpenACC, 0/8] Multi-dimensional dynamic array support for " Chung-Lin Tang
  0 siblings, 1 reply; 24+ messages in thread
From: Chung-Lin Tang @ 2017-01-10  8:27 UTC (permalink / raw)
  To: gcc-patches; +Cc: Thomas Schwinge, Cesar Philippidis

[-- Attachment #1: Type: text/plain, Size: 4414 bytes --]

This patch implements support for dynamically allocated multi-dimensional arrays
in OpenACC data clauses. To illustrate, these kinds of arrays now work:

int **a;
float *f[100];
double ***d;

#pragma acc parallel copy (a[0:100][x:y], f[10:20][0:30]) copyout(d[x:y][x:y][x:y])
{
 ...
}

The OpenACC spec supposedly also allows the pointer-to-array-rows case
(e.g. int (*x)[50]), but support for that is still TBD, so for now those
cases are rejected in omp-low.
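
As a fuller illustration, here is a minimal sketch of how the int **a case
might be used end to end. The helper names (alloc_rows, fill_rows, free_rows)
are made up for this example and are not part of the patch or its tests. The
pragma is guarded by _OPENACC, so without -fopenacc the loops simply run on
the host and the sketch does not depend on an offload target:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate an n x m matrix as n separately
   allocated rows, i.e. a pointer based (non-contiguous) 2D array.  */
int **
alloc_rows (int n, int m)
{
  int **a = malloc (n * sizeof *a);
  assert (a != NULL);
  for (int i = 0; i < n; i++)
    {
      a[i] = calloc (m, sizeof **a);
      assert (a[i] != NULL);
    }
  return a;
}

/* Hypothetical helper: release what alloc_rows allocated.  */
void
free_rows (int **a, int n)
{
  for (int i = 0; i < n; i++)
    free (a[i]);
  free (a);
}

/* Set a[i][j] = i * m + j.  The copy clause names both dimensions of
   the pointer based array, which is the form this patch is meant to
   accept.  */
void
fill_rows (int **a, int n, int m)
{
#ifdef _OPENACC
#pragma acc parallel loop copy (a[0:n][0:m])
#endif
  for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
      a[i][j] = i * m + j;
}
```

A caller would do something like a = alloc_rows (4, 3); fill_rows (a, 4, 3);
and then read the results back on the host before calling free_rows (a, 4).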

Instead of using multiple consecutive map entries, as the pset/pointer maps do,
I've opted for a different style. The compiler creates a descriptor on the
stack and passes a pointer to it into libgomp. libgomp then processes the
descriptor and exchanges it for the actual target dynamic array pointer before
kernel launch.
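
To make the descriptor idea concrete, the following sketch mirrors in plain C
the record that create_dynamic_array_descr_type and
create_dynamic_array_descr_init_code in the attached patch build: the base
pointer, a dimension count, then four sizetype fields per dimension, with base
and length scaled to bytes. The struct and function names are invented for
illustration; the real descriptor is an artificial record type created by the
compiler, and the outermost-first ordering of the dimensions is an assumption
here rather than something stated above:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the four per-dimension descriptor fields.  */
struct da_dim_sketch
{
  size_t base;       /* low bound in bytes: low_bound * elem_size  */
  size_t length;     /* section length in bytes: length * elem_size  */
  size_t elem_size;  /* size in bytes of one element at this level  */
  size_t is_array;   /* 1 if this level is an array type, 0 if a pointer  */
};

/* Hypothetical mirror of a descriptor for a two-dimensional section.  */
struct da_descr_sketch
{
  void *ptr;                     /* the main starting pointer/array  */
  size_t dim_num;                /* number of dimensions  */
  struct da_dim_sketch dims[2];  /* assumed order: outermost first  */
};

/* Fill a descriptor for 'int **a' mapped as a[lo0:len0][lo1:len1].  */
struct da_descr_sketch
describe_int_rows (int **a, size_t lo0, size_t len0,
		   size_t lo1, size_t len1)
{
  struct da_descr_sketch d;
  assert (a != NULL);
  d.ptr = a;
  d.dim_num = 2;

  /* Outer level: the elements are 'int *' row pointers.  */
  d.dims[0].elem_size = sizeof (int *);
  d.dims[0].base = lo0 * sizeof (int *);
  d.dims[0].length = len0 * sizeof (int *);
  d.dims[0].is_array = 0;

  /* Inner level: the elements are plain 'int'.  */
  d.dims[1].elem_size = sizeof (int);
  d.dims[1].base = lo1 * sizeof (int);
  d.dims[1].length = len1 * sizeof (int);
  d.dims[1].is_array = 0;
  return d;
}
```

For a declaration like float *f[100], the outer level would instead have
is_array set to 1, since the variable itself is an array of pointers.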

Tested and committed to gomp-4_0-branch. I will probably also submit this for
trunk some time during the next stage 1.

Chung-Lin

2017-01-10  Chung-Lin Tang  <cltang@codesourcery.com>

        gcc/c/
        * c-typeck.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
        parameter, adjust recursive call site, add cases for allowing
        pointer based multi-dimensional arrays for OpenACC.
        (handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
        handle non-contiguous case to create dynamic array map.

        gcc/cp/
        * semantics.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
        parameter, adjust recursive call site, add cases for allowing
        pointer based multi-dimensional arrays for OpenACC.
        (handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
        handle non-contiguous case to create dynamic array map.

        gcc/
        * gimplify.c (gimplify_scan_omp_clauses): For dynamic array map kinds,
        make sure the bias in each dimension is put into a firstprivate variable.
        * tree-pretty-print.c (dump_omp_clauses): Add cases for printing
        GOMP_MAP_DYNAMIC_ARRAY map kinds.
        * omp-low.c (struct omp_context):
        Add 'hash_map<tree_operand_hash, tree> *dynamic_arrays' field, also
        add include of "tree-hash-traits.h".
        (append_field_to_record_type): New function.
        (create_dynamic_array_descr_type): Likewise.
        (create_dynamic_array_descr_init_code): Likewise.
        (new_omp_context): Add initialization of dynamic_arrays field.
        (delete_omp_context): Add deletion of dynamic_arrays field.
        (scan_sharing_clauses): For dynamic array map kinds, check for
        supported dimension structure, and install dynamic array variable into
        current omp_context.
        (lower_omp_target): Add handling for dynamic array map kinds.
        (dynamic_array_lookup): New function.
        (dynamic_array_reference_start): Likewise.
        (scan_for_op): Likewise.
        (scan_for_reference): Likewise.
        (da_create_bias): Likewise.
        (da_dimension_peel): Likewise.
        (lower_omp_1): Add case to look for start of dynamic array reference,
        and handle bias adjustments for the code sequence.

        include/
        * gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define.
        (enum gomp_map_kind): Add GOMP_MAP_DYNAMIC_ARRAY,
        GOMP_MAP_DYNAMIC_ARRAY_TO, GOMP_MAP_DYNAMIC_ARRAY_FROM,
        GOMP_MAP_DYNAMIC_ARRAY_TOFROM, GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO,
        GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM, GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM,
        GOMP_MAP_DYNAMIC_ARRAY_ALLOC, GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC,
        GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT.
        (GOMP_MAP_DYNAMIC_ARRAY_P): Define.

        libgomp/
        * target.c (struct da_dim): New struct declaration.
        (struct da_descr_type): Likewise.
        (struct da_info): Likewise.
        (gomp_dynamic_array_count_rows): New function.
        (gomp_dynamic_array_compute_info): Likewise.
        (gomp_dynamic_array_fill_rows_1): Likewise.
        (gomp_dynamic_array_fill_rows): Likewise.
        (gomp_dynamic_array_create_ptrblock): Likewise.
        (gomp_map_vars): Add code to handle dynamic array map kinds.
        * testsuite/libgomp.oacc-c-c++-common/da-1.c: New test.
        * testsuite/libgomp.oacc-c-c++-common/da-2.c: New test.
        * testsuite/libgomp.oacc-c-c++-common/da-3.c: New test.
        * testsuite/libgomp.oacc-c-c++-common/da-4.c: New test.
        * testsuite/libgomp.oacc-c-c++-common/da-utils.h: New file.
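
The bias adjustment that lower_omp_1 and da_dimension_peel perform can be
pictured with a small host-only sketch. It assumes (this is not spelled out in
the entries above) that the target-side copy of a dimension holds only the
mapped section, so an access written with the original index has to be shifted
back by low_bound times the element size, which is the product da_create_bias
forms. The function name and the int element type are made up for the example:

```c
#include <assert.h>
#include <stddef.h>

/* SECTION_COPY stands for a compact copy of elements
   [low_bound, low_bound + len) of some larger int array.  Read the
   element that the original code would have addressed as index I.  */
int
read_with_bias (const int *section_copy, size_t low_bound, size_t len,
		size_t i)
{
  /* Only indices inside the mapped section are valid.  */
  assert (i >= low_bound && i - low_bound < len);

  /* Byte bias for this dimension: low_bound * element size.  */
  size_t bias = low_bound * sizeof (int);

  /* Apply the original byte offset minus the bias to the start of
     the compact copy.  */
  const char *p = (const char *) section_copy + (i * sizeof (int) - bias);
  return *(const int *) p;
}
```

With a section copy holding three elements that stand for indices 5, 6 and 7
of the original array, read_with_bias (copy, 5, 3, 7) returns the last element
of the copy.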

[-- Attachment #2: openacc-da.patch --]
[-- Type: text/x-patch, Size: 48585 bytes --]

Index: gcc/c/c-typeck.c
===================================================================
--- gcc/c/c-typeck.c	(revision 244258)
+++ gcc/c/c-typeck.c	(revision 244259)
@@ -11926,7 +11926,7 @@
 static tree
 handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 			     bool &maybe_zero_len, unsigned int &first_non_one,
-			     enum c_omp_region_type ort)
+			     bool &non_contiguous, enum c_omp_region_type ort)
 {
   tree ret, low_bound, length, type;
   if (TREE_CODE (t) != TREE_LIST)
@@ -11982,7 +11982,8 @@
     }
 
   ret = handle_omp_array_sections_1 (c, TREE_CHAIN (t), types,
-				     maybe_zero_len, first_non_one, ort);
+				     maybe_zero_len, first_non_one,
+				     non_contiguous, ort);
   if (ret == error_mark_node || ret == NULL_TREE)
     return ret;
 
@@ -12142,6 +12143,21 @@
 		    }
 		}
 	    }
+
+	  /* For OpenACC, if the low_bound/length suggest this is a subarray,
+	     and is referenced through a pointer, then mark this as
+	     non-contiguous.  */
+	  if (ort == C_ORT_ACC
+	      && types.length () > 0
+	      && (TREE_CODE (low_bound) != INTEGER_CST
+		  || integer_nonzerop (low_bound)
+		  || (length && (TREE_CODE (length) != INTEGER_CST
+				 || !tree_int_cst_equal (size, length)))))
+	    {
+	      tree x = types.last ();
+	      if (TREE_CODE (x) == POINTER_TYPE)
+		non_contiguous = true;
+	    }
 	}
       else if (length == NULL_TREE)
 	{
@@ -12183,13 +12199,16 @@
       /* If there is a pointer type anywhere but in the very first
 	 array-section-subscript, the array section can't be contiguous.  */
       if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
-	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
+	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST
+	  && ort != C_ORT_ACC)
 	{
 	  error_at (OMP_CLAUSE_LOCATION (c),
 		    "array section is not contiguous in %qs clause",
 		    omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
 	  return error_mark_node;
 	}
+      else if (TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
+	non_contiguous = true;
     }
   else
     {
@@ -12217,10 +12236,11 @@
 {
   bool maybe_zero_len = false;
   unsigned int first_non_one = 0;
+  bool non_contiguous = false;
   auto_vec<tree, 10> types;
   tree first = handle_omp_array_sections_1 (c, OMP_CLAUSE_DECL (c), types,
 					    maybe_zero_len, first_non_one,
-					    ort);
+					    non_contiguous, ort);
   if (first == error_mark_node)
     return true;
   if (first == NULL_TREE)
@@ -12253,6 +12273,7 @@
       unsigned int num = types.length (), i;
       tree t, side_effects = NULL_TREE, size = NULL_TREE;
       tree condition = NULL_TREE;
+      tree da_dims = NULL_TREE;
 
       if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
 	maybe_zero_len = true;
@@ -12276,6 +12297,13 @@
 	    length = fold_convert (sizetype, length);
 	  if (low_bound == NULL_TREE)
 	    low_bound = integer_zero_node;
+
+	  if (non_contiguous)
+	    {
+	      da_dims = tree_cons (low_bound, length, da_dims);
+	      continue;
+	    }
+
 	  if (!maybe_zero_len && i > first_non_one)
 	    {
 	      if (integer_nonzerop (low_bound))
@@ -12368,6 +12396,14 @@
 		size = size_binop (MULT_EXPR, size, l);
 	    }
 	}
+      if (non_contiguous)
+	{
+	  int kind = OMP_CLAUSE_MAP_KIND (c);
+	  OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_DYNAMIC_ARRAY);
+	  OMP_CLAUSE_DECL (c) = t;
+	  OMP_CLAUSE_SIZE (c) = da_dims;
+	  return false;
+	}
       if (side_effects)
 	size = build2 (COMPOUND_EXPR, sizetype, side_effects, size);
       if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION)
Index: gcc/cp/semantics.c
===================================================================
--- gcc/cp/semantics.c	(revision 244258)
+++ gcc/cp/semantics.c	(revision 244259)
@@ -4482,7 +4482,7 @@
 static tree
 handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 			     bool &maybe_zero_len, unsigned int &first_non_one,
-			     enum c_omp_region_type ort)
+			     bool &non_contiguous, enum c_omp_region_type ort)
 {
   tree ret, low_bound, length, type;
   if (TREE_CODE (t) != TREE_LIST)
@@ -4565,7 +4565,8 @@
       && TREE_CODE (TREE_CHAIN (t)) == FIELD_DECL)
     TREE_CHAIN (t) = omp_privatize_field (TREE_CHAIN (t), false);
   ret = handle_omp_array_sections_1 (c, TREE_CHAIN (t), types,
-				     maybe_zero_len, first_non_one, ort);
+				     maybe_zero_len, first_non_one,
+				     non_contiguous, ort);
   if (ret == error_mark_node || ret == NULL_TREE)
     return ret;
 
@@ -4737,6 +4738,21 @@
 		    }
 		}
 	    }
+
+	  /* For OpenACC, if the low_bound/length suggest this is a subarray,
+	     and is referenced through a pointer, then mark this as
+	     non-contiguous.  */
+	  if (ort == C_ORT_ACC
+	      && types.length () > 0
+	      && (TREE_CODE (low_bound) != INTEGER_CST
+		  || integer_nonzerop (low_bound)
+		  || (length && (TREE_CODE (length) != INTEGER_CST
+				 || !tree_int_cst_equal (size, length)))))
+	    {
+	      tree x = types.last ();
+	      if (TREE_CODE (x) == POINTER_TYPE)
+		non_contiguous = true;
+	    }
 	}
       else if (length == NULL_TREE)
 	{
@@ -4778,13 +4794,16 @@
       /* If there is a pointer type anywhere but in the very first
 	 array-section-subscript, the array section can't be contiguous.  */
       if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
-	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
+	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST
+	  && ort != C_ORT_ACC)
 	{
 	  error_at (OMP_CLAUSE_LOCATION (c),
 		    "array section is not contiguous in %qs clause",
 		    omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
 	  return error_mark_node;
 	}
+      else if (TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
+	non_contiguous = true;
     }
   else
     {
@@ -4812,10 +4831,11 @@
 {
   bool maybe_zero_len = false;
   unsigned int first_non_one = 0;
+  bool non_contiguous = false;
   auto_vec<tree, 10> types;
   tree first = handle_omp_array_sections_1 (c, OMP_CLAUSE_DECL (c), types,
 					    maybe_zero_len, first_non_one,
-					    ort);
+					    non_contiguous, ort);
   if (first == error_mark_node)
     return true;
   if (first == NULL_TREE)
@@ -4849,6 +4869,7 @@
       unsigned int num = types.length (), i;
       tree t, side_effects = NULL_TREE, size = NULL_TREE;
       tree condition = NULL_TREE;
+      tree da_dims = NULL_TREE;
 
       if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
 	maybe_zero_len = true;
@@ -4874,6 +4895,13 @@
 	    length = fold_convert (sizetype, length);
 	  if (low_bound == NULL_TREE)
 	    low_bound = integer_zero_node;
+
+	  if (non_contiguous)
+	    {
+	      da_dims = tree_cons (low_bound, length, da_dims);
+	      continue;
+	    }
+
 	  if (!maybe_zero_len && i > first_non_one)
 	    {
 	      if (integer_nonzerop (low_bound))
@@ -4961,6 +4989,14 @@
 	}
       if (!processing_template_decl)
 	{
+	  if (non_contiguous)
+	    {
+	      int kind = OMP_CLAUSE_MAP_KIND (c);
+	      OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_DYNAMIC_ARRAY);
+	      OMP_CLAUSE_DECL (c) = t;
+	      OMP_CLAUSE_SIZE (c) = da_dims;
+	      return false;
+	    }
 	  if (side_effects)
 	    size = build2 (COMPOUND_EXPR, sizetype, side_effects, size);
 	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION)
Index: gcc/gimplify.c
===================================================================
--- gcc/gimplify.c	(revision 244258)
+++ gcc/gimplify.c	(revision 244259)
@@ -6928,9 +6928,29 @@
 	  if (OMP_CLAUSE_SIZE (c) == NULL_TREE)
 	    OMP_CLAUSE_SIZE (c) = DECL_P (decl) ? DECL_SIZE_UNIT (decl)
 				  : TYPE_SIZE_UNIT (TREE_TYPE (decl));
-	  if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p,
-			     NULL, is_gimple_val, fb_rvalue) == GS_ERROR)
+	  if (OMP_CLAUSE_SIZE (c)
+	      && TREE_CODE (OMP_CLAUSE_SIZE (c)) == TREE_LIST
+	      && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
 	    {
+	      tree dims = OMP_CLAUSE_SIZE (c);
+	      for (tree t = dims; t; t = TREE_CHAIN (t))
+		{
+		  /* If a dimension bias isn't a constant, we have to ensure
+		     that the value gets transferred to the offload target.  */
+		  tree low_bound = TREE_PURPOSE (t);
+		  if (TREE_CODE (low_bound) != INTEGER_CST)
+		    {
+		      low_bound = get_initialized_tmp_var (low_bound, pre_p,
+							   NULL);
+		      omp_add_variable (ctx, low_bound, 
+					GOVD_FIRSTPRIVATE | GOVD_SEEN);
+		      TREE_PURPOSE (t) = low_bound;
+		    }
+		}
+	    }
+	  else if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p,
+				  NULL, is_gimple_val, fb_rvalue) == GS_ERROR)
+	    {
 	      remove = true;
 	      break;
 	    }
Index: gcc/tree-pretty-print.c
===================================================================
--- gcc/tree-pretty-print.c	(revision 244258)
+++ gcc/tree-pretty-print.c	(revision 244259)
@@ -737,6 +737,33 @@
 	case GOMP_MAP_LINK:
 	  pp_string (pp, "link");
 	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_TO:
+	  pp_string (pp, "to,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_FROM:
+	  pp_string (pp, "from,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_TOFROM:
+	  pp_string (pp, "tofrom,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO:
+	  pp_string (pp, "force_to,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM:
+	  pp_string (pp, "force_from,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM:
+	  pp_string (pp, "force_tofrom,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_ALLOC:
+	  pp_string (pp, "alloc,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC:
+	  pp_string (pp, "force_alloc,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT:
+	  pp_string (pp, "force_present,dynamic_array");
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
@@ -758,6 +785,10 @@
 	    case GOMP_MAP_TO_PSET:
 	      pp_string (pp, " [pointer set, len: ");
 	      break;
+	    case GOMP_MAP_DYNAMIC_ARRAY:
+	      gcc_assert (TREE_CODE (OMP_CLAUSE_SIZE (clause)) == TREE_LIST);
+	      pp_string (pp, " [dimensions: ");
+	      break;
 	    default:
 	      pp_string (pp, " [len: ");
 	      break;
Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c	(revision 244258)
+++ gcc/omp-low.c	(revision 244259)
@@ -84,6 +84,7 @@
 #include "hsa.h"
 #include "params.h"
 #include "tree-ssa-propagate.h"
+#include "tree-hash-traits.h"
 
 /* Lowering of OMP parallel and workshare constructs proceeds in two
    phases.  The first phase scans the function looking for OMP statements
@@ -203,6 +204,9 @@
 
   /* True if this construct can be cancelled.  */
   bool cancellable;
+
+  /* Hash map of dynamic arrays in this context.  */
+  hash_map<tree_operand_hash, tree> *dynamic_arrays;
 };
 
 /* A structure holding the elements of:
@@ -1619,7 +1623,136 @@
   return error_mark_node;
 }
 
+/* Helper function for create_dynamic_array_descr_type(), to append a new field
+   to a record type.  */
 
+static void
+append_field_to_record_type (tree record_type, tree fld_ident, tree fld_type)
+{
+  tree *p, fld = build_decl (UNKNOWN_LOCATION, FIELD_DECL, fld_ident, fld_type);
+  DECL_CONTEXT (fld) = record_type;
+
+  for (p = &TYPE_FIELDS (record_type); *p; p = &DECL_CHAIN (*p))
+    ;
+  *p = fld;
+}
+
+/* Create type for dynamic array descriptor. Returns created type, and
+   returns the number of dimensions in *DIM_NUM.  */
+
+static tree
+create_dynamic_array_descr_type (tree decl, tree dims, int *dim_num)
+{
+  int n = 0;
+  tree da_descr_type, name, x;
+  gcc_assert (TREE_CODE (dims) == TREE_LIST);
+
+  da_descr_type = lang_hooks.types.make_type (RECORD_TYPE);
+  name = create_tmp_var_name (".omp_dynamic_array_descr_type");
+  name = build_decl (UNKNOWN_LOCATION, TYPE_DECL, name, da_descr_type);
+  DECL_ARTIFICIAL (name) = 1;
+  DECL_NAMELESS (name) = 1;
+  TYPE_NAME (da_descr_type) = name;
+  TYPE_ARTIFICIAL (da_descr_type) = 1;
+
+  /* Main starting pointer/array.  */
+  tree main_var_type = TREE_TYPE (decl);
+  if (TREE_CODE (main_var_type) == REFERENCE_TYPE)
+    main_var_type = TREE_TYPE (main_var_type);
+  append_field_to_record_type (da_descr_type, DECL_NAME (decl),
+			       (TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE
+				? main_var_type
+				: build_pointer_type (main_var_type)));
+  /* Number of dimensions.  */
+  append_field_to_record_type (da_descr_type, get_identifier ("$dim_num"),
+			       sizetype);
+
+  for (x = dims; x; x = TREE_CHAIN (x), n++)
+    {
+      char *fldname;
+      /* One for the start index.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_base", n);
+      append_field_to_record_type (da_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for the length.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_length", n);
+      append_field_to_record_type (da_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for the element size.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_elem_size", n);
+      append_field_to_record_type (da_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for is_array flag.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_is_array", n);
+      append_field_to_record_type (da_descr_type, get_identifier (fldname),
+				   sizetype);
+    }
+
+  layout_type (da_descr_type);
+  *dim_num = n;
+  return da_descr_type;
+}
+
+/* Generate code sequence for initializing dynamic array descriptor.  */
+
+static void
+create_dynamic_array_descr_init_code (tree da_descr, tree da_var,
+				      tree dimensions, int da_dim_num,
+				      gimple_seq *ilist)
+{
+  tree fld, fldref;
+  tree da_descr_type = TREE_TYPE (da_descr);
+  tree dim_type = TREE_TYPE (da_var);
+
+  fld = TYPE_FIELDS (da_descr_type);
+  fldref = omp_build_component_ref (da_descr, fld);
+  gimplify_assign (fldref, (TREE_CODE (dim_type) == ARRAY_TYPE
+			    ? build_fold_addr_expr (da_var) : da_var), ilist);
+
+  if (TREE_CODE (dim_type) == REFERENCE_TYPE)
+    dim_type = TREE_TYPE (dim_type);
+
+  fld = TREE_CHAIN (fld);
+  fldref = omp_build_component_ref (da_descr, fld);
+  gimplify_assign (fldref, build_int_cst (sizetype, da_dim_num), ilist);
+
+  while (dimensions)
+    {
+      tree dim_base = fold_convert (sizetype, TREE_PURPOSE (dimensions));
+      tree dim_length = fold_convert (sizetype, TREE_VALUE (dimensions));
+      tree dim_elem_size = TYPE_SIZE_UNIT (TREE_TYPE (dim_type));
+      tree dim_is_array = (TREE_CODE (dim_type) == ARRAY_TYPE
+			   ? integer_one_node : integer_zero_node);
+      /* Set base.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (da_descr, fld);
+      dim_base = fold_build2 (MULT_EXPR, sizetype, dim_base, dim_elem_size);
+      gimplify_assign (fldref, dim_base, ilist);
+
+      /* Set length.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (da_descr, fld);
+      dim_length = fold_build2 (MULT_EXPR, sizetype, dim_length, dim_elem_size);
+      gimplify_assign (fldref, dim_length, ilist);
+
+      /* Set elem_size.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (da_descr, fld);
+      dim_elem_size = fold_convert (sizetype, dim_elem_size);
+      gimplify_assign (fldref, dim_elem_size, ilist);
+
+      /* Set is_array flag.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (da_descr, fld);
+      dim_is_array = fold_convert (sizetype, dim_is_array);
+      gimplify_assign (fldref, dim_is_array, ilist);
+
+      dimensions = TREE_CHAIN (dimensions);
+      dim_type = TREE_TYPE (dim_type);
+    }
+  gcc_assert (TREE_CHAIN (fld) == NULL_TREE);
+}
+
 /* Debugging dumps for parallel regions.  */
 void dump_omp_region (FILE *, struct omp_region *, int);
 void debug_omp_region (struct omp_region *);
@@ -1760,6 +1893,8 @@
 
   ctx->cb.decl_map = new hash_map<tree, tree>;
 
+  ctx->dynamic_arrays = new hash_map<tree_operand_hash, tree>;
+
   return ctx;
 }
 
@@ -1834,6 +1969,8 @@
   if (is_task_ctx (ctx))
     finalize_task_copyfn (as_a <gomp_task *> (ctx->stmt));
 
+  delete ctx->dynamic_arrays;
+
   XDELETE (ctx);
 }
 
@@ -2144,6 +2281,42 @@
 	      install_var_local (decl, ctx);
 	      break;
 	    }
+
+	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	      && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+	    {
+	      tree da_decl = OMP_CLAUSE_DECL (c);
+	      tree da_dimensions = OMP_CLAUSE_SIZE (c);
+	      tree da_type = TREE_TYPE (da_decl);
+	      bool by_ref = (TREE_CODE (da_type) == ARRAY_TYPE
+			     ? true : false);
+
+	      /* Checking code to ensure we only have arrays at top dimension.
+		 This limitation might be lifted in the future.  */
+	      if (TREE_CODE (da_type) == REFERENCE_TYPE)
+		da_type = TREE_TYPE (da_type);
+	      tree t = da_type, prev_t = NULL_TREE;
+	      while (t)
+		{
+		  if (TREE_CODE (t) == ARRAY_TYPE && prev_t)
+		    {
+		      error_at (gimple_location (ctx->stmt), "array types are"
+				" only allowed at outermost dimension of"
+				" dynamic array");
+		      break;
+		    }
+		  prev_t = t;
+		  t = TREE_TYPE (t);
+		}
+
+	      install_var_field (da_decl, by_ref, 3, ctx);
+	      tree new_var = install_var_local (da_decl, ctx);
+
+	      bool existed = ctx->dynamic_arrays->put (new_var, da_dimensions);
+	      gcc_assert (!existed);
+	      break;
+	    }
+
 	  if (DECL_P (decl))
 	    {
 	      if (DECL_SIZE (decl)
@@ -16359,6 +16532,15 @@
 	  case GOMP_MAP_FORCE_PRESENT:
 	  case GOMP_MAP_FORCE_DEVICEPTR:
 	  case GOMP_MAP_DEVICE_RESIDENT:
+	  case GOMP_MAP_DYNAMIC_ARRAY_TO:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FROM:
+	  case GOMP_MAP_DYNAMIC_ARRAY_TOFROM:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM:
+	  case GOMP_MAP_DYNAMIC_ARRAY_ALLOC:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT:
 	  case GOMP_MAP_LINK:
 	    gcc_assert (is_gimple_omp_oacc (stmt));
 	    break;
@@ -16421,7 +16603,14 @@
 	if (offloaded && !(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 			   && OMP_CLAUSE_MAP_IN_REDUCTION (c)))
 	  {
-	    x = build_receiver_ref (var, true, ctx);
+	    tree var_type = TREE_TYPE (var);
+	    bool rcv_by_ref =
+	      (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	       && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))
+	       && TREE_CODE (var_type) != ARRAY_TYPE
+	       ? false : true);
+
+	    x = build_receiver_ref (var, rcv_by_ref, ctx);
 	    tree new_var = lookup_decl (var, ctx);
 
 	    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
@@ -16665,6 +16854,25 @@
 		    avar = build_fold_addr_expr (avar);
 		    gimplify_assign (x, avar, &ilist);
 		  }
+		else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+			 && (OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_DYNAMIC_ARRAY))
+		  {
+		    int da_dim_num;
+		    tree dimensions = OMP_CLAUSE_SIZE (c);
+
+		    tree da_descr_type =
+		      create_dynamic_array_descr_type (OMP_CLAUSE_DECL (c),
+						       dimensions, &da_dim_num);
+		    tree da_descr =
+		      create_tmp_var_raw (da_descr_type, ".$omp_da_descr");
+		    gimple_add_tmp_var (da_descr);
+
+		    create_dynamic_array_descr_init_code
+		      (da_descr, ovar, dimensions, da_dim_num, &ilist);
+
+		    gimplify_assign (x, build_fold_addr_expr (da_descr),
+				     &ilist);
+		  }
 		else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE)
 		  {
 		    gcc_checking_assert (is_gimple_omp_oacc (ctx->stmt));
@@ -16725,6 +16933,9 @@
 		  s = TREE_TYPE (s);
 		s = TYPE_SIZE_UNIT (s);
 	      }
+	    else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+		     && (OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_DYNAMIC_ARRAY))
+	      s = NULL_TREE;
 	    else
 	      s = OMP_CLAUSE_SIZE (c);
 	    if (s == NULL_TREE)
@@ -17406,7 +17617,202 @@
 		       gimple_build_omp_return (false));
 }
 
+/* Helper to lookup dynamic array through nested omp contexts. Returns
+   TREE_LIST of dimensions, and the CTX where it was found in *CTX_P.  */
 
+static tree
+dynamic_array_lookup (tree t, omp_context **ctx_p)
+{
+  omp_context *c = *ctx_p;
+  while (c)
+    {
+      tree *dims = c->dynamic_arrays->get (t);
+      if (dims)
+	{
+	  *ctx_p = c;
+	  return *dims;
+	}
+      c = c->outer;
+    }
+  return NULL_TREE;
+}
+
+/* Tests if this gimple STMT is the start of a dynamic array access sequence.
+   Returns true if found, and also returns the gimple operand ptr and
+   dimensions tree list through *OUT_REF and *OUT_DIMS respectively.  */
+
+static bool
+dynamic_array_reference_start (gimple *stmt, omp_context **ctx_p,
+			       tree **out_ref, tree *out_dims)
+{
+  if (gimple_code (stmt) == GIMPLE_ASSIGN)
+    for (unsigned i = 1; i < gimple_num_ops (stmt); i++)
+      {
+	tree *op = gimple_op_ptr (stmt, i), dims;
+	if (TREE_CODE (*op) == ARRAY_REF)
+	  op = &TREE_OPERAND (*op, 0);
+	if (TREE_CODE (*op) == MEM_REF)
+	  op = &TREE_OPERAND (*op, 0);
+	if ((dims = dynamic_array_lookup (*op, ctx_p)) != NULL_TREE)
+	  {
+	    *out_ref = op;
+	    *out_dims = dims;
+	    return true;
+	  }
+      }
+  return false;
+}
+
+static tree
+scan_for_op (tree *tp, int *walk_subtrees, void *data)
+{
+  struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
+  tree t = *tp;
+  tree op = (tree) wi->info;
+  *walk_subtrees = 1;
+  if (operand_equal_p (t, op, 0))
+    {
+      wi->info = tp;
+      return t;
+    }
+  return NULL_TREE;
+}
+
+static tree *
+scan_for_reference (gimple *stmt, tree op)
+{
+  struct walk_stmt_info wi;
+  memset (&wi, 0, sizeof (wi));
+  wi.info = op;
+  if (walk_gimple_op (stmt, scan_for_op, &wi))
+    return (tree *) wi.info;
+  return NULL;
+}
+
+static tree
+da_create_bias (tree orig_bias, tree unit_type)
+{
+  return build2 (MULT_EXPR, sizetype, fold_convert (sizetype, orig_bias),
+		 TYPE_SIZE_UNIT (unit_type));
+}
+
+/* Main worker for adjusting dynamic array accesses, handles the adjustment
+   of many cases of statement forms, and called multiple times to 'peel' away
+   each dimension.  */
+
+static gimple_stmt_iterator
+da_dimension_peel (omp_context *da_ctx,
+		   gimple_stmt_iterator da_gsi, tree orig_da,
+		   tree *da_op_p, tree *da_type_p, tree *da_dims_p)
+{
+  gimple *stmt = gsi_stmt (da_gsi);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs = gimple_assign_rhs1 (stmt);
+
+  if (gimple_num_ops (stmt) == 2
+      && TREE_CODE (rhs) == MEM_REF
+      && operand_equal_p (*da_op_p, TREE_OPERAND (rhs, 0), 0)
+      && !operand_equal_p (orig_da, TREE_OPERAND (rhs, 0), 0)
+      && (TREE_OPERAND (rhs, 1) == NULL_TREE
+	  || integer_zerop (TREE_OPERAND (rhs, 1))))
+    {
+      gcc_assert (TREE_CODE (TREE_TYPE (*da_type_p)) == POINTER_TYPE);
+      *da_type_p = TREE_TYPE (*da_type_p);
+    }
+  else 
+    {
+      gimple *g;
+      gimple_seq ilist = NULL;
+      tree bias, t;
+      tree op = *da_op_p;
+      tree orig_type = *da_type_p;
+      tree orig_bias = TREE_PURPOSE (*da_dims_p);
+      bool by_ref = false;
+
+      if (TREE_CODE (orig_bias) != INTEGER_CST)
+	orig_bias = lookup_decl (orig_bias, da_ctx);
+
+      if (gimple_num_ops (stmt) == 2)
+	{
+	  if (TREE_CODE (rhs) == ADDR_EXPR)
+	    {
+	      rhs = TREE_OPERAND (rhs, 0);
+	      *da_dims_p = NULL_TREE;
+	    }
+
+	  if (TREE_CODE (rhs) == ARRAY_REF
+	      && TREE_CODE (TREE_OPERAND (rhs, 0)) == MEM_REF
+	      && operand_equal_p (TREE_OPERAND (TREE_OPERAND (rhs, 0), 0),
+				  *da_op_p, 0))
+	    {
+	      bias = da_create_bias (orig_bias,
+				     TREE_TYPE (TREE_TYPE (orig_type)));
+	      *da_type_p = TREE_TYPE (TREE_TYPE (orig_type));
+	    }
+	  else if (TREE_CODE (rhs) == ARRAY_REF
+		   && TREE_CODE (TREE_OPERAND (rhs, 0)) == VAR_DECL
+		   && operand_equal_p (TREE_OPERAND (rhs, 0), *da_op_p, 0))
+	    {
+	      tree ptr_type = build_pointer_type (orig_type);
+	      op = create_tmp_var (ptr_type);
+	      gimplify_assign (op, build_fold_addr_expr (TREE_OPERAND (rhs, 0)),
+			       &ilist);
+	      bias = da_create_bias (orig_bias, TREE_TYPE (orig_type));
+	      *da_type_p = TREE_TYPE (orig_type);
+	      orig_type = ptr_type;
+	      by_ref = true;
+	    }
+	  else if (TREE_CODE (rhs) == MEM_REF
+		   && operand_equal_p (*da_op_p, TREE_OPERAND (rhs, 0), 0)
+		   && TREE_OPERAND (rhs, 1) != NULL_TREE)
+	    {
+	      bias = da_create_bias (orig_bias, TREE_TYPE (orig_type));
+	      *da_type_p = TREE_TYPE (orig_type);
+	    }
+	  else if (TREE_CODE (lhs) == MEM_REF
+		   && operand_equal_p (*da_op_p, TREE_OPERAND (lhs, 0), 0))
+	    {
+	      if (*da_dims_p != NULL_TREE)
+		{
+		  gcc_assert (TREE_CHAIN (*da_dims_p) == NULL_TREE);
+		  bias = da_create_bias (orig_bias, TREE_TYPE (orig_type));
+		  *da_type_p = TREE_TYPE (orig_type);
+		}
+	      else
+		/* This should be the end of the dynamic array access
+		   sequence.  */
+		return da_gsi;
+	    }
+	  else
+	    gcc_unreachable ();
+	}
+      else if (gimple_num_ops (stmt) == 3
+	       && gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
+	       && operand_equal_p (*da_op_p, rhs, 0))
+	{
+	  bias = da_create_bias (orig_bias, TREE_TYPE (orig_type));
+	}
+      else
+	gcc_unreachable ();
+
+      bias = fold_build1 (NEGATE_EXPR, sizetype, bias);
+      bias = fold_build2 (POINTER_PLUS_EXPR, orig_type, op, bias);
+
+      t = create_tmp_var (by_ref ? build_pointer_type (orig_type) : orig_type);
+
+      g = gimplify_assign (t, bias, &ilist);
+      gsi_insert_seq_before (&da_gsi, ilist, GSI_NEW_STMT);
+      *da_op_p = gimple_assign_lhs (g);
+
+      if (by_ref)
+	*da_op_p = build2 (MEM_REF, TREE_TYPE (orig_type), *da_op_p,
+			   build_int_cst (orig_type, 0));
+      *da_dims_p = TREE_CHAIN (*da_dims_p);
+    }
+
+  return da_gsi;
+}
+
 /* Callback for lower_omp_1.  Return non-NULL if *tp needs to be
    regimplified.  If DATA is non-NULL, lower_omp_1 is outside
    of OMP context, but with task_shared_vars set.  */
@@ -17681,6 +18087,51 @@
 	  }
       /* FALLTHRU */
     default:
+
+      /* If we detect the start of a dynamic array reference sequence, scan
+	 and do the needed adjustments.  */
+      tree da_dims, *da_op_p;
+      omp_context *da_ctx = ctx;
+      if (da_ctx && dynamic_array_reference_start (stmt, &da_ctx,
+						   &da_op_p, &da_dims))
+	{
+	  bool started = false;
+	  tree orig_da = *da_op_p;
+	  tree da_type = TREE_TYPE (orig_da);
+	  tree next_da_op;
+
+	  gimple_stmt_iterator da_gsi = *gsi_p, new_gsi;
+	  while (da_op_p)
+	    {
+	      if (!is_gimple_assign (gsi_stmt (da_gsi))
+		  || ((gimple_assign_single_p (gsi_stmt (da_gsi))
+		       || gimple_assign_cast_p (gsi_stmt (da_gsi)))
+		      && *da_op_p == gimple_assign_rhs1 (gsi_stmt (da_gsi))))
+		break;
+
+	      new_gsi = da_dimension_peel (da_ctx, da_gsi, orig_da,
+					   da_op_p, &da_type, &da_dims);
+	      if (!started)
+		{
+		  /* Point 'stmt' to the start of the newly added
+		     sequence.  */
+		  started = true;
+		  *gsi_p = new_gsi;
+		  stmt = gsi_stmt (*gsi_p);
+		}
+	      if (!da_dims)
+		break;
+
+	      next_da_op = gimple_assign_lhs (gsi_stmt (da_gsi));
+
+	      do {
+		gsi_next (&da_gsi);
+		da_op_p = scan_for_reference (gsi_stmt (da_gsi), next_da_op);
+	      }
+	      while (!da_op_p);
+	    }
+	}
+
       if ((ctx || task_shared_vars)
 	  && walk_gimple_op (stmt, lower_omp_regimplify_p,
 			     ctx ? NULL : &wi))
Index: include/gomp-constants.h
===================================================================
--- include/gomp-constants.h	(revision 244258)
+++ include/gomp-constants.h	(revision 244259)
@@ -40,6 +40,7 @@
 #define GOMP_MAP_FLAG_SPECIAL_0		(1 << 2)
 #define GOMP_MAP_FLAG_SPECIAL_1		(1 << 3)
 #define GOMP_MAP_FLAG_SPECIAL_2		(1 << 4)
+#define GOMP_MAP_FLAG_SPECIAL_3		(1 << 5)
 #define GOMP_MAP_FLAG_SPECIAL		(GOMP_MAP_FLAG_SPECIAL_1 \
 					 | GOMP_MAP_FLAG_SPECIAL_0)
 /* Flag to force a specific behavior (or else, trigger a run-time error).  */
@@ -128,7 +129,26 @@
     /* Decrement usage count and deallocate if zero.  */
     GOMP_MAP_RELEASE =			(GOMP_MAP_FLAG_SPECIAL_2
 					 | GOMP_MAP_DELETE),
-
+    /* Mapping kinds for dynamic arrays.  */
+    GOMP_MAP_DYNAMIC_ARRAY =		(GOMP_MAP_FLAG_SPECIAL_3),
+    GOMP_MAP_DYNAMIC_ARRAY_TO =		(GOMP_MAP_DYNAMIC_ARRAY
+					 | GOMP_MAP_TO),
+    GOMP_MAP_DYNAMIC_ARRAY_FROM =	(GOMP_MAP_DYNAMIC_ARRAY
+					 | GOMP_MAP_FROM),
+    GOMP_MAP_DYNAMIC_ARRAY_TOFROM =	(GOMP_MAP_DYNAMIC_ARRAY
+					 | GOMP_MAP_TOFROM),
+    GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO =	(GOMP_MAP_DYNAMIC_ARRAY_TO
+					 | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM =		(GOMP_MAP_DYNAMIC_ARRAY_FROM
+						 | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM =	(GOMP_MAP_DYNAMIC_ARRAY_TOFROM
+						 | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_DYNAMIC_ARRAY_ALLOC =		(GOMP_MAP_DYNAMIC_ARRAY
+						 | GOMP_MAP_ALLOC),
+    GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC =	(GOMP_MAP_DYNAMIC_ARRAY
+						 | GOMP_MAP_FORCE_ALLOC),
+    GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT =	(GOMP_MAP_DYNAMIC_ARRAY
+						 | GOMP_MAP_FORCE_PRESENT),
     /* Internal to GCC, not used in libgomp.  */
     /* Do not map, but pointer assign a pointer instead.  */
     GOMP_MAP_FIRSTPRIVATE_POINTER =	(GOMP_MAP_LAST | 1),
@@ -156,6 +176,8 @@
 #define GOMP_MAP_ALWAYS_P(X) \
   (GOMP_MAP_ALWAYS_TO_P (X) || ((X) == GOMP_MAP_ALWAYS_FROM))
 
+#define GOMP_MAP_DYNAMIC_ARRAY_P(X) \
+  ((X) & GOMP_MAP_DYNAMIC_ARRAY)
 
 /* Asynchronous behavior.  Keep in sync with
    libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_async_t.  */
Index: libgomp/target.c
===================================================================
--- libgomp/target.c	(revision 244258)
+++ libgomp/target.c	(revision 244259)
@@ -375,6 +375,140 @@
   return tgt->tgt_start + tgt->list[i].offset;
 }
 
+/* Dynamic array related data structures, interfaces with the compiler.  */
+
+struct da_dim {
+  size_t base;
+  size_t length;
+  size_t elem_size;
+  size_t is_array;
+};
+
+struct da_descr_type {
+  void *ptr;
+  size_t ndims;
+  struct da_dim dims[];
+};
+
+/* Internal dynamic array info struct, used only here inside the runtime. */
+
+struct da_info
+{
+  struct da_descr_type *descr;
+  size_t map_index;
+  size_t ptrblock_size;
+  size_t data_row_num;
+  size_t data_row_size;
+};
+
+static size_t
+gomp_dynamic_array_count_rows (struct da_descr_type *descr)
+{
+  size_t nrows = 1;
+  for (size_t d = 0; d < descr->ndims - 1; d++)
+    nrows *= descr->dims[d].length / sizeof (void *);
+  return nrows;
+}
+
+static void
+gomp_dynamic_array_compute_info (struct da_info *da)
+{
+  size_t d, n = 1;
+  struct da_descr_type *descr = da->descr;
+
+  da->ptrblock_size = 0;
+  for (d = 0; d < descr->ndims - 1; d++)
+    {
+      size_t dim_count = descr->dims[d].length / descr->dims[d].elem_size;
+      size_t dim_ptrblock_size = (descr->dims[d + 1].is_array
+				  ? 0 : descr->dims[d].length * n);
+      da->ptrblock_size += dim_ptrblock_size;
+      n *= dim_count;
+    }
+  da->data_row_num = n;
+  da->data_row_size = descr->dims[d].length;
+}
+
+static void
+gomp_dynamic_array_fill_rows_1 (struct da_descr_type *descr, void *da,
+				size_t d, void ***row_ptr, size_t *count)
+{
+  if (d < descr->ndims - 1)
+    {
+      size_t elsize = descr->dims[d].elem_size;
+      size_t n = descr->dims[d].length / elsize;
+      void *p = da + descr->dims[d].base;
+      for (size_t i = 0; i < n; i++)
+	{
+	  void *ptr = p + i * elsize;
+	  /* Deref if next dimension is not array.  */
+	  if (!descr->dims[d + 1].is_array)
+	    ptr = *((void **) ptr);
+	  gomp_dynamic_array_fill_rows_1 (descr, ptr, d + 1, row_ptr, count);
+	}
+    }
+  else
+    {
+      **row_ptr = da + descr->dims[d].base;
+      *row_ptr += 1;
+      *count += 1;
+    }
+}
+
+static size_t
+gomp_dynamic_array_fill_rows (struct da_descr_type *descr, void *rows[])
+{
+  size_t count = 0;
+  void **p = rows;
+  gomp_dynamic_array_fill_rows_1 (descr, descr->ptr, 0, &p, &count);
+  return count;
+}
+
+static void *
+gomp_dynamic_array_create_ptrblock (struct da_info *da,
+				    void *tgt_addr, void *tgt_data_rows[])
+{
+  struct da_descr_type *descr = da->descr;
+  void *ptrblock = gomp_malloc (da->ptrblock_size);
+  void **curr_dim_ptrblock = (void **) ptrblock;
+  size_t n = 1;
+
+  for (size_t d = 0; d < descr->ndims - 1; d++)
+    {
+      int curr_dim_len = descr->dims[d].length;
+      int next_dim_len = descr->dims[d + 1].length;
+      int curr_dim_num = curr_dim_len / sizeof (void *);
+
+      void *next_dim_ptrblock
+	= (void *)(curr_dim_ptrblock + n * curr_dim_num);
+
+      for (int b = 0; b < n; b++)
+        for (int i = 0; i < curr_dim_num; i++)
+	  {
+	    if (d < descr->ndims - 2)
+	      {
+		void *ptr = (next_dim_ptrblock
+			     + b * curr_dim_num * next_dim_len
+			     + i * next_dim_len);
+		void *tgt_ptr = tgt_addr + (ptr - ptrblock);
+		curr_dim_ptrblock[b * curr_dim_num + i] = tgt_ptr;
+	      }
+	    else
+	      {
+		curr_dim_ptrblock[b * curr_dim_num + i]
+		  = tgt_data_rows[b * curr_dim_num + i];
+	      }
+	    void *addr = &curr_dim_ptrblock[b * curr_dim_num + i];
+	    assert (ptrblock <= addr && addr < ptrblock + da->ptrblock_size);
+	  }
+
+      n *= curr_dim_num;
+      curr_dim_ptrblock = next_dim_ptrblock;
+    }
+  assert (n == da->data_row_num);
+  return ptrblock;
+}
+
 attribute_hidden struct target_mem_desc *
 gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 	       void **hostaddrs, void **devaddrs, size_t *sizes, void *kinds,
@@ -386,9 +520,29 @@
   const int typemask = short_mapkind ? 0xff : 0x7;
   struct splay_tree_s *mem_map = &devicep->mem_map;
   struct splay_tree_key_s cur_node;
-  struct target_mem_desc *tgt
-    = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
-  tgt->list_count = mapnum;
+  struct target_mem_desc *tgt;
+
+  size_t da_data_row_num = 0, row_start = 0;
+  size_t da_info_num = 0, da_index;
+  struct da_info *da_info = NULL;
+  struct target_var_desc *row_desc;
+  uintptr_t target_row_addr;
+  void **host_data_rows = NULL, **target_data_rows = NULL;
+  void *row;
+
+  for (i = 0; i < mapnum; i++)
+    {
+      int kind = get_kind (short_mapkind, kinds, i);
+      if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	{
+	  da_data_row_num += gomp_dynamic_array_count_rows (hostaddrs[i]);
+	  da_info_num += 1;
+	}
+    }
+
+  tgt = gomp_malloc (sizeof (*tgt)
+		     + sizeof (tgt->list[0]) * (mapnum + da_data_row_num));
+  tgt->list_count = mapnum + da_data_row_num;
   tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
   tgt->device_descr = devicep;
 
@@ -399,6 +553,14 @@
       return tgt;
     }
 
+  if (da_info_num)
+    da_info = gomp_alloca (sizeof (struct da_info) * da_info_num);
+  if (da_data_row_num)
+    {
+      host_data_rows = gomp_malloc (sizeof (void *) * da_data_row_num);
+      target_data_rows = gomp_malloc (sizeof (void *) * da_data_row_num);
+    }
+
   tgt_align = sizeof (void *);
   tgt_size = 0;
   if (pragma_kind == GOMP_MAP_VARS_TARGET)
@@ -416,7 +578,7 @@
       return NULL;
     }
 
-  for (i = 0; i < mapnum; i++)
+  for (i = 0, da_index = 0; i < mapnum; i++)
     {
       int kind = get_kind (short_mapkind, kinds, i);
       if (hostaddrs[i] == NULL
@@ -482,6 +644,20 @@
 	  has_firstprivate = true;
 	  continue;
 	}
+      else if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	{
+	  /* Ignore dynamic arrays for now, we process them together
+	     later.  */
+	  tgt->list[i].key = NULL;
+	  tgt->list[i].offset = 0;
+	  not_found_cnt++;
+
+	  struct da_info *da = &da_info[da_index++];
+	  da->descr = (struct da_descr_type *) hostaddrs[i];
+	  da->map_index = i;
+	  continue;
+	}
+
       cur_node.host_start = (uintptr_t) hostaddrs[i];
       if (!GOMP_MAP_POINTER_P (kind & typemask))
 	cur_node.host_end = cur_node.host_start + sizes[i];
@@ -545,6 +721,55 @@
 	}
     }
 
+  /* For dynamic arrays. Each data row is one target item, separated from
+     the normal map clause items, hence we order them after mapnum.  */
+  for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
+    {
+      int kind = get_kind (short_mapkind, kinds, i);
+      if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	continue;
+
+      struct da_info *da = &da_info[da_index++];
+      struct da_descr_type *descr = da->descr;
+      size_t nr;
+
+      gomp_dynamic_array_compute_info (da);
+
+      /* We have allocated space in host/target_data_rows to place all the
+	 row data block pointers, now we can start filling them in.  */
+      nr = gomp_dynamic_array_fill_rows (descr, &host_data_rows[row_start]);
+      assert (nr == da->data_row_num);
+
+      size_t align = (size_t) 1 << (kind >> rshift);
+      if (tgt_align < align)
+	tgt_align = align;
+      tgt_size = (tgt_size + align - 1) & ~(align - 1);
+      tgt_size += da->ptrblock_size;
+
+      for (size_t j = 0; j < da->data_row_num; j++)
+	{
+	  row = host_data_rows[row_start + j];
+	  row_desc = &tgt->list[mapnum + row_start + j];
+
+	  cur_node.host_start = (uintptr_t) row;
+	  cur_node.host_end = cur_node.host_start + da->data_row_size;
+	  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+	  if (n)
+	    {
+	      assert (n->refcount != REFCOUNT_LINK);
+	      gomp_map_vars_existing (devicep, n, &cur_node, row_desc,
+				      kind & typemask);	      
+	    }
+	  else
+	    {
+	      tgt_size = (tgt_size + align - 1) & ~(align - 1);
+	      tgt_size += da->data_row_size;
+	      not_found_cnt++;
+	    }
+	}
+      row_start += da->data_row_num;
+    }
+
   if (devaddrs)
     {
       if (mapnum != 1)
@@ -675,6 +900,15 @@
 	      default:
 		break;
 	      }
+
+	    if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	      {
+		tgt->list[i].key = &array->key;
+		tgt->list[i].key->tgt = tgt;
+		array++;
+		continue;
+	      }
+
 	    splay_tree_key k = &array->key;
 	    k->host_start = (uintptr_t) hostaddrs[i];
 	    if (!GOMP_MAP_POINTER_P (kind & typemask))
@@ -825,8 +1059,110 @@
 		array++;
 	      }
 	  }
+
+      /* Processing of dynamic array rows.  */
+      for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
+	{
+	  int kind = get_kind (short_mapkind, kinds, i);
+	  if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	    continue;
+
+	  struct da_info *da = &da_info[da_index++];
+	  assert (da->descr == hostaddrs[i]);
+
+	  /* The map for the dynamic array itself is never copied from during
+	     unmapping; it's the data rows that count.  The copy-from flags are
+	     set to false here.  */
+	  tgt->list[i].copy_from = false;
+	  tgt->list[i].always_copy_from = false;
+
+	  size_t align = (size_t) 1 << (kind >> rshift);
+	  tgt_size = (tgt_size + align - 1) & ~(align - 1);
+
+	  /* For the map of the dynamic array itself, adjust so that the passed
+	     device address points to the beginning of the ptrblock.  */
+	  tgt->list[i].key->tgt_offset = tgt_size;
+
+	  void *target_ptrblock = (void*) tgt->tgt_start + tgt_size;
+	  tgt_size += da->ptrblock_size;
+
+	  /* Add splay key for each data row in current DA.  */
+	  for (size_t j = 0; j < da->data_row_num; j++)
+	    {
+	      row = host_data_rows[row_start + j];
+	      row_desc = &tgt->list[mapnum + row_start + j];
+
+	      cur_node.host_start = (uintptr_t) row;
+	      cur_node.host_end = cur_node.host_start + da->data_row_size;
+	      splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+	      if (n)
+		{
+		  assert (n->refcount != REFCOUNT_LINK);
+		  gomp_map_vars_existing (devicep, n, &cur_node, row_desc,
+					  kind & typemask);
+		  target_row_addr = n->tgt->tgt_start + n->tgt_offset;
+		}
+	      else
+		{
+		  tgt->refcount++;
+
+		  splay_tree_key k = &array->key;
+		  k->host_start = (uintptr_t) row;
+		  k->host_end = k->host_start + da->data_row_size;
+
+		  k->tgt = tgt;
+		  k->refcount = 1;
+		  k->link_key = NULL;
+		  tgt_size = (tgt_size + align - 1) & ~(align - 1);
+		  target_row_addr = tgt->tgt_start + tgt_size;
+		  k->tgt_offset = tgt_size;
+		  tgt_size += da->data_row_size;
+
+		  row_desc->key = k;
+		  row_desc->copy_from
+		    = GOMP_MAP_COPY_FROM_P (kind & typemask);
+		  row_desc->always_copy_from
+		    = GOMP_MAP_COPY_FROM_P (kind & typemask);
+		  row_desc->offset = 0;
+		  row_desc->length = da->data_row_size;
+
+		  array->left = NULL;
+		  array->right = NULL;
+		  splay_tree_insert (mem_map, array);
+
+		  if (GOMP_MAP_COPY_TO_P (kind & typemask))
+		    gomp_copy_host2dev (devicep,
+					(void *) tgt->tgt_start + k->tgt_offset,
+					(void *) k->host_start,
+					da->data_row_size);
+		  array++;
+		}
+	      target_data_rows[row_start + j] = (void *) target_row_addr;
+	    }
+
+	  /* Now we have the target memory allocated, and target offsets of all
+	     row blocks assigned and calculated, we can construct the
+	     accelerator side ptrblock and copy it in.  */
+	  if (da->ptrblock_size)
+	    {
+	      void *ptrblock = gomp_dynamic_array_create_ptrblock
+		(da, target_ptrblock, target_data_rows + row_start);
+	      gomp_copy_host2dev (devicep, target_ptrblock, ptrblock,
+				  da->ptrblock_size);
+	      free (ptrblock);
+	    }
+
+	  row_start += da->data_row_num;
+	}
+      assert (row_start == da_data_row_num && da_index == da_info_num);
     }
 
+  if (da_data_row_num)
+    {
+      free (host_data_rows);
+      free (target_data_rows);
+    }
+
   if (pragma_kind == GOMP_MAP_VARS_TARGET)
     {
       for (i = 0; i < mapnum; i++)
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/da-3.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/da-3.c	(revision 0)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/da-3.c	(revision 244259)
@@ -0,0 +1,45 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <assert.h>
+#include "da-utils.h"
+
+int main (void)
+{
+  int n = 20, x = 5, y = 12;
+  int *****a = (int *****) create_da (sizeof (int), n, 5);
+
+  int sum1 = 0, sum2 = 0, sum3 = 0;
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	for (int l = 0; l < n; l++)
+	  for (int m = 0; m < n; m++)
+	    {
+	      a[i][j][k][l][m] = 1;
+	      sum1++;
+	    }
+
+  #pragma acc parallel copy (a[x:y][x:y][x:y][x:y][x:y]) copy(sum2)
+  {
+    for (int i = x; i < x + y; i++)
+      for (int j = x; j < x + y; j++)
+	for (int k = x; k < x + y; k++)
+	  for (int l = x; l < x + y; l++)
+	    for (int m = x; m < x + y; m++)
+	      {
+		a[i][j][k][l][m] = 0;
+		sum2++;
+	      }
+  }
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	for (int l = 0; l < n; l++)
+	  for (int m = 0; m < n; m++)
+	    sum3 += a[i][j][k][l][m];
+
+  assert (sum1 == sum2 + sum3);
+  return 0;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/da-4.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/da-4.c	(revision 0)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/da-4.c	(revision 244259)
@@ -0,0 +1,36 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <assert.h>
+#include "da-utils.h"
+
+int main (void)
+{
+  int n = 128;
+  double ***a = (double ***) create_da (sizeof (double), n, 3);
+  double ***b = (double ***) create_da (sizeof (double), n, 3);
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	a[i][j][k] = i + j + k + i * j * k;
+
+  /* This test exercises async copyout of dynamic array rows.  */
+  #pragma acc parallel copyin(a[0:n][0:n][0:n]) copyout(b[0:n][0:n][0:n]) async(5)
+  {
+    #pragma acc loop gang
+    for (int i = 0; i < n; i++)
+      #pragma acc loop vector
+      for (int j = 0; j < n; j++)
+	for (int k = 0; k < n; k++)
+	  b[i][j][k] = a[i][j][k] * 2.0;
+  }
+
+  #pragma acc wait (5)
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	assert (b[i][j][k] == a[i][j][k] * 2.0);
+
+  return 0;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/da-utils.h
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/da-utils.h	(revision 0)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/da-utils.h	(revision 244259)
@@ -0,0 +1,44 @@
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+#include <stdint.h>
+
+/* Allocate and create a pointer based NDIMS-dimensional array,
+   each dimension DIMLEN long, with ELSIZE sized data elements.  */
+void *
+create_da (size_t elsize, int dimlen, int ndims)
+{
+  size_t blk_size = 0;
+  size_t n = 1;
+
+  for (int i = 0; i < ndims - 1; i++)
+    {
+      n *= dimlen;
+      blk_size += sizeof (void *) * n;
+    }
+  size_t data_rows_num = n;
+  size_t data_rows_offset = blk_size;
+  blk_size += elsize * n * dimlen;
+
+  void *blk = (void *) malloc (blk_size);
+  memset (blk, 0, blk_size);
+  void **curr_dim = (void **) blk;
+  n = 1;
+
+  for (int d = 0; d < ndims - 1; d++)
+    {
+      uintptr_t next_dim = (uintptr_t) (curr_dim + n * dimlen);
+      size_t next_dimlen = dimlen * (d < ndims - 2 ? sizeof (void *) : elsize);
+
+      for (int b = 0; b < n; b++)
+        for (int i = 0; i < dimlen; i++)
+	  if (d < ndims - 1)
+	    curr_dim[b * dimlen + i]
+	      = (void*) (next_dim + b * dimlen * next_dimlen + i * next_dimlen);
+
+      n *= dimlen;
+      curr_dim = (void**) next_dim;
+    }
+  assert (n == data_rows_num);
+  return blk;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/da-1.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/da-1.c	(revision 0)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/da-1.c	(revision 244259)
@@ -0,0 +1,103 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <stdlib.h>
+#include <assert.h>
+
+#define n 100
+#define m 100
+
+int b[n][m];
+
+void
+test1 (void)
+{
+  int i, j, *a[100];
+
+  /* Array of pointers form test.  */
+  for (i = 0; i < n; i++)
+    {
+      a[i] = (int *)malloc (sizeof (int) * m);
+      for (j = 0; j < m; j++)
+	b[i][j] = j - i;
+    }
+
+  #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b)
+  for (i = 0; i < n; i++)
+    #pragma acc loop
+    for (j = 0; j < m; j++)
+      a[i][j] = b[i][j];
+
+  for (i = 0; i < n; i++)
+    {
+      for (j = 0; j < m; j++)
+	assert (a[i][j] == b[i][j]);
+      /* Clean up.  */
+      free (a[i]);
+    }
+}
+
+void
+test2 (void)
+{
+  int i, j, **a = (int **) malloc (sizeof (int *) * n);
+
+  /* Separately allocated blocks.  */
+  for (i = 0; i < n; i++)
+    {
+      a[i] = (int *)malloc (sizeof (int) * m);
+      for (j = 0; j < m; j++)
+	b[i][j] = j - i;
+    }
+
+  #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b)
+  for (i = 0; i < n; i++)
+    #pragma acc loop
+    for (j = 0; j < m; j++)
+      a[i][j] = b[i][j];
+
+  for (i = 0; i < n; i++)
+    {
+      for (j = 0; j < m; j++)
+	assert (a[i][j] == b[i][j]);
+      /* Clean up.  */
+      free (a[i]);
+    }
+  free (a);
+}
+
+void
+test3 (void)
+{
+  int i, j, **a = (int **) malloc (sizeof (int *) * n);
+  a[0] = (int *) malloc (sizeof (int) * n * m);
+
+  /* Rows allocated in one contiguous block.  */
+  for (i = 0; i < n; i++)
+    {
+      a[i] = *a + i * m;
+      for (j = 0; j < m; j++)
+	b[i][j] = j - i;
+    }
+
+  #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b)
+  for (i = 0; i < n; i++)
+    #pragma acc loop
+    for (j = 0; j < m; j++)
+      a[i][j] = b[i][j];
+
+  for (i = 0; i < n; i++)
+    for (j = 0; j < m; j++)
+      assert (a[i][j] == b[i][j]);
+
+  free (a[0]);
+  free (a);
+}
+
+int
+main (void)
+{
+  test1 ();
+  test2 ();
+  test3 ();
+  return 0;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/da-2.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/da-2.c	(revision 0)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/da-2.c	(revision 244259)
@@ -0,0 +1,37 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <assert.h>
+#include "da-utils.h"
+
+int
+main (void)
+{
+  int n = 10;
+  int ***a = (int ***) create_da (sizeof (int), n, 3);
+  int ***b = (int ***) create_da (sizeof (int), n, 3);
+  int ***c = (int ***) create_da (sizeof (int), n, 3);
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	{
+	  a[i][j][k] = i + j * k + k;
+	  b[i][j][k] = j + k * i + i * j;
+	  c[i][j][k] = a[i][j][k];
+	}
+
+  #pragma acc parallel copy (a[0:n][0:n][0:n]) copyin (b[0:n][0:n][0:n])
+  {
+    for (int i = 0; i < n; i++)
+      for (int j = 0; j < n; j++)
+	for (int k = 0; k < n; k++)
+	  a[i][j][k] += b[k][j][i] + i + j + k;
+  }
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	assert (a[i][j][k] == c[i][j][k] + b[k][j][i] + i + j + k);
+
+  return 0;
+}


* [PATCH, OpenACC, 0/8] Multi-dimensional dynamic array support for OpenACC data clauses
@ 2018-10-16 12:56 ` Chung-Lin Tang
  2018-10-16 12:56   ` [PATCH, OpenACC, 1/8] Multi-dimensional dynamic array support for OpenACC data clauses, gomp-constants.h additions Chung-Lin Tang
  0 siblings, 1 reply; 24+ messages in thread
From: Chung-Lin Tang @ 2018-10-16 12:56 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge; +Cc: Moore, Catherine

Hi Jakub, this patch set implements OpenACC functionality, but most of it touches
shared code, so I have CCed both you and Thomas.

This patch set adds the capability to handle C/C++ non-contiguous, dynamically
allocated multi-dimensional arrays in OpenACC data clauses:

int *a[100], **b;

#pragma acc parallel copyin (a[0:n][0:m], b[1:x][5:y]) // re-constructs array slices on GPU and copies data in
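To make "re-constructs array slices" a bit more concrete, here is a rough
sketch of the bookkeeping needed for such a section: how many bytes of
pointers have to be rebuilt on the device, and how many innermost data rows
have to be mapped. It is modelled on what the libgomp part of this series
computes, but it is only an illustration: the demo_* names are made up, and
every outer dimension is assumed to be pointer based.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one dimension of an array section.
   All sizes are in bytes.  */
struct demo_sect_dim
{
  size_t length;    /* byte length of the selected range */
  size_t elem_size; /* size of one element of this dimension */
};

struct demo_layout
{
  size_t ptrblock_size; /* bytes of pointers to rebuild on the device */
  size_t data_row_num;  /* number of innermost data rows */
  size_t data_row_size; /* bytes per innermost data row */
};

/* Compute the layout of an NDIMS-dimensional section whose outer
   dimensions are all pointer based.  */
static struct demo_layout
demo_compute_layout (const struct demo_sect_dim *dims, size_t ndims)
{
  struct demo_layout l = { 0, 1, 0 };

  assert (ndims >= 1);
  for (size_t d = 0; d + 1 < ndims; d++)
    {
      /* Each pointer level needs one block of pointers for every
         combination of the indices of the levels above it.  */
      l.ptrblock_size += dims[d].length * l.data_row_num;
      l.data_row_num *= dims[d].length / dims[d].elem_size;
    }
  l.data_row_size = dims[ndims - 1].length;
  return l;
}
```

For b[1:x][5:y] with int **b this gives x pointers in the pointer block and
x data rows of y * sizeof (int) bytes each.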

We currently only allow arrays (e.g. []) at the "outermost" dimension, for example:

// These are all okay
int **p;
int ***m;
int ****n;
int *x[100];

int (*y)[100];   // not allowed

This restriction comes mostly from limiting the scope of the implementation; support
for the remaining cases could probably be added with reasonable effort.
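The difference between the accepted and rejected declarations is one of
memory layout, which the following sketch tries to show. With an array of
pointers every row is reached by loading a pointer first, whereas a
pointer-to-array-rows object addresses rows inside one contiguous block.
The demo_* helpers are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Byte distance between row 0 and row 1 of a pointer-to-array-rows object.
   The rows live in one contiguous block, so this is always the row size.  */
static size_t
demo_row_stride_ptr_to_rows (void)
{
  int (*y)[100] = malloc (2 * sizeof *y);
  assert (y != NULL);
  size_t stride = (size_t) ((char *) y[1] - (char *) y[0]);
  free (y);
  return stride;
}

/* For an array of pointers the rows are independent allocations; this
   returns 1 if the two row pointers refer to distinct blocks.  */
static int
demo_rows_are_separate_allocations (void)
{
  int *x[2];
  x[0] = malloc (100 * sizeof (int));
  x[1] = malloc (100 * sizeof (int));
  int ok = x[0] != NULL && x[1] != NULL && x[0] != x[1];
  free (x[0]);
  free (x[1]);
  return ok;
}
```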

I have added descriptions of each part of the implementation in the respective
patch mails. The test results are all okay, no regressions of any sort.
Asking for permission to apply to trunk.

Thanks,
Chung-Lin

2018-10-16  Chung-Lin Tang  <cltang@codesourcery.com>

	include/
	* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define.
	(enum gomp_map_kind): Add GOMP_MAP_DYNAMIC_ARRAY,
	GOMP_MAP_DYNAMIC_ARRAY_TO, GOMP_MAP_DYNAMIC_ARRAY_FROM,
	GOMP_MAP_DYNAMIC_ARRAY_TOFROM, GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO,
	GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM, GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM,
	GOMP_MAP_DYNAMIC_ARRAY_ALLOC, GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC,
	GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT.
	(GOMP_MAP_DYNAMIC_ARRAY_P): Define.

	gcc/c/
	* c-typeck.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
	parameter, adjust recursive call site, add cases for allowing
	pointer based multi-dimensional arrays for OpenACC.
	(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
	handle non-contiguous case to create dynamic array map.

	gcc/cp/
	* semantics.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
	parameter, adjust recursive call site, add cases for allowing
	pointer based multi-dimensional arrays for OpenACC.
	(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
	handle non-contiguous case to create dynamic array map.

	gcc/
	* gimplify.c (gimplify_scan_omp_clauses): For dynamic array map kinds,
	make sure bias in each dimension are put into firstprivate variables.
	* tree-pretty-print.c (dump_omp_clauses): Add cases for printing
	GOMP_MAP_DYNAMIC_ARRAY map kinds.
	* omp-low.c (struct omp_context):
	Add 'hash_map<tree_operand_hash, tree> *dynamic_arrays' field, also
	added include of "tree-hash-traits.h".
	(append_field_to_record_type): New function.
	(create_dynamic_array_descr_type): Likewise.
	(create_dynamic_array_descr_init_code): Likewise.
	(new_omp_context): Add initialize of dynamic_arrays field.
	(delete_omp_context): Add delete of dynamic_arrays field.
	(scan_sharing_clauses): For dynamic array map kinds, check for
	supported dimension structure, and install dynamic array variable into
	current omp_context.
	(lower_omp_target): Add handling for dynamic array map kinds.
	(dynamic_array_lookup): New function.
	(dynamic_array_reference_start): Likewise.
	(scan_for_op): Likewise.
	(scan_for_reference): Likewise.
	(da_create_bias): Likewise.
	(da_dimension_peel): Likewise.
	(lower_omp_1): Add case to look for start of dynamic array reference,
	and handle bias adjustments for the code sequence.

	libgomp/
	PR other/76739
	* target.c (struct da_dim): New struct declaration.
	(struct da_descr_type): Likewise.
	(struct da_info): Likewise.
	(gomp_dynamic_array_count_rows): New function.
	(gomp_dynamic_array_compute_info): Likewise.
	(gomp_dynamic_array_fill_rows_1): Likewise.
	(gomp_dynamic_array_fill_rows): Likewise.
	(gomp_dynamic_array_create_ptrblock): Likewise.
	(gomp_map_vars): Add code to handle dynamic array map kinds.
	* testsuite/libgomp.oacc-c-c++-common/da-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/da-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/da-3.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/da-4.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/da-utils.h: New file.


* [PATCH, OpenACC, 1/8] Multi-dimensional dynamic array support for OpenACC data clauses, gomp-constants.h additions
@ 2018-10-16 12:56   ` Chung-Lin Tang
  2018-10-16 12:57     ` [PATCH, OpenACC, 2/8] Multi-dimensional dynamic array support for OpenACC data clauses, C/C++ front-end parts Chung-Lin Tang
  0 siblings, 1 reply; 24+ messages in thread
From: Chung-Lin Tang @ 2018-10-16 12:56 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 645 bytes --]

This part defines the GOMP_MAP_DYNAMIC_ARRAY_* symbols in include/gomp-constants.h.
It takes the next free bit, (1 << 5), for a new GOMP_MAP_FLAG_SPECIAL_3 flag and
builds the dynamic array map kinds on top of it.
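As a small illustration of how the new kinds are composed and tested, the
sketch below mirrors the shape of the definitions in the patch using
stand-in DEMO_* names. The values chosen for the transfer bits are
assumptions made for the example, and demo_base_kind is a hypothetical
helper that is not part of the patch.

```c
#include <assert.h>

/* Stand-in values for illustration only.  */
#define DEMO_MAP_TO                   (1 << 0)
#define DEMO_MAP_FROM                 (1 << 1)
#define DEMO_MAP_TOFROM               (DEMO_MAP_TO | DEMO_MAP_FROM)
#define DEMO_MAP_FLAG_SPECIAL_3       (1 << 5)

#define DEMO_MAP_DYNAMIC_ARRAY        (DEMO_MAP_FLAG_SPECIAL_3)
#define DEMO_MAP_DYNAMIC_ARRAY_TO     (DEMO_MAP_DYNAMIC_ARRAY | DEMO_MAP_TO)
#define DEMO_MAP_DYNAMIC_ARRAY_TOFROM (DEMO_MAP_DYNAMIC_ARRAY | DEMO_MAP_TOFROM)

/* Nonzero if X carries the dynamic array bit.  */
#define DEMO_MAP_DYNAMIC_ARRAY_P(X)   ((X) & DEMO_MAP_DYNAMIC_ARRAY)

/* Strip the dynamic array bit to recover the underlying transfer kind.
   Only meaningful for dynamic array kinds.  */
static int
demo_base_kind (int kind)
{
  assert (DEMO_MAP_DYNAMIC_ARRAY_P (kind));
  return kind & ~DEMO_MAP_DYNAMIC_ARRAY;
}
```

Note that a predicate written this way yields the flag value itself rather
than 1, so it should only be used in a boolean context.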

Thanks,
Chung-Lin Tang

	include/
	* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define.
	(enum gomp_map_kind): Add GOMP_MAP_DYNAMIC_ARRAY,
	GOMP_MAP_DYNAMIC_ARRAY_TO, GOMP_MAP_DYNAMIC_ARRAY_FROM,
	GOMP_MAP_DYNAMIC_ARRAY_TOFROM, GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO,
	GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM, GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM,
	GOMP_MAP_DYNAMIC_ARRAY_ALLOC, GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC,
	GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT.
	(GOMP_MAP_DYNAMIC_ARRAY_P): Define.

[-- Attachment #2: openacc-da-01.gomp-constants.patch --]
[-- Type: text/plain, Size: 2086 bytes --]

diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index ccfb657..f25169c 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -40,6 +40,7 @@
 #define GOMP_MAP_FLAG_SPECIAL_0		(1 << 2)
 #define GOMP_MAP_FLAG_SPECIAL_1		(1 << 3)
 #define GOMP_MAP_FLAG_SPECIAL_2		(1 << 4)
+#define GOMP_MAP_FLAG_SPECIAL_3		(1 << 5)
 #define GOMP_MAP_FLAG_SPECIAL		(GOMP_MAP_FLAG_SPECIAL_1 \
 					 | GOMP_MAP_FLAG_SPECIAL_0)
 /* Flag to force a specific behavior (or else, trigger a run-time error).  */
@@ -128,6 +129,26 @@ enum gomp_map_kind
     /* Decrement usage count and deallocate if zero.  */
     GOMP_MAP_RELEASE =			(GOMP_MAP_FLAG_SPECIAL_2
 					 | GOMP_MAP_DELETE),
+    /* Mapping kinds for dynamic arrays.  */
+    GOMP_MAP_DYNAMIC_ARRAY =		(GOMP_MAP_FLAG_SPECIAL_3),
+    GOMP_MAP_DYNAMIC_ARRAY_TO =		(GOMP_MAP_DYNAMIC_ARRAY
+					 | GOMP_MAP_TO),
+    GOMP_MAP_DYNAMIC_ARRAY_FROM =	(GOMP_MAP_DYNAMIC_ARRAY
+					 | GOMP_MAP_FROM),
+    GOMP_MAP_DYNAMIC_ARRAY_TOFROM =	(GOMP_MAP_DYNAMIC_ARRAY
+					 | GOMP_MAP_TOFROM),
+    GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO =	(GOMP_MAP_DYNAMIC_ARRAY_TO
+					 | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM =		(GOMP_MAP_DYNAMIC_ARRAY_FROM
+						 | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM =	(GOMP_MAP_DYNAMIC_ARRAY_TOFROM
+						 | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_DYNAMIC_ARRAY_ALLOC =		(GOMP_MAP_DYNAMIC_ARRAY
+						 | GOMP_MAP_ALLOC),
+    GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC =	(GOMP_MAP_DYNAMIC_ARRAY
+						 | GOMP_MAP_FORCE_ALLOC),
+    GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT =	(GOMP_MAP_DYNAMIC_ARRAY
+						 | GOMP_MAP_FORCE_PRESENT),
 
     /* Internal to GCC, not used in libgomp.  */
     /* Do not map, but pointer assign a pointer instead.  */
@@ -156,6 +177,8 @@ enum gomp_map_kind
 #define GOMP_MAP_ALWAYS_P(X) \
   (GOMP_MAP_ALWAYS_TO_P (X) || ((X) == GOMP_MAP_ALWAYS_FROM))
 
+#define GOMP_MAP_DYNAMIC_ARRAY_P(X) \
+  ((X) & GOMP_MAP_DYNAMIC_ARRAY)
 
 /* Asynchronous behavior.  Keep in sync with
    libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_async_t.  */


* [PATCH, OpenACC, 3/8] Multi-dimensional dynamic array support for OpenACC data clauses, gimplify patch
@ 2018-10-16 12:57       ` Chung-Lin Tang
  2018-10-16 13:13         ` [PATCH, OpenACC, 4/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: dynamic array descriptor creation Chung-Lin Tang
  0 siblings, 1 reply; 24+ messages in thread
From: Chung-Lin Tang @ 2018-10-16 12:57 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 378 bytes --]

This gimplify.c patch extends the OMP clause scanning to handle dynamic arrays.
Its main job is to handle the dimension biases of GOMP_MAP_DYNAMIC_ARRAY maps:
each non-constant bias is put into a firstprivate variable so that it is seen in
the omp-ctx.
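Why the bias has to reach the offload target can be seen from a purely
conceptual model, sketched below. It assumes that only the selected
elements of a dimension are copied, so an access by logical index has to be
shifted by the section's lower bound. This is not how the lowering actually
emits the adjustment, and the demo_* functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the section src[lo : len] into DST, which models a device-side
   buffer holding only the mapped elements.  */
static void
demo_copy_section (int *dst, const int *src, size_t lo, size_t len)
{
  memcpy (dst, src + lo, len * sizeof (int));
}

/* Read logical element I of the original array from the section copy.
   LO is the bias: the copy starts at logical index LO, so LO has to be
   subtracted before indexing.  */
static int
demo_read_biased (const int *section, size_t lo, size_t i)
{
  assert (i >= lo);
  return section[i - lo];
}
```

If the lower bound is a run-time value, the code doing the indexing needs
that value as well, which is the role the firstprivate variable plays here.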

Thanks,
Chung-Lin

	gcc/
	* gimplify.c (gimplify_scan_omp_clauses): For dynamic array map kinds,
	make sure bias in each dimension are put into firstprivate variables.

[-- Attachment #2: openacc-da-03.gimplify.patch --]
[-- Type: text/plain, Size: 1323 bytes --]

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 781d430..09ef876 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -7901,8 +7901,28 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 	  if (OMP_CLAUSE_SIZE (c) == NULL_TREE)
 	    OMP_CLAUSE_SIZE (c) = DECL_P (decl) ? DECL_SIZE_UNIT (decl)
 				  : TYPE_SIZE_UNIT (TREE_TYPE (decl));
-	  if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p,
-			     NULL, is_gimple_val, fb_rvalue) == GS_ERROR)
+	  if (OMP_CLAUSE_SIZE (c)
+	      && TREE_CODE (OMP_CLAUSE_SIZE (c)) == TREE_LIST
+	      && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+	    {
+	      tree dims = OMP_CLAUSE_SIZE (c);
+	      for (tree t = dims; t; t = TREE_CHAIN (t))
+		{
+		  /* If a dimension bias isn't a constant, we have to ensure
+		     that the value gets transferred to the offload target.  */
+		  tree low_bound = TREE_PURPOSE (t);
+		  if (TREE_CODE (low_bound) != INTEGER_CST)
+		    {
+		      low_bound = get_initialized_tmp_var (low_bound, pre_p,
+							   NULL, false);
+		      omp_add_variable (ctx, low_bound,
+					GOVD_FIRSTPRIVATE | GOVD_SEEN);
+		      TREE_PURPOSE (t) = low_bound;
+		    }
+		}
+	    }
+	  else if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p,
+				  NULL, is_gimple_val, fb_rvalue) == GS_ERROR)
 	    {
 	      remove = true;
 	      break;


* [PATCH, OpenACC, 2/8] Multi-dimensional dynamic array support for OpenACC data clauses, C/C++ front-end parts
@ 2018-10-16 12:57     ` Chung-Lin Tang
  2018-10-16 12:57       ` [PATCH, OpenACC, 3/8] Multi-dimensional dynamic array support for OpenACC data clauses, gimplify patch Chung-Lin Tang
  0 siblings, 1 reply; 24+ messages in thread
From: Chung-Lin Tang @ 2018-10-16 12:57 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 903 bytes --]

These are the parts for the C/C++ front-ends. We now allow certain non-contiguous
cases under OpenACC, and hand the base/length pair of each array dimension to
the middle-end as a TREE_LIST stored in OMP_CLAUSE_SIZE.
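
As a rough model of that list, the following sketch builds the base/length
pairs for a section such as f[10:20][0:30]. The struct dim_node type and the
dim_cons, build_dims and free_dims helpers are hypothetical stand-ins for
TREE_LIST nodes and tree_cons, not GCC code, and the last-to-first visiting
order is an assumption about how the front-end loop walks the subscripts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one TREE_LIST node: the PURPOSE slot holds the
   low bound, the VALUE slot holds the length, CHAIN links the next
   dimension.  */
struct dim_node
{
  long low_bound;
  long length;
  struct dim_node *chain;
};

/* Analogue of tree_cons: prepend a (low_bound, length) pair to LIST.  */
static struct dim_node *
dim_cons (long low_bound, long length, struct dim_node *list)
{
  struct dim_node *n = malloc (sizeof *n);
  assert (n != NULL);
  n->low_bound = low_bound;
  n->length = length;
  n->chain = list;
  return n;
}

/* Build the list for N subscripts given in source order.  Visiting them
   last-to-first while prepending leaves the first dimension at the head.  */
static struct dim_node *
build_dims (const long (*sections)[2], size_t n)
{
  struct dim_node *dims = NULL;
  for (size_t i = n; i > 0; i--)
    dims = dim_cons (sections[i - 1][0], sections[i - 1][1], dims);
  return dims;
}

/* Release a list created by build_dims.  */
static void
free_dims (struct dim_node *dims)
{
  while (dims)
    {
      struct dim_node *next = dims->chain;
      free (dims);
      dims = next;
    }
}
```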

Thanks,
Chung-Lin

	gcc/c/
	* c-typeck.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
	parameter, adjust recursive call site, add cases for allowing
	pointer based multi-dimensional arrays for OpenACC.
	(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
	handle non-contiguous case to create dynamic array map.

	gcc/cp/
	* semantics.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
	parameter, adjust recursive call site, add cases for allowing
	pointer based multi-dimensional arrays for OpenACC.
	(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
	handle non-contiguous case to create dynamic array map.

[-- Attachment #2: openacc-da-02.c-cxx-front-ends.patch --]
[-- Type: text/plain, Size: 8041 bytes --]

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 0f639be..c273435 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -12409,7 +12409,7 @@ c_finish_omp_cancellation_point (location_t loc, tree clauses)
 static tree
 handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 			     bool &maybe_zero_len, unsigned int &first_non_one,
-			     enum c_omp_region_type ort)
+			     bool &non_contiguous, enum c_omp_region_type ort)
 {
   tree ret, low_bound, length, type;
   if (TREE_CODE (t) != TREE_LIST)
@@ -12494,7 +12494,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
     }
 
   ret = handle_omp_array_sections_1 (c, TREE_CHAIN (t), types,
-				     maybe_zero_len, first_non_one, ort);
+				     maybe_zero_len, first_non_one,
+				     non_contiguous, ort);
   if (ret == error_mark_node || ret == NULL_TREE)
     return ret;
 
@@ -12654,6 +12655,21 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 		    }
 		}
 	    }
+
+	  /* For OpenACC, if the low_bound/length suggest this is a subarray,
+	     and is referenced through a pointer, then mark this as
+	     non-contiguous.  */
+	  if (ort == C_ORT_ACC
+	      && types.length () > 0
+	      && (TREE_CODE (low_bound) != INTEGER_CST
+		  || integer_nonzerop (low_bound)
+		  || (length && (TREE_CODE (length) != INTEGER_CST
+				 || !tree_int_cst_equal (size, length)))))
+	    {
+	      tree x = types.last ();
+	      if (TREE_CODE (x) == POINTER_TYPE)
+		non_contiguous = true;
+	    }
 	}
       else if (length == NULL_TREE)
 	{
@@ -12695,13 +12711,16 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
       /* If there is a pointer type anywhere but in the very first
 	 array-section-subscript, the array section can't be contiguous.  */
       if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
-	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
+	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST
+	  && ort != C_ORT_ACC)
 	{
 	  error_at (OMP_CLAUSE_LOCATION (c),
 		    "array section is not contiguous in %qs clause",
 		    omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
 	  return error_mark_node;
 	}
+      else if (TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
+	non_contiguous = true;
     }
   else
     {
@@ -12729,10 +12748,11 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
 {
   bool maybe_zero_len = false;
   unsigned int first_non_one = 0;
+  bool non_contiguous = false;
   auto_vec<tree, 10> types;
   tree first = handle_omp_array_sections_1 (c, OMP_CLAUSE_DECL (c), types,
 					    maybe_zero_len, first_non_one,
-					    ort);
+					    non_contiguous, ort);
   if (first == error_mark_node)
     return true;
   if (first == NULL_TREE)
@@ -12765,6 +12785,7 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
       unsigned int num = types.length (), i;
       tree t, side_effects = NULL_TREE, size = NULL_TREE;
       tree condition = NULL_TREE;
+      tree da_dims = NULL_TREE;
 
       if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
 	maybe_zero_len = true;
@@ -12788,6 +12809,13 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
 	    length = fold_convert (sizetype, length);
 	  if (low_bound == NULL_TREE)
 	    low_bound = integer_zero_node;
+
+	  if (non_contiguous)
+	    {
+	      da_dims = tree_cons (low_bound, length, da_dims);
+	      continue;
+	    }
+
 	  if (!maybe_zero_len && i > first_non_one)
 	    {
 	      if (integer_nonzerop (low_bound))
@@ -12880,6 +12908,14 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
 		size = size_binop (MULT_EXPR, size, l);
 	    }
 	}
+      if (non_contiguous)
+	{
+	  int kind = OMP_CLAUSE_MAP_KIND (c);
+	  OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_DYNAMIC_ARRAY);
+	  OMP_CLAUSE_DECL (c) = t;
+	  OMP_CLAUSE_SIZE (c) = da_dims;
+	  return false;
+	}
       if (side_effects)
 	size = build2 (COMPOUND_EXPR, sizetype, side_effects, size);
       if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION)
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 85c7cfa..af7a1a6 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -4521,7 +4521,7 @@ omp_privatize_field (tree t, bool shared)
 static tree
 handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 			     bool &maybe_zero_len, unsigned int &first_non_one,
-			     enum c_omp_region_type ort)
+			     bool &non_contiguous, enum c_omp_region_type ort)
 {
   tree ret, low_bound, length, type;
   if (TREE_CODE (t) != TREE_LIST)
@@ -4604,7 +4604,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
       && TREE_CODE (TREE_CHAIN (t)) == FIELD_DECL)
     TREE_CHAIN (t) = omp_privatize_field (TREE_CHAIN (t), false);
   ret = handle_omp_array_sections_1 (c, TREE_CHAIN (t), types,
-				     maybe_zero_len, first_non_one, ort);
+				     maybe_zero_len, first_non_one,
+				     non_contiguous, ort);
   if (ret == error_mark_node || ret == NULL_TREE)
     return ret;
 
@@ -4776,6 +4777,21 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 		    }
 		}
 	    }
+
+	  /* For OpenACC, if the low_bound/length suggest this is a subarray,
+	     and is referenced through a pointer, then mark this as
+	     non-contiguous.  */
+	  if (ort == C_ORT_ACC
+	      && types.length () > 0
+	      && (TREE_CODE (low_bound) != INTEGER_CST
+		  || integer_nonzerop (low_bound)
+		  || (length && (TREE_CODE (length) != INTEGER_CST
+				 || !tree_int_cst_equal (size, length)))))
+	    {
+	      tree x = types.last ();
+	      if (TREE_CODE (x) == POINTER_TYPE)
+		non_contiguous = true;
+	    }
 	}
       else if (length == NULL_TREE)
 	{
@@ -4817,13 +4833,16 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
       /* If there is a pointer type anywhere but in the very first
 	 array-section-subscript, the array section can't be contiguous.  */
       if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
-	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
+	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST
+	  && ort != C_ORT_ACC)
 	{
 	  error_at (OMP_CLAUSE_LOCATION (c),
 		    "array section is not contiguous in %qs clause",
 		    omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
 	  return error_mark_node;
 	}
+      else if (TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
+	non_contiguous = true;
     }
   else
     {
@@ -4851,10 +4870,11 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
 {
   bool maybe_zero_len = false;
   unsigned int first_non_one = 0;
+  bool non_contiguous = false;
   auto_vec<tree, 10> types;
   tree first = handle_omp_array_sections_1 (c, OMP_CLAUSE_DECL (c), types,
 					    maybe_zero_len, first_non_one,
-					    ort);
+					    non_contiguous, ort);
   if (first == error_mark_node)
     return true;
   if (first == NULL_TREE)
@@ -4888,6 +4908,7 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
       unsigned int num = types.length (), i;
       tree t, side_effects = NULL_TREE, size = NULL_TREE;
       tree condition = NULL_TREE;
+      tree da_dims = NULL_TREE;
 
       if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
 	maybe_zero_len = true;
@@ -4913,6 +4934,13 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
 	    length = fold_convert (sizetype, length);
 	  if (low_bound == NULL_TREE)
 	    low_bound = integer_zero_node;
+
+	  if (non_contiguous)
+	    {
+	      da_dims = tree_cons (low_bound, length, da_dims);
+	      continue;
+	    }
+
 	  if (!maybe_zero_len && i > first_non_one)
 	    {
 	      if (integer_nonzerop (low_bound))
@@ -5000,6 +5028,14 @@ handle_omp_array_sections (tree c, enum c_omp_region_type ort)
 	}
       if (!processing_template_decl)
 	{
+	  if (non_contiguous)
+	    {
+	      int kind = OMP_CLAUSE_MAP_KIND (c);
+	      OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_DYNAMIC_ARRAY);
+	      OMP_CLAUSE_DECL (c) = t;
+	      OMP_CLAUSE_SIZE (c) = da_dims;
+	      return false;
+	    }
 	  if (side_effects)
 	    size = build2 (COMPOUND_EXPR, sizetype, side_effects, size);
 	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION)


* [PATCH, OpenACC, 4/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: dynamic array descriptor creation
@ 2018-10-16 13:13         ` Chung-Lin Tang
  2018-10-16 13:54           ` [PATCH, OpenACC, 5/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: bias scanning/adjustment during omp-lowering Chung-Lin Tang
  2018-12-13 14:52           ` [PATCH, OpenACC, 4/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: dynamic array descriptor creation Chung-Lin Tang
  0 siblings, 2 replies; 24+ messages in thread
From: Chung-Lin Tang @ 2018-10-16 13:13 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 1594 bytes --]

The next two patches make up the bulk of the middle-end compiler changes.

The first patch here implements the creation of dynamic array descriptors to
pass to the runtime, which is a different approach from expressing everything
with map clauses.

Because we support an arbitrary number of dimensions, adding more map kind cases
would complicate much of the compiler/runtime logic that handles the long map
sequences.

This implementation uses a descriptor struct created on the stack, and passes the
pointer to the descriptor through to the libgomp runtime, using the exact same
receiver field as the dynamic array itself.

The libgomp runtime then sets things up, and adjusts the device-side receiver
field pointer to point to the dynamic array structures created on the device.
I.e. the same receiver field serves as the descriptor address field on the
compiler side, and as the actual data address once we get to device code (an
important point that is worth spelling out).
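
For readers who want a picture of the descriptor, the following sketch mirrors
the field order that create_dynamic_array_descr_type lays out (starting pointer,
dimension count, then base/length/element size/is_array per dimension), with
base and length scaled to bytes as create_dynamic_array_descr_init_code does.
The C type and function names (struct da_dim, struct da_descr_2d,
make_descr_2d) are invented for illustration and are not the names used by the
compiler or by libgomp.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative per-dimension part of the descriptor; all fields are
   size_t here, matching the use of sizetype in the patch.  */
struct da_dim
{
  size_t base;       /* low bound, scaled to bytes */
  size_t length;     /* length, scaled to bytes */
  size_t elem_size;  /* size in bytes of one element of this dimension */
  size_t is_array;   /* 1 if this dimension has array type, else 0 */
};

/* Illustrative descriptor for a two-dimensional dynamic array.  */
struct da_descr_2d
{
  void *ptr;         /* starting pointer/array address */
  size_t dim_num;    /* number of dimensions */
  struct da_dim dims[2];
};

/* Fill a descriptor for 'int **a' with the section a[b0:l0][b1:l1].  */
static struct da_descr_2d
make_descr_2d (int **a, size_t b0, size_t l0, size_t b1, size_t l1)
{
  struct da_descr_2d d;
  assert (a != NULL);
  d.ptr = a;
  d.dim_num = 2;
  /* Dimension 0 indexes 'int *' elements reached through a pointer.  */
  d.dims[0] = (struct da_dim) { b0 * sizeof (int *), l0 * sizeof (int *),
				sizeof (int *), 0 };
  /* Dimension 1 indexes 'int' elements reached through a pointer.  */
  d.dims[1] = (struct da_dim) { b1 * sizeof (int), l1 * sizeof (int),
				sizeof (int), 0 };
  return d;
}
```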

Thanks,
Chung-Lin

	gcc/
	* omp-low.c (struct omp_context):
	Add 'hash_map<tree_operand_hash, tree> *dynamic_arrays' field, also
	added include of "tree-hash-traits.h".
	(append_field_to_record_type): New function.
	(create_dynamic_array_descr_type): Likewise.
	(create_dynamic_array_descr_init_code): Likewise.
	(new_omp_context): Add initialization of dynamic_arrays field.
	(delete_omp_context): Add deletion of dynamic_arrays field.
	(scan_sharing_clauses): For dynamic array map kinds, check for
	supported dimension structure, and install dynamic array variable into
	current omp_context.
	(lower_omp_target): Add handling for dynamic array map kinds.

[-- Attachment #2: openacc-da-04.omp-low.descr_create.patch --]
[-- Type: text/plain, Size: 10003 bytes --]

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 6a1cb05..4c44800 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -60,6 +60,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "hsa-common.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "tree-hash-traits.h"
 
 /* Lowering of OMP parallel and workshare constructs proceeds in two
    phases.  The first phase scans the function looking for OMP statements
@@ -124,6 +125,9 @@ struct omp_context
 
   /* True if this construct can be cancelled.  */
   bool cancellable;
+
+  /* Hash map of dynamic arrays in this context.  */
+  hash_map<tree_operand_hash, tree> *dynamic_arrays;
 };
 
 static splay_tree all_contexts;
@@ -843,6 +847,136 @@ omp_copy_decl (tree var, copy_body_data *cb)
   return error_mark_node;
 }
 
+/* Helper function for create_dynamic_array_descr_type(), to append a new field
+   to a record type.  */
+
+static void
+append_field_to_record_type (tree record_type, tree fld_ident, tree fld_type)
+{
+  tree *p, fld = build_decl (UNKNOWN_LOCATION, FIELD_DECL, fld_ident, fld_type);
+  DECL_CONTEXT (fld) = record_type;
+
+  for (p = &TYPE_FIELDS (record_type); *p; p = &DECL_CHAIN (*p))
+    ;
+  *p = fld;
+}
+
+/* Create type for dynamic array descriptor. Returns created type, and
+   returns the number of dimensions in *DIM_NUM.  */
+
+static tree
+create_dynamic_array_descr_type (tree decl, tree dims, int *dim_num)
+{
+  int n = 0;
+  tree da_descr_type, name, x;
+  gcc_assert (TREE_CODE (dims) == TREE_LIST);
+
+  da_descr_type = lang_hooks.types.make_type (RECORD_TYPE);
+  name = create_tmp_var_name (".omp_dynamic_array_descr_type");
+  name = build_decl (UNKNOWN_LOCATION, TYPE_DECL, name, da_descr_type);
+  DECL_ARTIFICIAL (name) = 1;
+  DECL_NAMELESS (name) = 1;
+  TYPE_NAME (da_descr_type) = name;
+  TYPE_ARTIFICIAL (da_descr_type) = 1;
+
+  /* Main starting pointer/array.  */
+  tree main_var_type = TREE_TYPE (decl);
+  if (TREE_CODE (main_var_type) == REFERENCE_TYPE)
+    main_var_type = TREE_TYPE (main_var_type);
+  append_field_to_record_type (da_descr_type, DECL_NAME (decl),
+			       (TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE
+				? main_var_type
+				: build_pointer_type (main_var_type)));
+  /* Number of dimensions.  */
+  append_field_to_record_type (da_descr_type, get_identifier ("$dim_num"),
+			       sizetype);
+
+  for (x = dims; x; x = TREE_CHAIN (x), n++)
+    {
+      char *fldname;
+      /* One for the start index.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_base", n);
+      append_field_to_record_type (da_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for the length.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_length", n);
+      append_field_to_record_type (da_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for the element size.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_elem_size", n);
+      append_field_to_record_type (da_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for is_array flag.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_is_array", n);
+      append_field_to_record_type (da_descr_type, get_identifier (fldname),
+				   sizetype);
+    }
+
+  layout_type (da_descr_type);
+  *dim_num = n;
+  return da_descr_type;
+}
+
+/* Generate code sequence for initializing dynamic array descriptor.  */
+
+static void
+create_dynamic_array_descr_init_code (tree da_descr, tree da_var,
+				      tree dimensions, int da_dim_num,
+				      gimple_seq *ilist)
+{
+  tree fld, fldref;
+  tree da_descr_type = TREE_TYPE (da_descr);
+  tree dim_type = TREE_TYPE (da_var);
+
+  fld = TYPE_FIELDS (da_descr_type);
+  fldref = omp_build_component_ref (da_descr, fld);
+  gimplify_assign (fldref, (TREE_CODE (dim_type) == ARRAY_TYPE
+			    ? build_fold_addr_expr (da_var) : da_var), ilist);
+
+  if (TREE_CODE (dim_type) == REFERENCE_TYPE)
+    dim_type = TREE_TYPE (dim_type);
+
+  fld = TREE_CHAIN (fld);
+  fldref = omp_build_component_ref (da_descr, fld);
+  gimplify_assign (fldref, build_int_cst (sizetype, da_dim_num), ilist);
+
+  while (dimensions)
+    {
+      tree dim_base = fold_convert (sizetype, TREE_PURPOSE (dimensions));
+      tree dim_length = fold_convert (sizetype, TREE_VALUE (dimensions));
+      tree dim_elem_size = TYPE_SIZE_UNIT (TREE_TYPE (dim_type));
+      tree dim_is_array = (TREE_CODE (dim_type) == ARRAY_TYPE
+			   ? integer_one_node : integer_zero_node);
+      /* Set base.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (da_descr, fld);
+      dim_base = fold_build2 (MULT_EXPR, sizetype, dim_base, dim_elem_size);
+      gimplify_assign (fldref, dim_base, ilist);
+
+      /* Set length.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (da_descr, fld);
+      dim_length = fold_build2 (MULT_EXPR, sizetype, dim_length, dim_elem_size);
+      gimplify_assign (fldref, dim_length, ilist);
+
+      /* Set elem_size.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (da_descr, fld);
+      dim_elem_size = fold_convert (sizetype, dim_elem_size);
+      gimplify_assign (fldref, dim_elem_size, ilist);
+
+      /* Set is_array flag.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (da_descr, fld);
+      dim_is_array = fold_convert (sizetype, dim_is_array);
+      gimplify_assign (fldref, dim_is_array, ilist);
+
+      dimensions = TREE_CHAIN (dimensions);
+      dim_type = TREE_TYPE (dim_type);
+    }
+  gcc_assert (TREE_CHAIN (fld) == NULL_TREE);
+}
+
 /* Create a new context, with OUTER_CTX being the surrounding context.  */
 
 static omp_context *
@@ -877,6 +1011,8 @@ new_omp_context (gimple *stmt, omp_context *outer_ctx)
 
   ctx->cb.decl_map = new hash_map<tree, tree>;
 
+  ctx->dynamic_arrays = new hash_map<tree_operand_hash, tree>;
+
   return ctx;
 }
 
@@ -951,6 +1087,8 @@ delete_omp_context (splay_tree_value value)
   if (is_task_ctx (ctx))
     finalize_task_copyfn (as_a <gomp_task *> (ctx->stmt));
 
+  delete ctx->dynamic_arrays;
+
   XDELETE (ctx);
 }
 
@@ -1256,6 +1394,42 @@ scan_sharing_clauses (tree clauses, omp_context *ctx,
 	      install_var_local (decl, ctx);
 	      break;
 	    }
+
+	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	      && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+	    {
+	      tree da_decl = OMP_CLAUSE_DECL (c);
+	      tree da_dimensions = OMP_CLAUSE_SIZE (c);
+	      tree da_type = TREE_TYPE (da_decl);
+	      bool by_ref = (TREE_CODE (da_type) == ARRAY_TYPE
+			     ? true : false);
+
+	      /* Checking code to ensure we only have arrays at top dimension.
+		 This limitation might be lifted in the future.  */
+	      if (TREE_CODE (da_type) == REFERENCE_TYPE)
+		da_type = TREE_TYPE (da_type);
+	      tree t = da_type, prev_t = NULL_TREE;
+	      while (t)
+		{
+		  if (TREE_CODE (t) == ARRAY_TYPE && prev_t)
+		    {
+		      error_at (gimple_location (ctx->stmt), "array types are"
+				" only allowed at outermost dimension of"
+				" dynamic array");
+		      break;
+		    }
+		  prev_t = t;
+		  t = TREE_TYPE (t);
+		}
+
+	      install_var_field (da_decl, by_ref, 3, ctx);
+	      tree new_var = install_var_local (da_decl, ctx);
+
+	      bool existed = ctx->dynamic_arrays->put (new_var, da_dimensions);
+	      gcc_assert (!existed);
+	      break;
+	    }
+
 	  if (DECL_P (decl))
 	    {
 	      if (DECL_SIZE (decl)
@@ -7687,6 +7861,15 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	  case GOMP_MAP_FORCE_PRESENT:
 	  case GOMP_MAP_FORCE_DEVICEPTR:
 	  case GOMP_MAP_DEVICE_RESIDENT:
+	  case GOMP_MAP_DYNAMIC_ARRAY_TO:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FROM:
+	  case GOMP_MAP_DYNAMIC_ARRAY_TOFROM:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM:
+	  case GOMP_MAP_DYNAMIC_ARRAY_ALLOC:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT:
 	  case GOMP_MAP_LINK:
 	    gcc_assert (is_gimple_omp_oacc (stmt));
 	    break;
@@ -7749,7 +7932,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	if (offloaded && !(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 			   && OMP_CLAUSE_MAP_IN_REDUCTION (c)))
 	  {
-	    x = build_receiver_ref (var, true, ctx);
+	    tree var_type = TREE_TYPE (var);
+	    bool rcv_by_ref =
+	      (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	       && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))
+	       && TREE_CODE (var_type) != ARRAY_TYPE
+	       ? false : true);
+
+	    x = build_receiver_ref (var, rcv_by_ref, ctx);
 	    tree new_var = lookup_decl (var, ctx);
 
 	    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
@@ -7993,6 +8183,25 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 		    avar = build_fold_addr_expr (avar);
 		    gimplify_assign (x, avar, &ilist);
 		  }
+		else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+			 && (OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_DYNAMIC_ARRAY))
+		  {
+		    int da_dim_num;
+		    tree dimensions = OMP_CLAUSE_SIZE (c);
+
+		    tree da_descr_type =
+		      create_dynamic_array_descr_type (OMP_CLAUSE_DECL (c),
+						       dimensions, &da_dim_num);
+		    tree da_descr =
+		      create_tmp_var_raw (da_descr_type, ".$omp_da_descr");
+		    gimple_add_tmp_var (da_descr);
+
+		    create_dynamic_array_descr_init_code
+		      (da_descr, ovar, dimensions, da_dim_num, &ilist);
+
+		    gimplify_assign (x, build_fold_addr_expr (da_descr),
+				     &ilist);
+		  }
 		else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE)
 		  {
 		    gcc_assert (is_gimple_omp_oacc (ctx->stmt));
@@ -8053,6 +8262,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 		  s = TREE_TYPE (s);
 		s = TYPE_SIZE_UNIT (s);
 	      }
+	    else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+		     && (OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_DYNAMIC_ARRAY))
+	      s = NULL_TREE;
 	    else
 	      s = OMP_CLAUSE_SIZE (c);
 	    if (s == NULL_TREE)


* [PATCH, OpenACC, 5/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: bias scanning/adjustment during omp-lowering
@ 2018-10-16 13:54           ` Chung-Lin Tang
  2018-10-16 14:11             ` [PATCH, OpenACC, 6/8] Multi-dimensional dynamic array support for OpenACC data clauses, tree pretty-printing additions Chung-Lin Tang
  0 siblings, 1 reply; 24+ messages in thread
From: Chung-Lin Tang @ 2018-10-16 13:54 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 779 bytes --]

This part is also in omp-low.c.

During omp-lowering we scan the code and, whenever a dynamic array access is
detected, adjust it to account for the bias of each dimension. This is required
to support copying only a section of each dimension in the general case.

The code is somewhat intricate, and I wonder whether this would be better
implemented in gimplify.c (though that is probably a non-trivial task as well).
Nevertheless, it is currently working.
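
The idea behind the bias can be sketched in plain C. In this simplified
one-dimensional model, a buffer holds only the copied section, yet the element
is still addressed with its original index; subtracting the bias (low bound
times element size, as in da_create_bias) from the byte offset makes that work.
The helper names da_bias_bytes and read_biased are made up for this example and
do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Byte bias for one dimension: low bound times element size.  */
static size_t
da_bias_bytes (size_t low_bound, size_t elem_size)
{
  return low_bound * elem_size;
}

/* Read element I (in original, host-side indexing) from SECTION, which
   holds only the elements starting at index LOW_BOUND.  The byte offset
   of I is reduced by the bias; this has the same effect as subtracting
   the bias from the base pointer before indexing.  */
static int
read_biased (const int *section, size_t low_bound, size_t i)
{
  assert (i >= low_bound);
  size_t offset = i * sizeof (int) - da_bias_bytes (low_bound, sizeof (int));
  return *(const int *) ((const char *) section + offset);
}
```

For a multi-dimensional access the same adjustment is applied once per
dimension, which is what the repeated calls to da_dimension_peel are for.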

Thanks,
Chung-Lin

	gcc/
	* omp-low.c (dynamic_array_lookup): New function.
	(dynamic_array_reference_start): Likewise.
	(scan_for_op): Likewise.
	(scan_for_reference): Likewise.
	(da_create_bias): Likewise.
	(da_dimension_peel): Likewise.
	(lower_omp_1): Add case to look for start of dynamic array reference,
	and handle bias adjustments for the code sequence.

[-- Attachment #2: openacc-da-05.omp-low.bias_adjust.patch --]
[-- Type: text/plain, Size: 7680 bytes --]

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 6a1cb05..4c44800 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -8734,6 +8946,201 @@ lower_omp_grid_body (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 		       gimple_build_omp_return (false));
 }
 
+/* Helper to lookup dynamic array through nested omp contexts. Returns
+   TREE_LIST of dimensions, and the CTX where it was found in *CTX_P.  */
+
+static tree
+dynamic_array_lookup (tree t, omp_context **ctx_p)
+{
+  omp_context *c = *ctx_p;
+  while (c)
+    {
+      tree *dims = c->dynamic_arrays->get (t);
+      if (dims)
+	{
+	  *ctx_p = c;
+	  return *dims;
+	}
+      c = c->outer;
+    }
+  return NULL_TREE;
+}
+
+/* Tests if this gimple STMT is the start of a dynamic array access sequence.
+   Returns true if found, and also returns the gimple operand ptr and
+   dimensions tree list through *OUT_REF and *OUT_DIMS respectively.  */
+
+static bool
+dynamic_array_reference_start (gimple *stmt, omp_context **ctx_p,
+			       tree **out_ref, tree *out_dims)
+{
+  if (gimple_code (stmt) == GIMPLE_ASSIGN)
+    for (unsigned i = 1; i < gimple_num_ops (stmt); i++)
+      {
+	tree *op = gimple_op_ptr (stmt, i), dims;
+	if (TREE_CODE (*op) == ARRAY_REF)
+	  op = &TREE_OPERAND (*op, 0);
+	if (TREE_CODE (*op) == MEM_REF)
+	  op = &TREE_OPERAND (*op, 0);
+	if ((dims = dynamic_array_lookup (*op, ctx_p)) != NULL_TREE)
+	  {
+	    *out_ref = op;
+	    *out_dims = dims;
+	    return true;
+	  }
+      }
+  return false;
+}
+
+static tree
+scan_for_op (tree *tp, int *walk_subtrees, void *data)
+{
+  struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
+  tree t = *tp;
+  tree op = (tree) wi->info;
+  *walk_subtrees = 1;
+  if (operand_equal_p (t, op, 0))
+    {
+      wi->info = tp;
+      return t;
+    }
+  return NULL_TREE;
+}
+
+static tree *
+scan_for_reference (gimple *stmt, tree op)
+{
+  struct walk_stmt_info wi;
+  memset (&wi, 0, sizeof (wi));
+  wi.info = op;
+  if (walk_gimple_op (stmt, scan_for_op, &wi))
+    return (tree *) wi.info;
+  return NULL;
+}
+
+static tree
+da_create_bias (tree orig_bias, tree unit_type)
+{
+  return build2 (MULT_EXPR, sizetype, fold_convert (sizetype, orig_bias),
+		 TYPE_SIZE_UNIT (unit_type));
+}
+
+/* Main worker for adjusting dynamic array accesses, handles the adjustment
+   of many cases of statement forms, and called multiple times to 'peel' away
+   each dimension.  */
+
+static gimple_stmt_iterator
+da_dimension_peel (omp_context *da_ctx,
+		   gimple_stmt_iterator da_gsi, tree orig_da,
+		   tree *da_op_p, tree *da_type_p, tree *da_dims_p)
+{
+  gimple *stmt = gsi_stmt (da_gsi);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs = gimple_assign_rhs1 (stmt);
+
+  if (gimple_num_ops (stmt) == 2
+      && TREE_CODE (rhs) == MEM_REF
+      && operand_equal_p (*da_op_p, TREE_OPERAND (rhs, 0), 0)
+      && !operand_equal_p (orig_da, TREE_OPERAND (rhs, 0), 0)
+      && (TREE_OPERAND (rhs, 1) == NULL_TREE
+	  || integer_zerop (TREE_OPERAND (rhs, 1))))
+    {
+      gcc_assert (TREE_CODE (TREE_TYPE (*da_type_p)) == POINTER_TYPE);
+      *da_type_p = TREE_TYPE (*da_type_p);
+    }
+  else 
+    {
+      gimple *g;
+      gimple_seq ilist = NULL;
+      tree bias, t;
+      tree op = *da_op_p;
+      tree orig_type = *da_type_p;
+      tree orig_bias = TREE_PURPOSE (*da_dims_p);
+      bool by_ref = false;
+
+      if (TREE_CODE (orig_bias) != INTEGER_CST)
+	orig_bias = lookup_decl (orig_bias, da_ctx);
+
+      if (gimple_num_ops (stmt) == 2)
+	{
+	  if (TREE_CODE (rhs) == ADDR_EXPR)
+	    {
+	      rhs = TREE_OPERAND (rhs, 0);
+	      *da_dims_p = NULL_TREE;
+	    }
+
+	  if (TREE_CODE (rhs) == ARRAY_REF
+	      && TREE_CODE (TREE_OPERAND (rhs, 0)) == MEM_REF
+	      && operand_equal_p (TREE_OPERAND (TREE_OPERAND (rhs, 0), 0),
+				  *da_op_p, 0))
+	    {
+	      bias = da_create_bias (orig_bias,
+				     TREE_TYPE (TREE_TYPE (orig_type)));
+	      *da_type_p = TREE_TYPE (TREE_TYPE (orig_type));
+	    }
+	  else if (TREE_CODE (rhs) == ARRAY_REF
+		   && TREE_CODE (TREE_OPERAND (rhs, 0)) == VAR_DECL
+		   && operand_equal_p (TREE_OPERAND (rhs, 0), *da_op_p, 0))
+	    {
+	      tree ptr_type = build_pointer_type (orig_type);
+	      op = create_tmp_var (ptr_type);
+	      gimplify_assign (op, build_fold_addr_expr (TREE_OPERAND (rhs, 0)),
+			       &ilist);
+	      bias = da_create_bias (orig_bias, TREE_TYPE (orig_type));
+	      *da_type_p = TREE_TYPE (orig_type);
+	      orig_type = ptr_type;
+	      by_ref = true;
+	    }
+	  else if (TREE_CODE (rhs) == MEM_REF
+		   && operand_equal_p (*da_op_p, TREE_OPERAND (rhs, 0), 0)
+		   && TREE_OPERAND (rhs, 1) != NULL_TREE)
+	    {
+	      bias = da_create_bias (orig_bias, TREE_TYPE (orig_type));
+	      *da_type_p = TREE_TYPE (orig_type);
+	    }
+	  else if (TREE_CODE (lhs) == MEM_REF
+		   && operand_equal_p (*da_op_p, TREE_OPERAND (lhs, 0), 0))
+	    {
+	      if (*da_dims_p != NULL_TREE)
+		{
+		  gcc_assert (TREE_CHAIN (*da_dims_p) == NULL_TREE);
+		  bias = da_create_bias (orig_bias, TREE_TYPE (orig_type));
+		  *da_type_p = TREE_TYPE (orig_type);
+		}
+	      else
+		/* This should be the end of the dynamic array access
+		   sequence.  */
+		return da_gsi;
+	    }
+	  else
+	    gcc_unreachable ();
+	}
+      else if (gimple_num_ops (stmt) == 3
+	       && gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
+	       && operand_equal_p (*da_op_p, rhs, 0))
+	{
+	  bias = da_create_bias (orig_bias, TREE_TYPE (orig_type));
+	}
+      else
+	gcc_unreachable ();
+
+      bias = fold_build1 (NEGATE_EXPR, sizetype, bias);
+      bias = fold_build2 (POINTER_PLUS_EXPR, orig_type, op, bias);
+
+      t = create_tmp_var (by_ref ? build_pointer_type (orig_type) : orig_type);
+
+      g = gimplify_assign (t, bias, &ilist);
+      gsi_insert_seq_before (&da_gsi, ilist, GSI_NEW_STMT);
+      *da_op_p = gimple_assign_lhs (g);
+
+      if (by_ref)
+	*da_op_p = build2 (MEM_REF, TREE_TYPE (orig_type), *da_op_p,
+			   build_int_cst (orig_type, 0));
+      *da_dims_p = TREE_CHAIN (*da_dims_p);
+    }
+
+  return da_gsi;
+}
 
 /* Callback for lower_omp_1.  Return non-NULL if *tp needs to be
    regimplified.  If DATA is non-NULL, lower_omp_1 is outside
@@ -9009,6 +9416,51 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	  }
       /* FALLTHRU */
     default:
+
+      /* If we detect the start of a dynamic array reference sequence, scan
+	 and do the needed adjustments.  */
+      tree da_dims, *da_op_p;
+      omp_context *da_ctx = ctx;
+      if (da_ctx && dynamic_array_reference_start (stmt, &da_ctx,
+						   &da_op_p, &da_dims))
+	{
+	  bool started = false;
+	  tree orig_da = *da_op_p;
+	  tree da_type = TREE_TYPE (orig_da);
+	  tree next_da_op;
+
+	  gimple_stmt_iterator da_gsi = *gsi_p, new_gsi;
+	  while (da_op_p)
+	    {
+	      if (!is_gimple_assign (gsi_stmt (da_gsi))
+		  || ((gimple_assign_single_p (gsi_stmt (da_gsi))
+		       || gimple_assign_cast_p (gsi_stmt (da_gsi)))
+		      && *da_op_p == gimple_assign_rhs1 (gsi_stmt (da_gsi))))
+		break;
+
+	      new_gsi = da_dimension_peel (da_ctx, da_gsi, orig_da,
+					   da_op_p, &da_type, &da_dims);
+	      if (!started)
+		{
+		  /* Point 'stmt' to the start of the newly added
+		     sequence.  */
+		  started = true;
+		  *gsi_p = new_gsi;
+		  stmt = gsi_stmt (*gsi_p);
+		}
+	      if (!da_dims)
+		break;
+
+	      next_da_op = gimple_assign_lhs (gsi_stmt (da_gsi));
+
+	      do {
+		gsi_next (&da_gsi);
+		da_op_p = scan_for_reference (gsi_stmt (da_gsi), next_da_op);
+	      }
+	      while (!da_op_p);
+	    }
+	}
+
       if ((ctx || task_shared_vars)
 	  && walk_gimple_op (stmt, lower_omp_regimplify_p,
 			     ctx ? NULL : &wi))


* [PATCH, OpenACC, 6/8] Multi-dimensional dynamic array support for OpenACC data clauses, tree pretty-printing additions
@ 2018-10-16 14:11             ` Chung-Lin Tang
  2018-10-16 14:20               ` [PATCH, OpenACC, 7/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp support Chung-Lin Tang
  0 siblings, 1 reply; 24+ messages in thread
From: Chung-Lin Tang @ 2018-10-16 14:11 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 226 bytes --]

This tree-pretty-print.c patch allows proper dumping of the dynamic array case
of OMP_CLAUSE_MAP.

Thanks,
Chung-Lin

	gcc/
	* tree-pretty-print.c (dump_omp_clause): Add cases for printing
	GOMP_MAP_DYNAMIC_ARRAY map kinds.

[-- Attachment #2: openacc-da-06.tree-pretty-print.patch --]
[-- Type: text/plain, Size: 1645 bytes --]

diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 1c7982c..803f76b 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -745,6 +745,33 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
 	case GOMP_MAP_LINK:
 	  pp_string (pp, "link");
 	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_TO:
+	  pp_string (pp, "to,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_FROM:
+	  pp_string (pp, "from,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_TOFROM:
+	  pp_string (pp, "tofrom,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO:
+	  pp_string (pp, "force_to,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM:
+	  pp_string (pp, "force_from,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM:
+	  pp_string (pp, "force_tofrom,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_ALLOC:
+	  pp_string (pp, "alloc,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC:
+	  pp_string (pp, "force_alloc,dynamic_array");
+	  break;
+	case GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT:
+	  pp_string (pp, "force_present,dynamic_array");
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
@@ -766,6 +793,10 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
 	    case GOMP_MAP_TO_PSET:
 	      pp_string (pp, " [pointer set, len: ");
 	      break;
+	    case GOMP_MAP_DYNAMIC_ARRAY:
+	      gcc_assert (TREE_CODE (OMP_CLAUSE_SIZE (clause)) == TREE_LIST);
+	      pp_string (pp, " [dimensions: ");
+	      break;
 	    default:
 	      pp_string (pp, " [len: ");
 	      break;

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH, OpenACC, 7/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp support
@ 2018-10-16 14:20               ` Chung-Lin Tang
  2018-10-16 14:28                 ` [PATCH, OpenACC, 8/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp testsuite additions Chung-Lin Tang
  2018-10-16 14:49                 ` [PATCH, OpenACC, 7/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp support Jakub Jelinek
  0 siblings, 2 replies; 24+ messages in thread
From: Chung-Lin Tang @ 2018-10-16 14:20 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 776 bytes --]

This part is the libgomp runtime handling for OpenACC dynamic arrays.

We handle such arrays by creating a "pointer block" that emulates the N-1 dimensions,
and then treating each data row of the final Nth dimension as an individual object
mapped in the TGT. All the rows are processed as appended after all the other map
kind objects.
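
As a rough illustration of the sizes involved, here is a minimal sketch that
computes the pointer-block size and the number of data rows from per-dimension
byte lengths. The names (dim_sketch, layout_sketch, compute_layout_sketch) are
hypothetical and not part of the patch, and the sketch assumes every dimension
except the last is an array of pointers (the is_array case is left out):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one dimension of the descriptor.  */
struct dim_sketch {
  size_t length;     /* Bytes covered by this dimension's section.  */
  size_t elem_size;  /* Bytes per element in this dimension.  */
};

struct layout_sketch {
  size_t ptrblock_size;  /* Bytes of pointers emulating the first N-1 dims.  */
  size_t row_count;      /* Number of separately mapped data rows.  */
  size_t row_size;       /* Bytes in each data row.  */
};

static struct layout_sketch
compute_layout_sketch (const struct dim_sketch *dims, size_t ndims)
{
  struct layout_sketch out = { 0, 1, 0 };

  assert (ndims >= 1);
  for (size_t d = 0; d + 1 < ndims; d++)
    {
      /* Each pointer table enumerated so far needs one table of this
	 dimension's length beneath it.  */
      out.ptrblock_size += dims[d].length * out.row_count;
      out.row_count *= dims[d].length / dims[d].elem_size;
    }
  /* The last dimension holds the actual data; each row is mapped alone.  */
  out.row_size = dims[ndims - 1].length;
  return out;
}
```

Under these assumptions, a section like a[0:10][0:10][0:10] on an int ***
yields a pointer block holding 110 pointers (10 for the first level, 100 for
the second) and 100 data rows of 10 ints each.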

Thanks,
Chung-Lin

	libgomp/
	* target.c (struct da_dim): New struct declaration.
	(struct da_descr_type): Likewise.
	(struct da_info): Likewise.
	(gomp_dynamic_array_count_rows): New function.
	(gomp_dynamic_array_compute_info): Likewise.
	(gomp_dynamic_array_fill_rows_1): Likewise.
	(gomp_dynamic_array_fill_rows): Likewise.
	(gomp_dynamic_array_create_ptrblock): Likewise.
	(gomp_map_vars): Add code to handle dynamic array map kinds.

[-- Attachment #2: openacc-da-07.libgomp-target.patch --]
[-- Type: text/plain, Size: 12362 bytes --]

diff --git a/libgomp/target.c b/libgomp/target.c
index 4c9fae0..071dc70 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -490,6 +490,140 @@ gomp_map_val (struct target_mem_desc *tgt, void **hostaddrs, size_t i)
   return tgt->tgt_start + tgt->list[i].offset;
 }
 
+/* Dynamic array related data structures, interfaces with the compiler.  */
+
+struct da_dim {
+  size_t base;
+  size_t length;
+  size_t elem_size;
+  size_t is_array;
+};
+
+struct da_descr_type {
+  void *ptr;
+  size_t ndims;
+  struct da_dim dims[];
+};
+
+/* Internal dynamic array info struct, used only here inside the runtime. */
+
+struct da_info
+{
+  struct da_descr_type *descr;
+  size_t map_index;
+  size_t ptrblock_size;
+  size_t data_row_num;
+  size_t data_row_size;
+};
+
+static size_t
+gomp_dynamic_array_count_rows (struct da_descr_type *descr)
+{
+  size_t nrows = 1;
+  for (size_t d = 0; d < descr->ndims - 1; d++)
+    nrows *= descr->dims[d].length / sizeof (void *);
+  return nrows;
+}
+
+static void
+gomp_dynamic_array_compute_info (struct da_info *da)
+{
+  size_t d, n = 1;
+  struct da_descr_type *descr = da->descr;
+
+  da->ptrblock_size = 0;
+  for (d = 0; d < descr->ndims - 1; d++)
+    {
+      size_t dim_count = descr->dims[d].length / descr->dims[d].elem_size;
+      size_t dim_ptrblock_size = (descr->dims[d + 1].is_array
+				  ? 0 : descr->dims[d].length * n);
+      da->ptrblock_size += dim_ptrblock_size;
+      n *= dim_count;
+    }
+  da->data_row_num = n;
+  da->data_row_size = descr->dims[d].length;
+}
+
+static void
+gomp_dynamic_array_fill_rows_1 (struct da_descr_type *descr, void *da,
+				size_t d, void ***row_ptr, size_t *count)
+{
+  if (d < descr->ndims - 1)
+    {
+      size_t elsize = descr->dims[d].elem_size;
+      size_t n = descr->dims[d].length / elsize;
+      void *p = da + descr->dims[d].base;
+      for (size_t i = 0; i < n; i++)
+	{
+	  void *ptr = p + i * elsize;
+	  /* Deref if next dimension is not array.  */
+	  if (!descr->dims[d + 1].is_array)
+	    ptr = *((void **) ptr);
+	  gomp_dynamic_array_fill_rows_1 (descr, ptr, d + 1, row_ptr, count);
+	}
+    }
+  else
+    {
+      **row_ptr = da + descr->dims[d].base;
+      *row_ptr += 1;
+      *count += 1;
+    }
+}
+
+static size_t
+gomp_dynamic_array_fill_rows (struct da_descr_type *descr, void *rows[])
+{
+  size_t count = 0;
+  void **p = rows;
+  gomp_dynamic_array_fill_rows_1 (descr, descr->ptr, 0, &p, &count);
+  return count;
+}
+
+static void *
+gomp_dynamic_array_create_ptrblock (struct da_info *da,
+				    void *tgt_addr, void *tgt_data_rows[])
+{
+  struct da_descr_type *descr = da->descr;
+  void *ptrblock = gomp_malloc (da->ptrblock_size);
+  void **curr_dim_ptrblock = (void **) ptrblock;
+  size_t n = 1;
+
+  for (size_t d = 0; d < descr->ndims - 1; d++)
+    {
+      int curr_dim_len = descr->dims[d].length;
+      int next_dim_len = descr->dims[d + 1].length;
+      int curr_dim_num = curr_dim_len / sizeof (void *);
+
+      void *next_dim_ptrblock
+	= (void *)(curr_dim_ptrblock + n * curr_dim_num);
+
+      for (int b = 0; b < n; b++)
+        for (int i = 0; i < curr_dim_num; i++)
+	  {
+	    if (d < descr->ndims - 2)
+	      {
+		void *ptr = (next_dim_ptrblock
+			     + b * curr_dim_num * next_dim_len
+			     + i * next_dim_len);
+		void *tgt_ptr = tgt_addr + (ptr - ptrblock);
+		curr_dim_ptrblock[b * curr_dim_num + i] = tgt_ptr;
+	      }
+	    else
+	      {
+		curr_dim_ptrblock[b * curr_dim_num + i]
+		  = tgt_data_rows[b * curr_dim_num + i];
+	      }
+	    void *addr = &curr_dim_ptrblock[b * curr_dim_num + i];
+	    assert (ptrblock <= addr && addr < ptrblock + da->ptrblock_size);
+	  }
+
+      n *= curr_dim_num;
+      curr_dim_ptrblock = next_dim_ptrblock;
+    }
+  assert (n == da->data_row_num);
+  return ptrblock;
+}
+
 attribute_hidden struct target_mem_desc *
 gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 	       void **hostaddrs, void **devaddrs, size_t *sizes, void *kinds,
@@ -501,9 +635,29 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
   const int typemask = short_mapkind ? 0xff : 0x7;
   struct splay_tree_s *mem_map = &devicep->mem_map;
   struct splay_tree_key_s cur_node;
-  struct target_mem_desc *tgt
-    = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
-  tgt->list_count = mapnum;
+  struct target_mem_desc *tgt;
+
+  size_t da_data_row_num = 0, row_start = 0;
+  size_t da_info_num = 0, da_index;
+  struct da_info *da_info = NULL;
+  struct target_var_desc *row_desc;
+  uintptr_t target_row_addr;
+  void **host_data_rows = NULL, **target_data_rows = NULL;
+  void *row;
+
+  for (i = 0; i < mapnum; i++)
+    {
+      int kind = get_kind (short_mapkind, kinds, i);
+      if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	{
+	  da_data_row_num += gomp_dynamic_array_count_rows (hostaddrs[i]);
+	  da_info_num += 1;
+	}
+    }
+
+  tgt = gomp_malloc (sizeof (*tgt)
+		     + sizeof (tgt->list[0]) * (mapnum + da_data_row_num));
+  tgt->list_count = mapnum + da_data_row_num;
   tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
   tgt->device_descr = devicep;
   struct gomp_coalesce_buf cbuf, *cbufp = NULL;
@@ -515,6 +669,14 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
       return tgt;
     }
 
+  if (da_info_num)
+    da_info = gomp_alloca (sizeof (struct da_info) * da_info_num);
+  if (da_data_row_num)
+    {
+      host_data_rows = gomp_malloc (sizeof (void *) * da_data_row_num);
+      target_data_rows = gomp_malloc (sizeof (void *) * da_data_row_num);
+    }
+
   tgt_align = sizeof (void *);
   tgt_size = 0;
   cbuf.chunks = NULL;
@@ -546,7 +708,7 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
       return NULL;
     }
 
-  for (i = 0; i < mapnum; i++)
+  for (i = 0, da_index = 0; i < mapnum; i++)
     {
       int kind = get_kind (short_mapkind, kinds, i);
       if (hostaddrs[i] == NULL
@@ -619,6 +781,20 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 	  has_firstprivate = true;
 	  continue;
 	}
+      else if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	{
+	  /* Ignore dynamic arrays for now, we process them together
+	     later.  */
+	  tgt->list[i].key = NULL;
+	  tgt->list[i].offset = 0;
+	  not_found_cnt++;
+
+	  struct da_info *da = &da_info[da_index++];
+	  da->descr = (struct da_descr_type *) hostaddrs[i];
+	  da->map_index = i;
+	  continue;
+	}
+
       cur_node.host_start = (uintptr_t) hostaddrs[i];
       if (!GOMP_MAP_POINTER_P (kind & typemask))
 	cur_node.host_end = cur_node.host_start + sizes[i];
@@ -687,6 +863,55 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 	}
     }
 
+  /* For dynamic arrays. Each data row is one target item, separated from
+     the normal map clause items, hence we order them after mapnum.  */
+  for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
+    {
+      int kind = get_kind (short_mapkind, kinds, i);
+      if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	continue;
+
+      struct da_info *da = &da_info[da_index++];
+      struct da_descr_type *descr = da->descr;
+      size_t nr;
+
+      gomp_dynamic_array_compute_info (da);
+
+      /* We have allocated space in host/target_data_rows to place all the
+	 row data block pointers, now we can start filling them in.  */
+      nr = gomp_dynamic_array_fill_rows (descr, &host_data_rows[row_start]);
+      assert (nr == da->data_row_num);
+
+      size_t align = (size_t) 1 << (kind >> rshift);
+      if (tgt_align < align)
+	tgt_align = align;
+      tgt_size = (tgt_size + align - 1) & ~(align - 1);
+      tgt_size += da->ptrblock_size;
+
+      for (size_t j = 0; j < da->data_row_num; j++)
+	{
+	  row = host_data_rows[row_start + j];
+	  row_desc = &tgt->list[mapnum + row_start + j];
+
+	  cur_node.host_start = (uintptr_t) row;
+	  cur_node.host_end = cur_node.host_start + da->data_row_size;
+	  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+	  if (n)
+	    {
+	      assert (n->refcount != REFCOUNT_LINK);
+	      gomp_map_vars_existing (devicep, n, &cur_node, row_desc,
+				      kind & typemask, /* TODO: cbuf? */ NULL);
+	    }
+	  else
+	    {
+	      tgt_size = (tgt_size + align - 1) & ~(align - 1);
+	      tgt_size += da->data_row_size;
+	      not_found_cnt++;
+	    }
+	}
+      row_start += da->data_row_num;
+    }
+
   if (devaddrs)
     {
       if (mapnum != 1)
@@ -830,6 +1055,15 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 	      default:
 		break;
 	      }
+
+	    if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	      {
+		tgt->list[i].key = &array->key;
+		tgt->list[i].key->tgt = tgt;
+		array++;
+		continue;
+	      }
+
 	    splay_tree_key k = &array->key;
 	    k->host_start = (uintptr_t) hostaddrs[i];
 	    if (!GOMP_MAP_POINTER_P (kind & typemask))
@@ -976,6 +1210,108 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 		array++;
 	      }
 	  }
+
+      /* Processing of dynamic array rows.  */
+      for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
+	{
+	  int kind = get_kind (short_mapkind, kinds, i);
+	  if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	    continue;
+
+	  struct da_info *da = &da_info[da_index++];
+	  assert (da->descr == hostaddrs[i]);
+
+	  /* The map for the dynamic array itself is never copied from during
+	     unmapping; it's the data rows that count.  The copy-from flags are
+	     therefore set to false here.  */
+	  tgt->list[i].copy_from = false;
+	  tgt->list[i].always_copy_from = false;
+
+	  size_t align = (size_t) 1 << (kind >> rshift);
+	  tgt_size = (tgt_size + align - 1) & ~(align - 1);
+
+	  /* For the map of the dynamic array itself, adjust so that the passed
+	     device address points to the beginning of the ptrblock.  */
+	  tgt->list[i].key->tgt_offset = tgt_size;
+
+	  void *target_ptrblock = (void*) tgt->tgt_start + tgt_size;
+	  tgt_size += da->ptrblock_size;
+
+	  /* Add splay key for each data row in current DA.  */
+	  for (size_t j = 0; j < da->data_row_num; j++)
+	    {
+	      row = host_data_rows[row_start + j];
+	      row_desc = &tgt->list[mapnum + row_start + j];
+
+	      cur_node.host_start = (uintptr_t) row;
+	      cur_node.host_end = cur_node.host_start + da->data_row_size;
+	      splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+	      if (n)
+		{
+		  assert (n->refcount != REFCOUNT_LINK);
+		  gomp_map_vars_existing (devicep, n, &cur_node, row_desc,
+					  kind & typemask, cbufp);
+		  target_row_addr = n->tgt->tgt_start + n->tgt_offset;
+		}
+	      else
+		{
+		  tgt->refcount++;
+
+		  splay_tree_key k = &array->key;
+		  k->host_start = (uintptr_t) row;
+		  k->host_end = k->host_start + da->data_row_size;
+
+		  k->tgt = tgt;
+		  k->refcount = 1;
+		  k->link_key = NULL;
+		  tgt_size = (tgt_size + align - 1) & ~(align - 1);
+		  target_row_addr = tgt->tgt_start + tgt_size;
+		  k->tgt_offset = tgt_size;
+		  tgt_size += da->data_row_size;
+
+		  row_desc->key = k;
+		  row_desc->copy_from
+		    = GOMP_MAP_COPY_FROM_P (kind & typemask);
+		  row_desc->always_copy_from
+		    = GOMP_MAP_COPY_FROM_P (kind & typemask);
+		  row_desc->offset = 0;
+		  row_desc->length = da->data_row_size;
+
+		  array->left = NULL;
+		  array->right = NULL;
+		  splay_tree_insert (mem_map, array);
+
+		  if (GOMP_MAP_COPY_TO_P (kind & typemask))
+		    gomp_copy_host2dev (devicep,
+					(void *) tgt->tgt_start + k->tgt_offset,
+					(void *) k->host_start,
+					da->data_row_size, cbufp);
+		  array++;
+		}
+	      target_data_rows[row_start + j] = (void *) target_row_addr;
+	    }
+
+	  /* Now we have the target memory allocated, and target offsets of all
+	     row blocks assigned and calculated, we can construct the
+	     accelerator side ptrblock and copy it in.  */
+	  if (da->ptrblock_size)
+	    {
+	      void *ptrblock = gomp_dynamic_array_create_ptrblock
+		(da, target_ptrblock, target_data_rows + row_start);
+	      gomp_copy_host2dev (devicep, target_ptrblock, ptrblock,
+				  da->ptrblock_size, cbufp);
+	      free (ptrblock);
+	    }
+
+	  row_start += da->data_row_num;
+	}
+      assert (row_start == da_data_row_num && da_index == da_info_num);
+    }
+
+  if (da_data_row_num)
+    {
+      free (host_data_rows);
+      free (target_data_rows);
     }
 
   if (pragma_kind == GOMP_MAP_VARS_TARGET)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH, OpenACC, 8/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp testsuite additions
@ 2018-10-16 14:28                 ` Chung-Lin Tang
  2019-08-20 11:54                   ` [PATCH, OpenACC, 1/3] Non-contiguous array support for OpenACC data clauses (re-submission), front-end patches Chung-Lin Tang
  0 siblings, 1 reply; 24+ messages in thread
From: Chung-Lin Tang @ 2018-10-16 14:28 UTC (permalink / raw)
  To: gcc-patches, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 404 bytes --]

These are the added cases for testing the OpenACC dynamic (sub)arrays functionality.

Thanks,
Chung-Lin

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/da-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/da-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/da-3.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/da-4.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/da-utils.h: New test.

[-- Attachment #2: openacc-da-08.libgomp-testsuite.patch --]
[-- Type: text/plain, Size: 7333 bytes --]

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/da-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/da-1.c
new file mode 100644
index 0000000..c1c205d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/da-1.c
@@ -0,0 +1,103 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <stdlib.h>
+#include <assert.h>
+
+#define n 100
+#define m 100
+
+int b[n][m];
+
+void
+test1 (void)
+{
+  int i, j, *a[100];
+
+  /* Array of pointers form test.  */
+  for (i = 0; i < n; i++)
+    {
+      a[i] = (int *)malloc (sizeof (int) * m);
+      for (j = 0; j < m; j++)
+	b[i][j] = j - i;
+    }
+
+  #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b)
+  for (i = 0; i < n; i++)
+    #pragma acc loop
+    for (j = 0; j < m; j++)
+      a[i][j] = b[i][j];
+
+  for (i = 0; i < n; i++)
+    {
+      for (j = 0; j < m; j++)
+	assert (a[i][j] == b[i][j]);
+      /* Clean up.  */
+      free (a[i]);
+    }
+}
+
+void
+test2 (void)
+{
+  int i, j, **a = (int **) malloc (sizeof (int *) * n);
+
+  /* Separately allocated blocks.  */
+  for (i = 0; i < n; i++)
+    {
+      a[i] = (int *)malloc (sizeof (int) * m);
+      for (j = 0; j < m; j++)
+	b[i][j] = j - i;
+    }
+
+  #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b)
+  for (i = 0; i < n; i++)
+    #pragma acc loop
+    for (j = 0; j < m; j++)
+      a[i][j] = b[i][j];
+
+  for (i = 0; i < n; i++)
+    {
+      for (j = 0; j < m; j++)
+	assert (a[i][j] == b[i][j]);
+      /* Clean up.  */
+      free (a[i]);
+    }
+  free (a);
+}
+
+void
+test3 (void)
+{
+  int i, j, **a = (int **) malloc (sizeof (int *) * n);
+  a[0] = (int *) malloc (sizeof (int) * n * m);
+
+  /* Rows allocated in one contiguous block.  */
+  for (i = 0; i < n; i++)
+    {
+      a[i] = *a + i * m;
+      for (j = 0; j < m; j++)
+	b[i][j] = j - i;
+    }
+
+  #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b)
+  for (i = 0; i < n; i++)
+    #pragma acc loop
+    for (j = 0; j < m; j++)
+      a[i][j] = b[i][j];
+
+  for (i = 0; i < n; i++)
+    for (j = 0; j < m; j++)
+      assert (a[i][j] == b[i][j]);
+
+  free (a[0]);
+  free (a);
+}
+
+int
+main (void)
+{
+  test1 ();
+  test2 ();
+  test3 ();
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/da-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/da-2.c
new file mode 100644
index 0000000..6ee7855
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/da-2.c
@@ -0,0 +1,37 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <assert.h>
+#include "da-utils.h"
+
+int
+main (void)
+{
+  int n = 10;
+  int ***a = (int ***) create_da (sizeof (int), n, 3);
+  int ***b = (int ***) create_da (sizeof (int), n, 3);
+  int ***c = (int ***) create_da (sizeof (int), n, 3);
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	{
+	  a[i][j][k] = i + j * k + k;
+	  b[i][j][k] = j + k * i + i * j;
+	  c[i][j][k] = a[i][j][k];
+	}
+
+  #pragma acc parallel copy (a[0:n][0:n][0:n]) copyin (b[0:n][0:n][0:n])
+  {
+    for (int i = 0; i < n; i++)
+      for (int j = 0; j < n; j++)
+	for (int k = 0; k < n; k++)
+	  a[i][j][k] += b[k][j][i] + i + j + k;
+  }
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	assert (a[i][j][k] == c[i][j][k] + b[k][j][i] + i + j + k);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/da-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/da-3.c
new file mode 100644
index 0000000..877c6df
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/da-3.c
@@ -0,0 +1,45 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <assert.h>
+#include "da-utils.h"
+
+int main (void)
+{
+  int n = 20, x = 5, y = 12;
+  int *****a = (int *****) create_da (sizeof (int), n, 5);
+
+  int sum1 = 0, sum2 = 0, sum3 = 0;
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	for (int l = 0; l < n; l++)
+	  for (int m = 0; m < n; m++)
+	    {
+	      a[i][j][k][l][m] = 1;
+	      sum1++;
+	    }
+
+  #pragma acc parallel copy (a[x:y][x:y][x:y][x:y][x:y]) copy(sum2)
+  {
+    for (int i = x; i < x + y; i++)
+      for (int j = x; j < x + y; j++)
+	for (int k = x; k < x + y; k++)
+	  for (int l = x; l < x + y; l++)
+	    for (int m = x; m < x + y; m++)
+	      {
+		a[i][j][k][l][m] = 0;
+		sum2++;
+	      }
+  }
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	for (int l = 0; l < n; l++)
+	  for (int m = 0; m < n; m++)
+	    sum3 += a[i][j][k][l][m];
+
+  assert (sum1 == sum2 + sum3);
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/da-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/da-4.c
new file mode 100644
index 0000000..2059c5f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/da-4.c
@@ -0,0 +1,36 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <assert.h>
+#include "da-utils.h"
+
+int main (void)
+{
+  int n = 128;
+  double ***a = (double ***) create_da (sizeof (double), n, 3);
+  double ***b = (double ***) create_da (sizeof (double), n, 3);
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	a[i][j][k] = i + j + k + i * j * k;
+
+  /* This test exercises async copyout of dynamic array rows.  */
+  #pragma acc parallel copyin(a[0:n][0:n][0:n]) copyout(b[0:n][0:n][0:n]) async(5)
+  {
+    #pragma acc loop gang
+    for (int i = 0; i < n; i++)
+      #pragma acc loop vector
+      for (int j = 0; j < n; j++)
+	for (int k = 0; k < n; k++)
+	  b[i][j][k] = a[i][j][k] * 2.0;
+  }
+
+  #pragma acc wait (5)
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	assert (b[i][j][k] == a[i][j][k] * 2.0);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/da-utils.h b/libgomp/testsuite/libgomp.oacc-c-c++-common/da-utils.h
new file mode 100644
index 0000000..2f87795
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/da-utils.h
@@ -0,0 +1,44 @@
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+#include <stdint.h>
+
+/* Allocate and create a pointer based NDIMS-dimensional array,
+   each dimension DIMLEN long, with ELSIZE sized data elements.  */
+void *
+create_da (size_t elsize, int dimlen, int ndims)
+{
+  size_t blk_size = 0;
+  size_t n = 1;
+
+  for (int i = 0; i < ndims - 1; i++)
+    {
+      n *= dimlen;
+      blk_size += sizeof (void *) * n;
+    }
+  size_t data_rows_num = n;
+  size_t data_rows_offset = blk_size;
+  blk_size += elsize * n * dimlen;
+
+  void *blk = (void *) malloc (blk_size);
+  memset (blk, 0, blk_size);
+  void **curr_dim = (void **) blk;
+  n = 1;
+
+  for (int d = 0; d < ndims - 1; d++)
+    {
+      uintptr_t next_dim = (uintptr_t) (curr_dim + n * dimlen);
+      size_t next_dimlen = dimlen * (d < ndims - 2 ? sizeof (void *) : elsize);
+
+      for (int b = 0; b < n; b++)
+        for (int i = 0; i < dimlen; i++)
+	  if (d < ndims - 1)
+	    curr_dim[b * dimlen + i]
+	      = (void*) (next_dim + b * dimlen * next_dimlen + i * next_dimlen);
+
+      n *= dimlen;
+      curr_dim = (void**) next_dim;
+    }
+  assert (n == data_rows_num);
+  return blk;
+}
-- 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH, OpenACC, 7/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp support
  2018-10-16 14:20               ` [PATCH, OpenACC, 7/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp support Chung-Lin Tang
  2018-10-16 14:28                 ` [PATCH, OpenACC, 8/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp testsuite additions Chung-Lin Tang
@ 2018-10-16 14:49                 ` Jakub Jelinek
  2018-12-06 14:20                   ` Chung-Lin Tang
  1 sibling, 1 reply; 24+ messages in thread
From: Jakub Jelinek @ 2018-10-16 14:49 UTC (permalink / raw)
  To: cltang; +Cc: gcc-patches, Thomas Schwinge

On Tue, Oct 16, 2018 at 08:57:00PM +0800, Chung-Lin Tang wrote:
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -490,6 +490,140 @@ gomp_map_val (struct target_mem_desc *tgt, void **hostaddrs, size_t i)
>    return tgt->tgt_start + tgt->list[i].offset;
>  }
>  
> +/* Dynamic array related data structures, interfaces with the compiler.  */
> +
> +struct da_dim {
> +  size_t base;
> +  size_t length;
> +  size_t elem_size;
> +  size_t is_array;
> +};
> +
> +struct da_descr_type {
> +  void *ptr;
> +  size_t ndims;
> +  struct da_dim dims[];
> +};

Why do you call the non-contiguous arrays dynamic arrays?  Is that some OpenACC term?
I'd also prefix those with gomp_ and it is important to make it clear what
is the ABI type shared with the compiler and what are the internal types.
struct gomp_array_descr would look more natural to me.

> +  for (i = 0; i < mapnum; i++)
> +    {
> +      int kind = get_kind (short_mapkind, kinds, i);
> +      if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
> +	{
> +	  da_data_row_num += gomp_dynamic_array_count_rows (hostaddrs[i]);
> +	  da_info_num += 1;
> +	}
> +    }

I'm not really happy about adding several extra loops which will not do
anything in the case there are no non-contiguous arrays being mapped (for
now always for OpenMP (OpenMP 5 has support for non-contiguous target update
to/from though) and I guess rarely for OpenACC).
Can't you use some flag bit in flags passed to GOMP_target* etc. and do the
above loop only if the compiler indicated there are any?

> +  tgt = gomp_malloc (sizeof (*tgt)
> +		     + sizeof (tgt->list[0]) * (mapnum + da_data_row_num));
> +  tgt->list_count = mapnum + da_data_row_num;
>    tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
>    tgt->device_descr = devicep;
>    struct gomp_coalesce_buf cbuf, *cbufp = NULL;

> @@ -687,6 +863,55 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
>  	}
>      }
>  
> +  /* For dynamic arrays. Each data row is one target item, separated from
> +     the normal map clause items, hence we order them after mapnum.  */
> +  for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)

Even if nothing is in flags, you could just avoid this loop if the previous
loop(s) haven't found any noncontiguous arrays.

> @@ -976,6 +1210,108 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
>  		array++;
>  	      }
>  	  }
> +
> +      /* Processing of dynamic array rows.  */
> +      for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
> +	{
> +	  int kind = get_kind (short_mapkind, kinds, i);
> +	  if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
> +	    continue;

Again.

	Jakub

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH, OpenACC, 7/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp support
  2018-10-16 14:49                 ` [PATCH, OpenACC, 7/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp support Jakub Jelinek
@ 2018-12-06 14:20                   ` Chung-Lin Tang
  2018-12-06 14:43                     ` Jakub Jelinek
  0 siblings, 1 reply; 24+ messages in thread
From: Chung-Lin Tang @ 2018-12-06 14:20 UTC (permalink / raw)
  To: Jakub Jelinek, cltang; +Cc: gcc-patches, Thomas Schwinge

Hi Jakub, thanks for the swift review a few weeks ago, and apologies I haven't been able
to respond sooner.

On 2018/10/16 9:13 PM, Jakub Jelinek wrote:
>> +/* Dynamic array related data structures, interfaces with the compiler.  */
>> +
>> +struct da_dim {
>> +  size_t base;
>> +  size_t length;
>> +  size_t elem_size;
>> +  size_t is_array;
>> +};
>> +
>> +struct da_descr_type {
>> +  void *ptr;
>> +  size_t ndims;
>> +  struct da_dim dims[];
>> +};
> 
> Why do you call the non-contiguous arrays dynamic arrays?  Is that some OpenACC term?
> I'd also prefix those with gomp_ and it is important to make it clear what
> is the ABI type shared with the compiler and what are the internal types.
> struct gomp_array_descr would look more natural to me.

Well it's not particularly an OpenACC term, just that non-contiguous arrays are
often multi-dimensional arrays dynamically allocated and created through (arrays of) pointers.
Are you strongly opposed to this naming? If so, I can adjust this part.

I think the suggested 'gomp_array_descr' identifier looks descriptive, I'll revise that in an update,
as well as add more comments to better describe its ABI significance with the compiler.

>> +  for (i = 0; i < mapnum; i++)
>> +    {
>> +      int kind = get_kind (short_mapkind, kinds, i);
>> +      if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
>> +	{
>> +	  da_data_row_num += gomp_dynamic_array_count_rows (hostaddrs[i]);
>> +	  da_info_num += 1;
>> +	}
>> +    }
> 
> I'm not really happy about adding several extra loops which will not do
> anything in the case there are no non-contiguous arrays being mapped (for
> now always for OpenMP (OpenMP 5 has support for non-contiguous target update
> to/from though) and I guess rarely for OpenACC).
> Can't you use some flag bit in flags passed to GOMP_target* etc. and do the
> above loop only if the compiler indicated there are any?

I originally strived to not have that loop, but because each row in the last dimension
is mapped as its own target_var_desc, we need to count them at this stage to allocate
the right number at start. Otherwise a realloc later seems even more ugly...
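
To make the sizing argument concrete, here is a minimal sketch of counting the
rows up front so that a single allocation covers both the ordinary map entries
and the appended row entries. All names (entry_sketch, list_sketch,
alloc_list_sketch, rows_per_map) are hypothetical stand-ins, not libgomp
interfaces:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one per-variable descriptor.  */
struct entry_sketch {
  void *host;
  size_t len;
};

/* Hypothetical stand-in for a descriptor list with a flexible array.  */
struct list_sketch {
  size_t count;
  struct entry_sketch list[];
};

/* rows_per_map[i] is 0 for an ordinary map entry, or the number of data
   rows for a non-contiguous array entry.  */
static struct list_sketch *
alloc_list_sketch (const size_t *rows_per_map, size_t mapnum)
{
  size_t extra = 0;

  /* The counting pass: without it the total is unknown at allocation.  */
  for (size_t i = 0; i < mapnum; i++)
    extra += rows_per_map[i];

  struct list_sketch *l
    = malloc (sizeof (*l) + sizeof (l->list[0]) * (mapnum + extra));
  assert (l != NULL);
  l->count = mapnum + extra;
  return l;
}
```

Entries 0 to mapnum - 1 would hold the ordinary maps and the rows would follow
from index mapnum onward, which is the ordering the patch uses.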

We currently don't have a suitable flag word argument in GOMP_target*, GOACC_parallel*, etc.
I am not sure if such a feature warrants changing the interface.

If you are wary of OpenMP being affected, I can add a condition to restrict such processing
to only (pragma_kind == GOMP_MAP_VARS_OPENACC). Is that okay? (at least before making any
larger runtime interface adjustments)

>> +  tgt = gomp_malloc (sizeof (*tgt)
>> +		     + sizeof (tgt->list[0]) * (mapnum + da_data_row_num));
>> +  tgt->list_count = mapnum + da_data_row_num;
>>     tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
>>     tgt->device_descr = devicep;
>>     struct gomp_coalesce_buf cbuf, *cbufp = NULL;
> 
>> @@ -687,6 +863,55 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
>>   	}
>>       }
>>   
>> +  /* For dynamic arrays. Each data row is one target item, separated from
>> +     the normal map clause items, hence we order them after mapnum.  */
>> +  for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
> 
> Even if nothing is in flags, you could just avoid this loop if the previous
> loop(s) haven't found any noncontiguous arrays.

I'll add a bit more checking to avoid these cases.

Thanks,
Chung-Lin

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH, OpenACC, 7/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp support
  2018-12-06 14:20                   ` Chung-Lin Tang
@ 2018-12-06 14:43                     ` Jakub Jelinek
  2018-12-13 14:52                       ` Chung-Lin Tang
  0 siblings, 1 reply; 24+ messages in thread
From: Jakub Jelinek @ 2018-12-06 14:43 UTC (permalink / raw)
  To: cltang; +Cc: gcc-patches, Thomas Schwinge

On Thu, Dec 06, 2018 at 10:19:43PM +0800, Chung-Lin Tang wrote:
> > Why do you call the non-contiguous arrays dynamic arrays?  Is that some OpenACC term?
> > I'd also prefix those with gomp_ and it is important to make it clear what
> > is the ABI type shared with the compiler and what are the internal types.
> > struct gomp_array_descr would look more natural to me.
> 
> Well it's not particularly an OpenACC term, just that non-contiguous arrays are
> often multi-dimensional arrays dynamically allocated and created through (arrays of) pointers.
> Are you strongly opposed to this naming? If so, I can adjust this part.

The way those arrays are created (and they don't have to be dynamically
allocated) doesn't affect their representation.
There are various terms that describe various data structures, like Iliffe
vectors, jagged/ragged arrays, dope vectors.
I guess it depends on what kind of data structures this new framework
supports: Iliffe vectors (arrays of pointers), or just flat but
strided arrays, etc.
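
To make the distinction concrete, here is a minimal sketch of the two representations. The names make_iliffe, free_iliffe and flat_get, and the ROWS/COLS sizes, are hypothetical and chosen only for illustration; nothing here is taken from the patch.

```c
#include <assert.h>
#include <stdlib.h>

enum { ROWS = 3, COLS = 4 };

/* Iliffe vector: an array of row pointers.  Every row is its own
   allocation, so the rows need not be adjacent in memory.
   Returns NULL if any allocation fails.  */
static int **
make_iliffe (void)
{
  int **a = malloc (ROWS * sizeof *a);
  if (a == NULL)
    return NULL;
  for (int i = 0; i < ROWS; i++)
    {
      a[i] = malloc (COLS * sizeof **a);
      if (a[i] == NULL)
        {
          /* Release the rows allocated so far.  */
          while (i-- > 0)
            free (a[i]);
          free (a);
          return NULL;
        }
      for (int j = 0; j < COLS; j++)
        a[i][j] = i * COLS + j;
    }
  return a;
}

static void
free_iliffe (int **a)
{
  for (int i = 0; i < ROWS; i++)
    free (a[i]);
  free (a);
}

/* Flat but strided: one contiguous block, with element (i, j) found
   at base[i * stride + j].  No pointer is dereferenced per row.  */
static int
flat_get (const int *base, size_t stride, size_t i, size_t j)
{
  assert (base != NULL);
  return base[i * stride + j];
}
```

Both forms are indexed with two subscripts, but only the first one needs a separate pointer block to be rebuilt when the data is copied to another address space.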

> > > +  for (i = 0; i < mapnum; i++)
> > > +    {
> > > +      int kind = get_kind (short_mapkind, kinds, i);
> > > +      if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
> > > +	{
> > > +	  da_data_row_num += gomp_dynamic_array_count_rows (hostaddrs[i]);
> > > +	  da_info_num += 1;
> > > +	}
> > > +    }
> > 
> > I'm not really happy about adding several extra loops which will not do
> > anything in the case there are no non-contiguous arrays being mapped (for
> > now always for OpenMP (OpenMP 5 has support for non-contiguous target update
> > to/from though) and guess rarely for OpenACC).
> > Can't you use some flag bit in flags passed to GOMP_target* etc. and do the
> > above loop only if the compiler indicated there are any?
> 
> I originally strived to not have that loop, but because each row in the last dimension
> is mapped as its own target_var_desc, we need to count them at this stage to allocate
> the right number at start. Otherwise a realloc later seems even more ugly...
> 
> We currently don't have a suitable flag word argument in GOMP_target*, GOACC_parallel*, etc.
> I am not sure if such a feature warrants changing the interface.
> 
> If you are wary of OpenMP being affected, I can add a condition to restrict such processing
> to only (pragma_kind == GOMP_MAP_VARS_OPENACC). Is that okay? (at least before making any
> larger runtime interface adjustments)

That will still cost you doing that loop for OpenACC constructs that don't
have any of these non-contiguous arrays.  GOMP_target_ext has flags
argument, but GOACC_parallel_keyed doesn't.  It has ... and you could perhaps
encode some flag in there.  Or, could these array descriptors be passed
first in the list of vars, so instead of a loop to check for these you could
just check the first one?
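
A minimal sketch of that last idea, assuming the compiler emits all such entries at the front of the list. SKETCH_KIND_NONCONTIG and the sketch_* functions are hypothetical stand-ins, not libgomp's actual map kind encoding or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker bit, for illustration only.  */
#define SKETCH_KIND_NONCONTIG 0x80u

static bool
sketch_is_noncontig (unsigned char kind)
{
  return (kind & SKETCH_KIND_NONCONTIG) != 0;
}

/* Count the non-contiguous array entries.  Because they are assumed to
   be grouped at the start, a single test of kinds[0] is enough to skip
   all work in the common case where there are none.  */
static size_t
sketch_count_noncontig (const unsigned char *kinds, size_t mapnum)
{
  assert (kinds != NULL || mapnum == 0);
  if (mapnum == 0 || !sketch_is_noncontig (kinds[0]))
    return 0;

  size_t n = 0;
  while (n < mapnum && sketch_is_noncontig (kinds[n]))
    n++;
  return n;
}
```

The cost for constructs without such arrays is then one comparison rather than a pass over every map entry.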

> > > +  tgt = gomp_malloc (sizeof (*tgt)
> > > +		     + sizeof (tgt->list[0]) * (mapnum + da_data_row_num));
> > > +  tgt->list_count = mapnum + da_data_row_num;
> > >     tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
> > >     tgt->device_descr = devicep;
> > >     struct gomp_coalesce_buf cbuf, *cbufp = NULL;
> > 
> > > @@ -687,6 +863,55 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
> > >   	}
> > >       }
> > > +  /* For dynamic arrays. Each data row is one target item, separated from
> > > +     the normal map clause items, hence we order them after mapnum.  */
> > > +  for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
> > 
> > Even if nothing is in flags, you could just avoid this loop if the previous
> > loop(s) haven't found any noncontiguous arrays.
> 
> I'll add a bit more checking to avoid these cases.

	Jakub

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH, OpenACC, 4/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: dynamic array descriptor creation
  2018-10-16 13:13         ` [PATCH, OpenACC, 4/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: dynamic array descriptor creation Chung-Lin Tang
  2018-10-16 13:54           ` [PATCH, OpenACC, 5/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: bias scanning/adjustment during omp-lowering Chung-Lin Tang
@ 2018-12-13 14:52           ` Chung-Lin Tang
  2018-12-18 12:51             ` Jakub Jelinek
  1 sibling, 1 reply; 24+ messages in thread
From: Chung-Lin Tang @ 2018-12-13 14:52 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 1375 bytes --]

On 2018/10/16 8:56 PM, Chung-Lin Tang wrote:
> The next two patches are the bulk of the compiler patch in the middle-ends.
> 
> The first patch here, implements the creation of dynamic array descriptors to
> pass to the runtime, a different way than completely using map-clauses.
> 
> Because we support arbitrary number of dimensions, adding more map kind cases
> may convolute a lot of the compiler/runtime logic handling the long map sequences.
> 
> This implementation uses a descriptor struct created on stack, and passes the
> pointer to descriptor through to the libgomp runtime, using the exact same receiver field
> for the dynamic array.
> 
> The libgomp runtime then does its stuff to set things up, and properly adjusts the device-side
> receiver field pointer to the on-device created dynamic array structures. I.e. the same receiver
> field serves as descriptor address field on the compiler side, and as the actual data address
> once we get to device code (a pretty important point needed to clarify).

Now that libgomp/target.c:gomp_map_vars() has been revised to test only the first map to decide
whether dynamic array processing is needed, this patch correspondingly updates scan_omp_target
to reorder the related map clauses to the start of the clause chain for OpenACC constructs.

Again, besides the whole revised patch, a v1-v2 diff is also included.

Thanks,
Chung-Lin

[-- Attachment #2: openacc-da-04.omp-low.descr_create.v2.patch --]
[-- Type: text/plain, Size: 12142 bytes --]

Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c	(revision 267050)
+++ gcc/omp-low.c	(working copy)
@@ -60,6 +60,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "hsa-common.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "tree-hash-traits.h"
 
 /* Lowering of OMP parallel and workshare constructs proceeds in two
    phases.  The first phase scans the function looking for OMP statements
@@ -133,6 +134,9 @@ struct omp_context
 
   /* True if this construct can be cancelled.  */
   bool cancellable;
+
+  /* Hash map of dynamic arrays in this context.  */
+  hash_map<tree_operand_hash, tree> *dynamic_arrays;
 };
 
 static splay_tree all_contexts;
@@ -839,6 +843,136 @@ omp_copy_decl (tree var, copy_body_data *cb)
   return error_mark_node;
 }
 
+/* Helper function for create_dynamic_array_descr_type(), to append a new field
+   to a record type.  */
+
+static void
+append_field_to_record_type (tree record_type, tree fld_ident, tree fld_type)
+{
+  tree *p, fld = build_decl (UNKNOWN_LOCATION, FIELD_DECL, fld_ident, fld_type);
+  DECL_CONTEXT (fld) = record_type;
+
+  for (p = &TYPE_FIELDS (record_type); *p; p = &DECL_CHAIN (*p))
+    ;
+  *p = fld;
+}
+
+/* Create type for dynamic array descriptor. Returns created type, and
+   returns the number of dimensions in *DIM_NUM.  */
+
+static tree
+create_dynamic_array_descr_type (tree decl, tree dims, int *dim_num)
+{
+  int n = 0;
+  tree da_descr_type, name, x;
+  gcc_assert (TREE_CODE (dims) == TREE_LIST);
+
+  da_descr_type = lang_hooks.types.make_type (RECORD_TYPE);
+  name = create_tmp_var_name (".omp_dynamic_array_descr_type");
+  name = build_decl (UNKNOWN_LOCATION, TYPE_DECL, name, da_descr_type);
+  DECL_ARTIFICIAL (name) = 1;
+  DECL_NAMELESS (name) = 1;
+  TYPE_NAME (da_descr_type) = name;
+  TYPE_ARTIFICIAL (da_descr_type) = 1;
+
+  /* Main starting pointer/array.  */
+  tree main_var_type = TREE_TYPE (decl);
+  if (TREE_CODE (main_var_type) == REFERENCE_TYPE)
+    main_var_type = TREE_TYPE (main_var_type);
+  append_field_to_record_type (da_descr_type, DECL_NAME (decl),
+			       (TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE
+				? main_var_type
+				: build_pointer_type (main_var_type)));
+  /* Number of dimensions.  */
+  append_field_to_record_type (da_descr_type, get_identifier ("$dim_num"),
+			       sizetype);
+
+  for (x = dims; x; x = TREE_CHAIN (x), n++)
+    {
+      char *fldname;
+      /* One for the start index.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_base", n);
+      append_field_to_record_type (da_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for the length.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_length", n);
+      append_field_to_record_type (da_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for the element size.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_elem_size", n);
+      append_field_to_record_type (da_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for is_array flag.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "$dim_is_array", n);
+      append_field_to_record_type (da_descr_type, get_identifier (fldname),
+				   sizetype);
+    }
+
+  layout_type (da_descr_type);
+  *dim_num = n;
+  return da_descr_type;
+}
+
+/* Generate code sequence for initializing dynamic array descriptor.  */
+
+static void
+create_dynamic_array_descr_init_code (tree da_descr, tree da_var,
+				      tree dimensions, int da_dim_num,
+				      gimple_seq *ilist)
+{
+  tree fld, fldref;
+  tree da_descr_type = TREE_TYPE (da_descr);
+  tree dim_type = TREE_TYPE (da_var);
+
+  fld = TYPE_FIELDS (da_descr_type);
+  fldref = omp_build_component_ref (da_descr, fld);
+  gimplify_assign (fldref, (TREE_CODE (dim_type) == ARRAY_TYPE
+			    ? build_fold_addr_expr (da_var) : da_var), ilist);
+
+  if (TREE_CODE (dim_type) == REFERENCE_TYPE)
+    dim_type = TREE_TYPE (dim_type);
+
+  fld = TREE_CHAIN (fld);
+  fldref = omp_build_component_ref (da_descr, fld);
+  gimplify_assign (fldref, build_int_cst (sizetype, da_dim_num), ilist);
+
+  while (dimensions)
+    {
+      tree dim_base = fold_convert (sizetype, TREE_PURPOSE (dimensions));
+      tree dim_length = fold_convert (sizetype, TREE_VALUE (dimensions));
+      tree dim_elem_size = TYPE_SIZE_UNIT (TREE_TYPE (dim_type));
+      tree dim_is_array = (TREE_CODE (dim_type) == ARRAY_TYPE
+			   ? integer_one_node : integer_zero_node);
+      /* Set base.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (da_descr, fld);
+      dim_base = fold_build2 (MULT_EXPR, sizetype, dim_base, dim_elem_size);
+      gimplify_assign (fldref, dim_base, ilist);
+
+      /* Set length.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (da_descr, fld);
+      dim_length = fold_build2 (MULT_EXPR, sizetype, dim_length, dim_elem_size);
+      gimplify_assign (fldref, dim_length, ilist);
+
+      /* Set elem_size.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (da_descr, fld);
+      dim_elem_size = fold_convert (sizetype, dim_elem_size);
+      gimplify_assign (fldref, dim_elem_size, ilist);
+
+      /* Set is_array flag.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (da_descr, fld);
+      dim_is_array = fold_convert (sizetype, dim_is_array);
+      gimplify_assign (fldref, dim_is_array, ilist);
+
+      dimensions = TREE_CHAIN (dimensions);
+      dim_type = TREE_TYPE (dim_type);
+    }
+  gcc_assert (TREE_CHAIN (fld) == NULL_TREE);
+}
+
 /* Create a new context, with OUTER_CTX being the surrounding context.  */
 
 static omp_context *
@@ -873,6 +1007,8 @@ new_omp_context (gimple *stmt, omp_context *outer_
 
   ctx->cb.decl_map = new hash_map<tree, tree>;
 
+  ctx->dynamic_arrays = new hash_map<tree_operand_hash, tree>;
+
   return ctx;
 }
 
@@ -953,6 +1089,8 @@ delete_omp_context (splay_tree_value value)
       delete ctx->task_reduction_map;
     }
 
+  delete ctx->dynamic_arrays;
+
   XDELETE (ctx);
 }
 
@@ -1300,6 +1438,42 @@ scan_sharing_clauses (tree clauses, omp_context *c
 	      install_var_local (decl, ctx);
 	      break;
 	    }
+
+	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	      && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+	    {
+	      tree da_decl = OMP_CLAUSE_DECL (c);
+	      tree da_dimensions = OMP_CLAUSE_SIZE (c);
+	      tree da_type = TREE_TYPE (da_decl);
+	      bool by_ref = (TREE_CODE (da_type) == ARRAY_TYPE
+			     ? true : false);
+
+	      /* Checking code to ensure we only have arrays at top dimension.
+		 This limitation might be lifted in the future.  */
+	      if (TREE_CODE (da_type) == REFERENCE_TYPE)
+		da_type = TREE_TYPE (da_type);
+	      tree t = da_type, prev_t = NULL_TREE;
+	      while (t)
+		{
+		  if (TREE_CODE (t) == ARRAY_TYPE && prev_t)
+		    {
+		      error_at (gimple_location (ctx->stmt), "array types are"
+				" only allowed at outermost dimension of"
+				" dynamic array");
+		      break;
+		    }
+		  prev_t = t;
+		  t = TREE_TYPE (t);
+		}
+
+	      install_var_field (da_decl, by_ref, 3, ctx);
+	      tree new_var = install_var_local (da_decl, ctx);
+
+	      bool existed = ctx->dynamic_arrays->put (new_var, da_dimensions);
+	      gcc_assert (!existed);
+	      break;
+	    }
+
 	  if (DECL_P (decl))
 	    {
 	      if (DECL_SIZE (decl)
@@ -2426,6 +2600,50 @@ scan_omp_single (gomp_single *stmt, omp_context *o
     layout_type (ctx->record_type);
 }
 
+/* Reorder clauses so that dynamic array map clauses are placed at the very
+   front of the chain.  */
+
+static void
+reorder_dynamic_array_clauses (tree *clauses_ptr)
+{
+  tree c, clauses = *clauses_ptr;
+  tree prev_clause = NULL_TREE, next_clause;
+  tree da_clauses = NULL_TREE, da_clauses_tail = NULL_TREE;
+
+  for (c = clauses; c; c = next_clause)
+    {
+      next_clause = OMP_CLAUSE_CHAIN (c);
+
+      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	  && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+	{
+	  /* Unchain c from clauses.  */
+	  if (c == clauses)
+	    clauses = next_clause;
+
+	  /* Link on to da_clauses.  */
+	  if (da_clauses_tail)
+	    OMP_CLAUSE_CHAIN (da_clauses_tail) = c;
+	  else
+	    da_clauses = c;
+	  da_clauses_tail = c;
+
+	  if (prev_clause)
+	    OMP_CLAUSE_CHAIN (prev_clause) = next_clause;
+	  continue;
+	}
+
+      prev_clause = c;
+    }  
+
+  /* Place dynamic array clauses at the start of the clause list.  */
+  if (da_clauses)
+    {
+      OMP_CLAUSE_CHAIN (da_clauses_tail) = clauses;
+      *clauses_ptr = da_clauses;
+    }
+}
+
 /* Scan a GIMPLE_OMP_TARGET.  */
 
 static void
@@ -2434,7 +2652,6 @@ scan_omp_target (gomp_target *stmt, omp_context *o
   omp_context *ctx;
   tree name;
   bool offloaded = is_gimple_omp_offloaded (stmt);
-  tree clauses = gimple_omp_target_clauses (stmt);
 
   ctx = new_omp_context (stmt, outer_ctx);
   ctx->field_map = splay_tree_new (splay_tree_compare_pointers, 0, 0);
@@ -2453,6 +2670,14 @@ scan_omp_target (gomp_target *stmt, omp_context *o
       gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
     }
 
+  /* If OpenACC construct, put dynamic array clauses (if any) in front of
+     clause chain. The runtime can then test the first to see if the
+     additional map processing for them is required.  */
+  if (is_gimple_omp_oacc (stmt))
+    reorder_dynamic_array_clauses (gimple_omp_target_clauses_ptr (stmt));
+
+  tree clauses = gimple_omp_target_clauses (stmt);
+  
   scan_sharing_clauses (clauses, ctx);
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
 
@@ -9151,6 +9376,15 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
 	  case GOMP_MAP_FORCE_PRESENT:
 	  case GOMP_MAP_FORCE_DEVICEPTR:
 	  case GOMP_MAP_DEVICE_RESIDENT:
+	  case GOMP_MAP_DYNAMIC_ARRAY_TO:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FROM:
+	  case GOMP_MAP_DYNAMIC_ARRAY_TOFROM:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TO:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_FROM:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_TOFROM:
+	  case GOMP_MAP_DYNAMIC_ARRAY_ALLOC:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_ALLOC:
+	  case GOMP_MAP_DYNAMIC_ARRAY_FORCE_PRESENT:
 	  case GOMP_MAP_LINK:
 	    gcc_assert (is_gimple_omp_oacc (stmt));
 	    break;
@@ -9213,7 +9447,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
 	if (offloaded && !(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 			   && OMP_CLAUSE_MAP_IN_REDUCTION (c)))
 	  {
-	    x = build_receiver_ref (var, true, ctx);
+	    tree var_type = TREE_TYPE (var);
+	    bool rcv_by_ref =
+	      (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	       && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))
+	       && TREE_CODE (var_type) != ARRAY_TYPE
+	       ? false : true);
+
+	    x = build_receiver_ref (var, rcv_by_ref, ctx);
 	    tree new_var = lookup_decl (var, ctx);
 
 	    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
@@ -9457,6 +9698,25 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
 		    avar = build_fold_addr_expr (avar);
 		    gimplify_assign (x, avar, &ilist);
 		  }
+		else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+			 && (OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_DYNAMIC_ARRAY))
+		  {
+		    int da_dim_num;
+		    tree dimensions = OMP_CLAUSE_SIZE (c);
+
+		    tree da_descr_type =
+		      create_dynamic_array_descr_type (OMP_CLAUSE_DECL (c),
+						       dimensions, &da_dim_num);
+		    tree da_descr =
+		      create_tmp_var_raw (da_descr_type, ".$omp_da_descr");
+		    gimple_add_tmp_var (da_descr);
+
+		    create_dynamic_array_descr_init_code
+		      (da_descr, ovar, dimensions, da_dim_num, &ilist);
+
+		    gimplify_assign (x, build_fold_addr_expr (da_descr),
+				     &ilist);
+		  }
 		else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE)
 		  {
 		    gcc_assert (is_gimple_omp_oacc (ctx->stmt));
@@ -9517,6 +9777,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
 		  s = TREE_TYPE (s);
 		s = TYPE_SIZE_UNIT (s);
 	      }
+	    else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+		     && (OMP_CLAUSE_MAP_KIND (c) & GOMP_MAP_DYNAMIC_ARRAY))
+	      s = NULL_TREE;
 	    else
 	      s = OMP_CLAUSE_SIZE (c);
 	    if (s == NULL_TREE)

[-- Attachment #3: 04-omp-low.descr_create.v1-v2.diff --]
[-- Type: text/plain, Size: 2174 bytes --]

--- trunk-orig/gcc/omp-low.c	2018-12-12 18:19:40.920852744 +0800
+++ trunk-work/gcc/omp-low.c	2018-12-13 22:28:29.149913219 +0800
@@ -2600,6 +2600,50 @@
     layout_type (ctx->record_type);
 }
 
+/* Reorder clauses so that dynamic array map clauses are placed at the very
+   front of the chain.  */
+
+static void
+reorder_dynamic_array_clauses (tree *clauses_ptr)
+{
+  tree c, clauses = *clauses_ptr;
+  tree prev_clause = NULL_TREE, next_clause;
+  tree da_clauses = NULL_TREE, da_clauses_tail = NULL_TREE;
+
+  for (c = clauses; c; c = next_clause)
+    {
+      next_clause = OMP_CLAUSE_CHAIN (c);
+
+      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	  && GOMP_MAP_DYNAMIC_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+	{
+	  /* Unchain c from clauses.  */
+	  if (c == clauses)
+	    clauses = next_clause;
+
+	  /* Link on to da_clauses.  */
+	  if (da_clauses_tail)
+	    OMP_CLAUSE_CHAIN (da_clauses_tail) = c;
+	  else
+	    da_clauses = c;
+	  da_clauses_tail = c;
+
+	  if (prev_clause)
+	    OMP_CLAUSE_CHAIN (prev_clause) = next_clause;
+	  continue;
+	}
+
+      prev_clause = c;
+    }  
+
+  /* Place dynamic array clauses at the start of the clause list.  */
+  if (da_clauses)
+    {
+      OMP_CLAUSE_CHAIN (da_clauses_tail) = clauses;
+      *clauses_ptr = da_clauses;
+    }
+}
+
 /* Scan a GIMPLE_OMP_TARGET.  */
 
 static void
@@ -2608,7 +2652,6 @@
   omp_context *ctx;
   tree name;
   bool offloaded = is_gimple_omp_offloaded (stmt);
-  tree clauses = gimple_omp_target_clauses (stmt);
 
   ctx = new_omp_context (stmt, outer_ctx);
   ctx->field_map = splay_tree_new (splay_tree_compare_pointers, 0, 0);
@@ -2627,6 +2670,14 @@
       gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
     }
 
+  /* If OpenACC construct, put dynamic array clauses (if any) in front of
+     clause chain. The runtime can then test the first to see if the
+     additional map processing for them is required.  */
+  if (is_gimple_omp_oacc (stmt))
+    reorder_dynamic_array_clauses (gimple_omp_target_clauses_ptr (stmt));
+
+  tree clauses = gimple_omp_target_clauses (stmt);
+  
   scan_sharing_clauses (clauses, ctx);
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH, OpenACC, 7/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp support
  2018-12-06 14:43                     ` Jakub Jelinek
@ 2018-12-13 14:52                       ` Chung-Lin Tang
  0 siblings, 0 replies; 24+ messages in thread
From: Chung-Lin Tang @ 2018-12-13 14:52 UTC (permalink / raw)
  To: Jakub Jelinek, cltang; +Cc: gcc-patches, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 3496 bytes --]

On 2018/12/6 10:43 PM, Jakub Jelinek wrote:
> On Thu, Dec 06, 2018 at 10:19:43PM +0800, Chung-Lin Tang wrote:
>>> Why do you call the non-contiguous arrays dynamic arrays?  Is that some OpenACC term?
>>> I'd also prefix those with gomp_ and it is important to make it clear what
>>> is the ABI type shared with the compiler and what are the internal types.
>>> struct gomp_array_descr would look more natural to me.
>>
>> Well it's not particularly an OpenACC term, just that non-contiguous arrays are
>> often multi-dimensional arrays dynamically allocated and created through (arrays of) pointers.
>> Are you strongly opposed to this naming? If so, I can adjust this part.
> 
> The way those arrays are created (and they don't have to be dynamically
> allocated) doesn't affect their representation.
> There are various terms that describe various data structures, like Iliffe
> vectors, jagged/ragged arrays, dope vectors.
> I guess it depends on what kind of data structures this new framework
> supports: Iliffe vectors (arrays of pointers), or just flat but
> strided arrays, etc.
> 
>>>> +  for (i = 0; i < mapnum; i++)
>>>> +    {
>>>> +      int kind = get_kind (short_mapkind, kinds, i);
>>>> +      if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
>>>> +	{
>>>> +	  da_data_row_num += gomp_dynamic_array_count_rows (hostaddrs[i]);
>>>> +	  da_info_num += 1;
>>>> +	}
>>>> +    }
>>>
>>> I'm not really happy about adding several extra loops which will not do
>>> anything in the case there are no non-contiguous arrays being mapped (for
>>> now always for OpenMP (OpenMP 5 has support for non-contiguous target update
>>> to/from though) and guess rarely for OpenACC).
>>> Can't you use some flag bit in flags passed to GOMP_target* etc. and do the
>>> above loop only if the compiler indicated there are any?
>>
>> I originally strived to not have that loop, but because each row in the last dimension
>> is mapped as its own target_var_desc, we need to count them at this stage to allocate
>> the right number at start. Otherwise a realloc later seems even more ugly...
>>
>> We currently don't have a suitable flag word argument in GOMP_target*, GOACC_parallel*, etc.
>> I am not sure if such a feature warrants changing the interface.
>>
>> If you are wary of OpenMP being affected, I can add a condition to restrict such processing
>> to only (pragma_kind == GOMP_MAP_VARS_OPENACC). Is that okay? (at least before making any
>> larger runtime interface adjustments)
> 
> That will still cost you doing that loop for OpenACC constructs that don't
> have any of these non-contiguous arrays.  GOMP_target_ext has flags
> argument, but GOACC_parallel_keyed doesn't.  It has ... and you could perhaps
> encode some flag in there.  Or, could these array descriptors be passed
> first in the list of vars, so instead of a loop to check for these you could
> just check the first one?

Hi Jakub,
I have revised the patch to rename the main struct da_* types into struct gomp_array_* and
added more detailed descriptions in the comments (though admittedly the "dynamic array" term
is not purged completely).

I have opted for the place-at-start-of-chain route; this should avoid all the tests and
additional iterating when such arrays are not used. There's also a corresponding omp-low.c
update in another patch.
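
For reference, a simplified sketch of why the rows have to be counted up front: each row of the innermost dimension becomes its own target item, so the number of rows is the product of the element counts of all outer dimensions. The struct mirrors the per-dimension fields of the descriptor in the attached patch, with base and length held in bytes; the sketch_* names are hypothetical and this is not the function from the patch.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_dim
{
  size_t base;		/* Start offset of the section, in bytes.  */
  size_t length;	/* Length of the section, in bytes.  */
  size_t elem_size;	/* Size of one element of this dimension.  */
  size_t is_array;	/* Nonzero if this level is an array type.  */
};

/* Number of innermost data rows: the product of the element counts
   of every dimension except the last one.  */
static size_t
sketch_count_rows (const struct sketch_dim *dims, size_t ndims)
{
  assert (dims != NULL && ndims > 0);
  size_t rows = 1;
  for (size_t d = 0; d + 1 < ndims; d++)
    rows *= dims[d].length / dims[d].elem_size;
  return rows;
}
```

For an int **a mapped as a[0:100][0:50] this gives 100 rows, each of which needs its own entry in the target list on top of the mapnum regular entries.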

Besides the revised whole patch, I have also attached a v1-v2 diff showing the changes in between.
Tested with offloading to ensure no regressions.

Thanks,
Chung-Lin


[-- Attachment #2: openacc-da-07.libgomp-target.v2.patch --]
[-- Type: text/plain, Size: 13350 bytes --]

Index: libgomp/target.c
===================================================================
--- libgomp/target.c	(revision 267050)
+++ libgomp/target.c	(working copy)
@@ -477,6 +477,151 @@ gomp_map_val (struct target_mem_desc *tgt, void **
   return tgt->tgt_start + tgt->list[i].offset;
 }
 
+/* Definitions for data structures describing dynamic, non-contiguous arrays
+   (Note: interfaces with compiler)
+
+   The compiler generates a descriptor for each such array, places the
+   descriptor on stack, and passes the address of the descriptor to the libgomp
+   runtime as a normal map argument. The runtime then processes the array
+   data structure setup, and replaces the argument with the new actual
+   array address for the child function.
+
+   Care must be taken such that the struct field and layout assumptions
+   of struct gomp_array_dim, gomp_array_descr_type inside the compiler
+   be consistent with the below declarations.  */
+
+struct gomp_array_dim {
+  size_t base;
+  size_t length;
+  size_t elem_size;
+  size_t is_array;
+};
+
+struct gomp_array_descr_type {
+  void *ptr;
+  size_t ndims;
+  struct gomp_array_dim dims[];
+};
+
+/* Internal dynamic array info struct, used only here inside the runtime. */
+
+struct da_info
+{
+  struct gomp_array_descr_type *descr;
+  size_t map_index;
+  size_t ptrblock_size;
+  size_t data_row_num;
+  size_t data_row_size;
+};
+
+static size_t
+gomp_dynamic_array_count_rows (struct gomp_array_descr_type *descr)
+{
+  size_t nrows = 1;
+  for (size_t d = 0; d < descr->ndims - 1; d++)
+    nrows *= descr->dims[d].length / sizeof (void *);
+  return nrows;
+}
+
+static void
+gomp_dynamic_array_compute_info (struct da_info *da)
+{
+  size_t d, n = 1;
+  struct gomp_array_descr_type *descr = da->descr;
+
+  da->ptrblock_size = 0;
+  for (d = 0; d < descr->ndims - 1; d++)
+    {
+      size_t dim_count = descr->dims[d].length / descr->dims[d].elem_size;
+      size_t dim_ptrblock_size = (descr->dims[d + 1].is_array
+				  ? 0 : descr->dims[d].length * n);
+      da->ptrblock_size += dim_ptrblock_size;
+      n *= dim_count;
+    }
+  da->data_row_num = n;
+  da->data_row_size = descr->dims[d].length;
+}
+
+static void
+gomp_dynamic_array_fill_rows_1 (struct gomp_array_descr_type *descr, void *da,
+				size_t d, void ***row_ptr, size_t *count)
+{
+  if (d < descr->ndims - 1)
+    {
+      size_t elsize = descr->dims[d].elem_size;
+      size_t n = descr->dims[d].length / elsize;
+      void *p = da + descr->dims[d].base;
+      for (size_t i = 0; i < n; i++)
+	{
+	  void *ptr = p + i * elsize;
+	  /* Deref if next dimension is not array.  */
+	  if (!descr->dims[d + 1].is_array)
+	    ptr = *((void **) ptr);
+	  gomp_dynamic_array_fill_rows_1 (descr, ptr, d + 1, row_ptr, count);
+	}
+    }
+  else
+    {
+      **row_ptr = da + descr->dims[d].base;
+      *row_ptr += 1;
+      *count += 1;
+    }
+}
+
+static size_t
+gomp_dynamic_array_fill_rows (struct gomp_array_descr_type *descr, void *rows[])
+{
+  size_t count = 0;
+  void **p = rows;
+  gomp_dynamic_array_fill_rows_1 (descr, descr->ptr, 0, &p, &count);
+  return count;
+}
+
+static void *
+gomp_dynamic_array_create_ptrblock (struct da_info *da,
+				    void *tgt_addr, void *tgt_data_rows[])
+{
+  struct gomp_array_descr_type *descr = da->descr;
+  void *ptrblock = gomp_malloc (da->ptrblock_size);
+  void **curr_dim_ptrblock = (void **) ptrblock;
+  size_t n = 1;
+
+  for (size_t d = 0; d < descr->ndims - 1; d++)
+    {
+      int curr_dim_len = descr->dims[d].length;
+      int next_dim_len = descr->dims[d + 1].length;
+      int curr_dim_num = curr_dim_len / sizeof (void *);
+
+      void *next_dim_ptrblock
+	= (void *)(curr_dim_ptrblock + n * curr_dim_num);
+
+      for (int b = 0; b < n; b++)
+        for (int i = 0; i < curr_dim_num; i++)
+	  {
+	    if (d < descr->ndims - 2)
+	      {
+		void *ptr = (next_dim_ptrblock
+			     + b * curr_dim_num * next_dim_len
+			     + i * next_dim_len);
+		void *tgt_ptr = tgt_addr + (ptr - ptrblock);
+		curr_dim_ptrblock[b * curr_dim_num + i] = tgt_ptr;
+	      }
+	    else
+	      {
+		curr_dim_ptrblock[b * curr_dim_num + i]
+		  = tgt_data_rows[b * curr_dim_num + i];
+	      }
+	    void *addr = &curr_dim_ptrblock[b * curr_dim_num + i];
+	    assert (ptrblock <= addr && addr < ptrblock + da->ptrblock_size);
+	  }
+
+      n *= curr_dim_num;
+      curr_dim_ptrblock = next_dim_ptrblock;
+    }
+  assert (n == da->data_row_num);
+  return ptrblock;
+}
+
 attribute_hidden struct target_mem_desc *
 gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 	       void **hostaddrs, void **devaddrs, size_t *sizes, void *kinds,
@@ -488,9 +633,37 @@ gomp_map_vars (struct gomp_device_descr *devicep,
   const int typemask = short_mapkind ? 0xff : 0x7;
   struct splay_tree_s *mem_map = &devicep->mem_map;
   struct splay_tree_key_s cur_node;
-  struct target_mem_desc *tgt
-    = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
-  tgt->list_count = mapnum;
+  struct target_mem_desc *tgt;
+
+  bool process_dynarrays = false;
+  size_t da_data_row_num = 0, row_start = 0;
+  size_t da_info_num = 0, da_index;
+  struct da_info *da_info = NULL;
+  struct target_var_desc *row_desc;
+  uintptr_t target_row_addr;
+  void **host_data_rows = NULL, **target_data_rows = NULL;
+  void *row;
+
+  if (mapnum > 0)
+    {
+      int kind = get_kind (short_mapkind, kinds, 0);
+      process_dynarrays = GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask);
+    }
+
+  if (process_dynarrays)
+    for (i = 0; i < mapnum; i++)
+      {
+	int kind = get_kind (short_mapkind, kinds, i);
+	if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	  {
+	    da_data_row_num += gomp_dynamic_array_count_rows (hostaddrs[i]);
+	    da_info_num += 1;
+	  }
+      }
+
+  tgt = gomp_malloc (sizeof (*tgt)
+		     + sizeof (tgt->list[0]) * (mapnum + da_data_row_num));
+  tgt->list_count = mapnum + da_data_row_num;
   tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
   tgt->device_descr = devicep;
   struct gomp_coalesce_buf cbuf, *cbufp = NULL;
@@ -502,6 +675,14 @@ gomp_map_vars (struct gomp_device_descr *devicep,
       return tgt;
     }
 
+  if (da_info_num)
+    da_info = gomp_alloca (sizeof (struct da_info) * da_info_num);
+  if (da_data_row_num)
+    {
+      host_data_rows = gomp_malloc (sizeof (void *) * da_data_row_num);
+      target_data_rows = gomp_malloc (sizeof (void *) * da_data_row_num);
+    }
+
   tgt_align = sizeof (void *);
   tgt_size = 0;
   cbuf.chunks = NULL;
@@ -533,7 +714,7 @@ gomp_map_vars (struct gomp_device_descr *devicep,
       return NULL;
     }
 
-  for (i = 0; i < mapnum; i++)
+  for (i = 0, da_index = 0; i < mapnum; i++)
     {
       int kind = get_kind (short_mapkind, kinds, i);
       if (hostaddrs[i] == NULL
@@ -606,6 +787,20 @@ gomp_map_vars (struct gomp_device_descr *devicep,
 	  has_firstprivate = true;
 	  continue;
 	}
+      else if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	{
+	  /* Ignore dynamic arrays for now, we process them together
+	     later.  */
+	  tgt->list[i].key = NULL;
+	  tgt->list[i].offset = 0;
+	  not_found_cnt++;
+
+	  struct da_info *da = &da_info[da_index++];
+	  da->descr = (struct gomp_array_descr_type *) hostaddrs[i];
+	  da->map_index = i;
+	  continue;
+	}
+
       cur_node.host_start = (uintptr_t) hostaddrs[i];
       if (!GOMP_MAP_POINTER_P (kind & typemask))
 	cur_node.host_end = cur_node.host_start + sizes[i];
@@ -674,6 +869,56 @@ gomp_map_vars (struct gomp_device_descr *devicep,
 	}
     }
 
+  /* For dynamic arrays. Each data row is one target item, separated from
+     the normal map clause items, hence we order them after mapnum.  */
+  if (process_dynarrays)
+    for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
+      {
+	int kind = get_kind (short_mapkind, kinds, i);
+	if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	  continue;
+
+	struct da_info *da = &da_info[da_index++];
+	struct gomp_array_descr_type *descr = da->descr;
+	size_t nr;
+
+	gomp_dynamic_array_compute_info (da);
+
+	/* We have allocated space in host/target_data_rows to place all the
+	   row data block pointers, now we can start filling them in.  */
+	nr = gomp_dynamic_array_fill_rows (descr, &host_data_rows[row_start]);
+	assert (nr == da->data_row_num);
+
+	size_t align = (size_t) 1 << (kind >> rshift);
+	if (tgt_align < align)
+	  tgt_align = align;
+	tgt_size = (tgt_size + align - 1) & ~(align - 1);
+	tgt_size += da->ptrblock_size;
+
+	for (size_t j = 0; j < da->data_row_num; j++)
+	  {
+	    row = host_data_rows[row_start + j];
+	    row_desc = &tgt->list[mapnum + row_start + j];
+
+	    cur_node.host_start = (uintptr_t) row;
+	    cur_node.host_end = cur_node.host_start + da->data_row_size;
+	    splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+	    if (n)
+	      {
+		assert (n->refcount != REFCOUNT_LINK);
+		gomp_map_vars_existing (devicep, n, &cur_node, row_desc,
+					kind & typemask, /* TODO: cbuf? */ NULL);
+	      }
+	    else
+	      {
+		tgt_size = (tgt_size + align - 1) & ~(align - 1);
+		tgt_size += da->data_row_size;
+		not_found_cnt++;
+	      }
+	  }
+	row_start += da->data_row_num;
+      }
+
   if (devaddrs)
     {
       if (mapnum != 1)
@@ -817,6 +1062,15 @@ gomp_map_vars (struct gomp_device_descr *devicep,
 	      default:
 		break;
 	      }
+
+	    if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	      {
+		tgt->list[i].key = &array->key;
+		tgt->list[i].key->tgt = tgt;
+		array++;
+		continue;
+	      }
+
 	    splay_tree_key k = &array->key;
 	    k->host_start = (uintptr_t) hostaddrs[i];
 	    if (!GOMP_MAP_POINTER_P (kind & typemask))
@@ -965,8 +1219,113 @@ gomp_map_vars (struct gomp_device_descr *devicep,
 		array++;
 	      }
 	  }
+
+      /* Processing of dynamic array rows.  */
+      if (process_dynarrays)
+	{
+	  for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
+	    {
+	      int kind = get_kind (short_mapkind, kinds, i);
+	      if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+		continue;
+
+	      struct da_info *da = &da_info[da_index++];
+	      assert (da->descr == hostaddrs[i]);
+
+	      /* The map for the dynamic array itself is never copied from during
+		 unmapping, it's the data rows that count.  The copy-from flags
+		 are set to false here.  */
+	      tgt->list[i].copy_from = false;
+	      tgt->list[i].always_copy_from = false;
+
+	      size_t align = (size_t) 1 << (kind >> rshift);
+	      tgt_size = (tgt_size + align - 1) & ~(align - 1);
+
+	      /* For the map of the dynamic array itself, adjust so that the passed
+		 device address points to the beginning of the ptrblock.  */
+	      tgt->list[i].key->tgt_offset = tgt_size;
+
+	      void *target_ptrblock = (void*) tgt->tgt_start + tgt_size;
+	      tgt_size += da->ptrblock_size;
+
+	      /* Add splay key for each data row in current DA.  */
+	      for (size_t j = 0; j < da->data_row_num; j++)
+		{
+		  row = host_data_rows[row_start + j];
+		  row_desc = &tgt->list[mapnum + row_start + j];
+
+		  cur_node.host_start = (uintptr_t) row;
+		  cur_node.host_end = cur_node.host_start + da->data_row_size;
+		  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+		  if (n)
+		    {
+		      assert (n->refcount != REFCOUNT_LINK);
+		      gomp_map_vars_existing (devicep, n, &cur_node, row_desc,
+					      kind & typemask, cbufp);
+		      target_row_addr = n->tgt->tgt_start + n->tgt_offset;
+		    }
+		  else
+		    {
+		      tgt->refcount++;
+
+		      splay_tree_key k = &array->key;
+		      k->host_start = (uintptr_t) row;
+		      k->host_end = k->host_start + da->data_row_size;
+
+		      k->tgt = tgt;
+		      k->refcount = 1;
+		      k->link_key = NULL;
+		      tgt_size = (tgt_size + align - 1) & ~(align - 1);
+		      target_row_addr = tgt->tgt_start + tgt_size;
+		      k->tgt_offset = tgt_size;
+		      tgt_size += da->data_row_size;
+
+		      row_desc->key = k;
+		      row_desc->copy_from
+			= GOMP_MAP_COPY_FROM_P (kind & typemask);
+		      row_desc->always_copy_from
+			= GOMP_MAP_COPY_FROM_P (kind & typemask);
+		      row_desc->offset = 0;
+		      row_desc->length = da->data_row_size;
+
+		      array->left = NULL;
+		      array->right = NULL;
+		      splay_tree_insert (mem_map, array);
+
+		      if (GOMP_MAP_COPY_TO_P (kind & typemask))
+			gomp_copy_host2dev (devicep,
+					    (void *) tgt->tgt_start + k->tgt_offset,
+					    (void *) k->host_start,
+					    da->data_row_size, cbufp);
+		      array++;
+		    }
+		  target_data_rows[row_start + j] = (void *) target_row_addr;
+		}
+
+	      /* Now we have the target memory allocated, and target offsets of all
+		 row blocks assigned and calculated, we can construct the
+		 accelerator side ptrblock and copy it in.  */
+	      if (da->ptrblock_size)
+		{
+		  void *ptrblock = gomp_dynamic_array_create_ptrblock
+		    (da, target_ptrblock, target_data_rows + row_start);
+		  gomp_copy_host2dev (devicep, target_ptrblock, ptrblock,
+				      da->ptrblock_size, cbufp);
+		  free (ptrblock);
+		}
+
+	      row_start += da->data_row_num;
+	    }
+	  assert (row_start == da_data_row_num && da_index == da_info_num);
+	}
     }
 
+  if (da_data_row_num)
+    {
+      free (host_data_rows);
+      free (target_data_rows);
+    }
+
   if (pragma_kind == GOMP_MAP_VARS_TARGET)
     {
       for (i = 0; i < mapnum; i++)

[-- Attachment #3: 07-target.c.v1-v2.diff --]
[-- Type: text/plain, Size: 14095 bytes --]

--- trunk-orig/libgomp/target.c	2018-12-12 18:19:51.020618265 +0800
+++ trunk-work/libgomp/target.c	2018-12-12 22:05:49.197617036 +0800
@@ -477,26 +477,37 @@
   return tgt->tgt_start + tgt->list[i].offset;
 }
 
-/* Dynamic array related data structures, interfaces with the compiler.  */
+/* Definitions for data structures describing dynamic, non-contiguous arrays
+   (Note: interfaces with compiler)
 
-struct da_dim {
+   The compiler generates a descriptor for each such array, places the
+   descriptor on stack, and passes the address of the descriptor to the libgomp
+   runtime as a normal map argument. The runtime then processes the array
+   data structure setup, and replaces the argument with the new actual
+   array address for the child function.
+
+   Care must be taken such that the struct field and layout assumptions
+   of struct gomp_array_dim, gomp_array_descr_type inside the compiler
+   be consistent with the declarations below.  */
+
+struct gomp_array_dim {
   size_t base;
   size_t length;
   size_t elem_size;
   size_t is_array;
 };
 
-struct da_descr_type {
+struct gomp_array_descr_type {
   void *ptr;
   size_t ndims;
-  struct da_dim dims[];
+  struct gomp_array_dim dims[];
 };
 
 /* Internal dynamic array info struct, used only here inside the runtime. */
 
 struct da_info
 {
-  struct da_descr_type *descr;
+  struct gomp_array_descr_type *descr;
   size_t map_index;
   size_t ptrblock_size;
   size_t data_row_num;
@@ -504,7 +515,7 @@
 };
 
 static size_t
-gomp_dynamic_array_count_rows (struct da_descr_type *descr)
+gomp_dynamic_array_count_rows (struct gomp_array_descr_type *descr)
 {
   size_t nrows = 1;
   for (size_t d = 0; d < descr->ndims - 1; d++)
@@ -516,7 +527,7 @@
 gomp_dynamic_array_compute_info (struct da_info *da)
 {
   size_t d, n = 1;
-  struct da_descr_type *descr = da->descr;
+  struct gomp_array_descr_type *descr = da->descr;
 
   da->ptrblock_size = 0;
   for (d = 0; d < descr->ndims - 1; d++)
@@ -532,7 +543,7 @@
 }
 
 static void
-gomp_dynamic_array_fill_rows_1 (struct da_descr_type *descr, void *da,
+gomp_dynamic_array_fill_rows_1 (struct gomp_array_descr_type *descr, void *da,
 				size_t d, void ***row_ptr, size_t *count)
 {
   if (d < descr->ndims - 1)
@@ -558,7 +569,7 @@
 }
 
 static size_t
-gomp_dynamic_array_fill_rows (struct da_descr_type *descr, void *rows[])
+gomp_dynamic_array_fill_rows (struct gomp_array_descr_type *descr, void *rows[])
 {
   size_t count = 0;
   void **p = rows;
@@ -570,7 +581,7 @@
 gomp_dynamic_array_create_ptrblock (struct da_info *da,
 				    void *tgt_addr, void *tgt_data_rows[])
 {
-  struct da_descr_type *descr = da->descr;
+  struct gomp_array_descr_type *descr = da->descr;
   void *ptrblock = gomp_malloc (da->ptrblock_size);
   void **curr_dim_ptrblock = (void **) ptrblock;
   size_t n = 1;
@@ -624,6 +635,7 @@
   struct splay_tree_key_s cur_node;
   struct target_mem_desc *tgt;
 
+  bool process_dynarrays = false;
   size_t da_data_row_num = 0, row_start = 0;
   size_t da_info_num = 0, da_index;
   struct da_info *da_info = NULL;
@@ -632,16 +644,23 @@
   void **host_data_rows = NULL, **target_data_rows = NULL;
   void *row;
 
-  for (i = 0; i < mapnum; i++)
+  if (mapnum > 0)
     {
-      int kind = get_kind (short_mapkind, kinds, i);
-      if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
-	{
-	  da_data_row_num += gomp_dynamic_array_count_rows (hostaddrs[i]);
-	  da_info_num += 1;
-	}
+      int kind = get_kind (short_mapkind, kinds, 0);
+      process_dynarrays = GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask);
     }
 
+  if (process_dynarrays)
+    for (i = 0; i < mapnum; i++)
+      {
+	int kind = get_kind (short_mapkind, kinds, i);
+	if (GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	  {
+	    da_data_row_num += gomp_dynamic_array_count_rows (hostaddrs[i]);
+	    da_info_num += 1;
+	  }
+      }
+
   tgt = gomp_malloc (sizeof (*tgt)
 		     + sizeof (tgt->list[0]) * (mapnum + da_data_row_num));
   tgt->list_count = mapnum + da_data_row_num;
@@ -777,7 +796,7 @@
 	  not_found_cnt++;
 
 	  struct da_info *da = &da_info[da_index++];
-	  da->descr = (struct da_descr_type *) hostaddrs[i];
+	  da->descr = (struct gomp_array_descr_type *) hostaddrs[i];
 	  da->map_index = i;
 	  continue;
 	}
@@ -852,52 +871,53 @@
 
   /* For dynamic arrays. Each data row is one target item, separated from
      the normal map clause items, hence we order them after mapnum.  */
-  for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
-    {
-      int kind = get_kind (short_mapkind, kinds, i);
-      if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
-	continue;
-
-      struct da_info *da = &da_info[da_index++];
-      struct da_descr_type *descr = da->descr;
-      size_t nr;
-
-      gomp_dynamic_array_compute_info (da);
-
-      /* We have allocated space in host/target_data_rows to place all the
-	 row data block pointers, now we can start filling them in.  */
-      nr = gomp_dynamic_array_fill_rows (descr, &host_data_rows[row_start]);
-      assert (nr == da->data_row_num);
+  if (process_dynarrays)
+    for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
+      {
+	int kind = get_kind (short_mapkind, kinds, i);
+	if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+	  continue;
 
-      size_t align = (size_t) 1 << (kind >> rshift);
-      if (tgt_align < align)
-	tgt_align = align;
-      tgt_size = (tgt_size + align - 1) & ~(align - 1);
-      tgt_size += da->ptrblock_size;
+	struct da_info *da = &da_info[da_index++];
+	struct gomp_array_descr_type *descr = da->descr;
+	size_t nr;
+
+	gomp_dynamic_array_compute_info (da);
+
+	/* We have allocated space in host/target_data_rows to place all the
+	   row data block pointers, now we can start filling them in.  */
+	nr = gomp_dynamic_array_fill_rows (descr, &host_data_rows[row_start]);
+	assert (nr == da->data_row_num);
+
+	size_t align = (size_t) 1 << (kind >> rshift);
+	if (tgt_align < align)
+	  tgt_align = align;
+	tgt_size = (tgt_size + align - 1) & ~(align - 1);
+	tgt_size += da->ptrblock_size;
 
-      for (size_t j = 0; j < da->data_row_num; j++)
-	{
-	  row = host_data_rows[row_start + j];
-	  row_desc = &tgt->list[mapnum + row_start + j];
+	for (size_t j = 0; j < da->data_row_num; j++)
+	  {
+	    row = host_data_rows[row_start + j];
+	    row_desc = &tgt->list[mapnum + row_start + j];
 
-	  cur_node.host_start = (uintptr_t) row;
-	  cur_node.host_end = cur_node.host_start + da->data_row_size;
-	  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
-	  if (n)
-	    {
-	      assert (n->refcount != REFCOUNT_LINK);
-	      gomp_map_vars_existing (devicep, n, &cur_node, row_desc,
-				      kind & typemask, /* TODO: cbuf? */ NULL);
-	    }
-	  else
-	    {
-	      tgt_size = (tgt_size + align - 1) & ~(align - 1);
-	      tgt_size += da->data_row_size;
-	      not_found_cnt++;
-	    }
-	}
-      row_start += da->data_row_num;
-    }
+	    cur_node.host_start = (uintptr_t) row;
+	    cur_node.host_end = cur_node.host_start + da->data_row_size;
+	    splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+	    if (n)
+	      {
+		assert (n->refcount != REFCOUNT_LINK);
+		gomp_map_vars_existing (devicep, n, &cur_node, row_desc,
+					kind & typemask, /* TODO: cbuf? */ NULL);
+	      }
+	    else
+	      {
+		tgt_size = (tgt_size + align - 1) & ~(align - 1);
+		tgt_size += da->data_row_size;
+		not_found_cnt++;
+	      }
+	  }
+	row_start += da->data_row_num;
+      }
 
   if (devaddrs)
     {
@@ -1201,100 +1221,103 @@
 	  }
 
       /* Processing of dynamic array rows.  */
-      for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
+      if (process_dynarrays)
 	{
-	  int kind = get_kind (short_mapkind, kinds, i);
-	  if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
-	    continue;
+	  for (i = 0, da_index = 0, row_start = 0; i < mapnum; i++)
+	    {
+	      int kind = get_kind (short_mapkind, kinds, i);
+	      if (!GOMP_MAP_DYNAMIC_ARRAY_P (kind & typemask))
+		continue;
 
-	  struct da_info *da = &da_info[da_index++];
-	  assert (da->descr == hostaddrs[i]);
+	      struct da_info *da = &da_info[da_index++];
+	      assert (da->descr == hostaddrs[i]);
 
-	  /* The map for the dynamic array itself is never copied from during
-	     unmapping, its the data rows that count. Set copy from flags are
-	     set to false here.  */
-	  tgt->list[i].copy_from = false;
-	  tgt->list[i].always_copy_from = false;
+	      /* The map for the dynamic array itself is never copied from during
+		 unmapping, it's the data rows that count.  The copy-from flags
+		 are set to false here.  */
+	      tgt->list[i].copy_from = false;
+	      tgt->list[i].always_copy_from = false;
 
-	  size_t align = (size_t) 1 << (kind >> rshift);
-	  tgt_size = (tgt_size + align - 1) & ~(align - 1);
-
-	  /* For the map of the dynamic array itself, adjust so that the passed
-	     device address points to the beginning of the ptrblock.  */
-	  tgt->list[i].key->tgt_offset = tgt_size;
+	      size_t align = (size_t) 1 << (kind >> rshift);
+	      tgt_size = (tgt_size + align - 1) & ~(align - 1);
 
-	  void *target_ptrblock = (void*) tgt->tgt_start + tgt_size;
-	  tgt_size += da->ptrblock_size;
+	      /* For the map of the dynamic array itself, adjust so that the passed
+		 device address points to the beginning of the ptrblock.  */
+	      tgt->list[i].key->tgt_offset = tgt_size;
 
-	  /* Add splay key for each data row in current DA.  */
-	  for (size_t j = 0; j < da->data_row_num; j++)
-	    {
-	      row = host_data_rows[row_start + j];
-	      row_desc = &tgt->list[mapnum + row_start + j];
+	      void *target_ptrblock = (void*) tgt->tgt_start + tgt_size;
+	      tgt_size += da->ptrblock_size;
 
-	      cur_node.host_start = (uintptr_t) row;
-	      cur_node.host_end = cur_node.host_start + da->data_row_size;
-	      splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
-	      if (n)
-		{
-		  assert (n->refcount != REFCOUNT_LINK);
-		  gomp_map_vars_existing (devicep, n, &cur_node, row_desc,
-					  kind & typemask, cbufp);
-		  target_row_addr = n->tgt->tgt_start + n->tgt_offset;
-		}
-	      else
+	      /* Add splay key for each data row in current DA.  */
+	      for (size_t j = 0; j < da->data_row_num; j++)
 		{
-		  tgt->refcount++;
+		  row = host_data_rows[row_start + j];
+		  row_desc = &tgt->list[mapnum + row_start + j];
 
-		  splay_tree_key k = &array->key;
-		  k->host_start = (uintptr_t) row;
-		  k->host_end = k->host_start + da->data_row_size;
-
-		  k->tgt = tgt;
-		  k->refcount = 1;
-		  k->link_key = NULL;
-		  tgt_size = (tgt_size + align - 1) & ~(align - 1);
-		  target_row_addr = tgt->tgt_start + tgt_size;
-		  k->tgt_offset = tgt_size;
-		  tgt_size += da->data_row_size;
-
-		  row_desc->key = k;
-		  row_desc->copy_from
-		    = GOMP_MAP_COPY_FROM_P (kind & typemask);
-		  row_desc->always_copy_from
-		    = GOMP_MAP_COPY_FROM_P (kind & typemask);
-		  row_desc->offset = 0;
-		  row_desc->length = da->data_row_size;
-
-		  array->left = NULL;
-		  array->right = NULL;
-		  splay_tree_insert (mem_map, array);
+		  cur_node.host_start = (uintptr_t) row;
+		  cur_node.host_end = cur_node.host_start + da->data_row_size;
+		  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+		  if (n)
+		    {
+		      assert (n->refcount != REFCOUNT_LINK);
+		      gomp_map_vars_existing (devicep, n, &cur_node, row_desc,
+					      kind & typemask, cbufp);
+		      target_row_addr = n->tgt->tgt_start + n->tgt_offset;
+		    }
+		  else
+		    {
+		      tgt->refcount++;
 
-		  if (GOMP_MAP_COPY_TO_P (kind & typemask))
-		    gomp_copy_host2dev (devicep,
-					(void *) tgt->tgt_start + k->tgt_offset,
-					(void *) k->host_start,
-					da->data_row_size, cbufp);
-		  array++;
+		      splay_tree_key k = &array->key;
+		      k->host_start = (uintptr_t) row;
+		      k->host_end = k->host_start + da->data_row_size;
+
+		      k->tgt = tgt;
+		      k->refcount = 1;
+		      k->link_key = NULL;
+		      tgt_size = (tgt_size + align - 1) & ~(align - 1);
+		      target_row_addr = tgt->tgt_start + tgt_size;
+		      k->tgt_offset = tgt_size;
+		      tgt_size += da->data_row_size;
+
+		      row_desc->key = k;
+		      row_desc->copy_from
+			= GOMP_MAP_COPY_FROM_P (kind & typemask);
+		      row_desc->always_copy_from
+			= GOMP_MAP_COPY_FROM_P (kind & typemask);
+		      row_desc->offset = 0;
+		      row_desc->length = da->data_row_size;
+
+		      array->left = NULL;
+		      array->right = NULL;
+		      splay_tree_insert (mem_map, array);
+
+		      if (GOMP_MAP_COPY_TO_P (kind & typemask))
+			gomp_copy_host2dev (devicep,
+					    (void *) tgt->tgt_start + k->tgt_offset,
+					    (void *) k->host_start,
+					    da->data_row_size, cbufp);
+		      array++;
+		    }
+		  target_data_rows[row_start + j] = (void *) target_row_addr;
 		}
-	      target_data_rows[row_start + j] = (void *) target_row_addr;
-	    }
 
-	  /* Now we have the target memory allocated, and target offsets of all
-	     row blocks assigned and calculated, we can construct the
-	     accelerator side ptrblock and copy it in.  */
-	  if (da->ptrblock_size)
-	    {
-	      void *ptrblock = gomp_dynamic_array_create_ptrblock
-		(da, target_ptrblock, target_data_rows + row_start);
-	      gomp_copy_host2dev (devicep, target_ptrblock, ptrblock,
-				  da->ptrblock_size, cbufp);
-	      free (ptrblock);
-	    }
+	      /* Now we have the target memory allocated, and target offsets of all
+		 row blocks assigned and calculated, we can construct the
+		 accelerator side ptrblock and copy it in.  */
+	      if (da->ptrblock_size)
+		{
+		  void *ptrblock = gomp_dynamic_array_create_ptrblock
+		    (da, target_ptrblock, target_data_rows + row_start);
+		  gomp_copy_host2dev (devicep, target_ptrblock, ptrblock,
+				      da->ptrblock_size, cbufp);
+		  free (ptrblock);
+		}
 
-	  row_start += da->data_row_num;
+	      row_start += da->data_row_num;
+	    }
+	  assert (row_start == da_data_row_num && da_index == da_info_num);
 	}
-      assert (row_start == da_data_row_num && da_index == da_info_num);
     }
 
   if (da_data_row_num)


* Re: [PATCH, OpenACC, 4/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: dynamic array descriptor creation
  2018-12-13 14:52           ` [PATCH, OpenACC, 4/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: dynamic array descriptor creation Chung-Lin Tang
@ 2018-12-18 12:51             ` Jakub Jelinek
  0 siblings, 0 replies; 24+ messages in thread
From: Jakub Jelinek @ 2018-12-18 12:51 UTC (permalink / raw)
  To: cltang; +Cc: gcc-patches, Thomas Schwinge

On Thu, Dec 13, 2018 at 10:52:32PM +0800, Chung-Lin Tang wrote:
> --- gcc/omp-low.c	(revision 267050)
> +++ gcc/omp-low.c	(working copy)
> @@ -60,6 +60,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "hsa-common.h"
>  #include "stringpool.h"
>  #include "attribs.h"
> +#include "tree-hash-traits.h"
>  
>  /* Lowering of OMP parallel and workshare constructs proceeds in two
>     phases.  The first phase scans the function looking for OMP statements
> @@ -133,6 +134,9 @@ struct omp_context
>  
>    /* True if this construct can be cancelled.  */
>    bool cancellable;
> +
> +  /* Hash map of dynamic arrays in this context.  */
> +  hash_map<tree_operand_hash, tree> *dynamic_arrays;

You still call it dynamic arrays.  Call it array descriptors or something
similar.  In the comment too.

>  
> +/* Helper function for create_dynamic_array_descr_type(), to append a new field

Here too and many other spots.

> +  tree da_descr_type, name, x;

Even here.

> +  append_field_to_record_type (da_descr_type, get_identifier ("$dim_num"),
> +			       sizetype);

Why the $s in the identifiers?  Use . or __ if it shouldn't be user
accessible.  Think whether you want it to be in debuginfo or not, if not,
it should be DECL_IGNORED_P.

	Jakub


* [PATCH, OpenACC, 1/3] Non-contiguous array support for OpenACC data clauses (re-submission), front-end patches
@ 2019-08-20 11:54                   ` Chung-Lin Tang
  2019-08-20 12:01                     ` [PATCH, OpenACC, 2/3] Non-contiguous array support for OpenACC data clauses (re-submission), compiler patches Chung-Lin Tang
  2019-10-07 13:51                     ` [PATCH, OpenACC, 1/3] Non-contiguous array support for OpenACC data clauses (re-submission), front-end patches Thomas Schwinge
  0 siblings, 2 replies; 24+ messages in thread
From: Chung-Lin Tang @ 2019-08-20 11:54 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 1280 bytes --]

Hi Jakub, Thomas,
this is a re-submission of the patch-set from [1].

The term "dynamic arrays" didn't go over well with Jakub last time, so this
time I'm referring to this functionality as "non-contiguous arrays".

int *a[100], **b;

// re-constructs array slices on GPU and copies data in
#pragma acc parallel copyin (a[0:n][0:m], b[1:x][5:y])

The overall implementation has not changed much since the last submission;
the changes are mainly the renaming and a rebase to current trunk.

This first patch contains the C/C++ front-end changes.

Thanks,
Chung-Lin

[1] https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00937.html

	gcc/c/
	* c-typeck.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
	parameter, adjust recursive call site, add cases for allowing
	pointer based multi-dimensional arrays for OpenACC.
	(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
	handle non-contiguous case to create non-contiguous array map.

	gcc/cp/
	* semantics.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
	parameter, adjust recursive call site, add cases for allowing
	pointer based multi-dimensional arrays for OpenACC.
	(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
	handle non-contiguous case to create non-contiguous array map.

[-- Attachment #2: 01.openacc-noncontig_arrays.frontends.patch --]
[-- Type: text/plain, Size: 8467 bytes --]

Index: gcc/c/c-typeck.c
===================================================================
--- gcc/c/c-typeck.c	(revision 274618)
+++ gcc/c/c-typeck.c	(working copy)
@@ -12848,7 +12848,7 @@ c_finish_omp_cancellation_point (location_t loc, t
 static tree
 handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 			     bool &maybe_zero_len, unsigned int &first_non_one,
-			     enum c_omp_region_type ort)
+			     bool &non_contiguous, enum c_omp_region_type ort)
 {
   tree ret, low_bound, length, type;
   if (TREE_CODE (t) != TREE_LIST)
@@ -12933,7 +12933,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
     }
 
   ret = handle_omp_array_sections_1 (c, TREE_CHAIN (t), types,
-				     maybe_zero_len, first_non_one, ort);
+				     maybe_zero_len, first_non_one,
+				     non_contiguous, ort);
   if (ret == error_mark_node || ret == NULL_TREE)
     return ret;
 
@@ -13099,6 +13100,21 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
 		    }
 		}
 	    }
+
+	  /* For OpenACC, if the low_bound/length suggest this is a subarray,
+	     and is referenced through a pointer, then mark this as
+	     non-contiguous.  */
+	  if (ort == C_ORT_ACC
+	      && types.length () > 0
+	      && (TREE_CODE (low_bound) != INTEGER_CST
+		  || integer_nonzerop (low_bound)
+		  || (length && (TREE_CODE (length) != INTEGER_CST
+				 || !tree_int_cst_equal (size, length)))))
+	    {
+	      tree x = types.last ();
+	      if (TREE_CODE (x) == POINTER_TYPE)
+		non_contiguous = true;
+	    }
 	}
       else if (length == NULL_TREE)
 	{
@@ -13142,7 +13158,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
       /* If there is a pointer type anywhere but in the very first
 	 array-section-subscript, the array section can't be contiguous.  */
       if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
-	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
+	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST
+	  && ort != C_ORT_ACC)
 	{
 	  error_at (OMP_CLAUSE_LOCATION (c),
 		    "array section is not contiguous in %qs clause",
@@ -13149,6 +13166,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
 		    omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
 	  return error_mark_node;
 	}
+      else if (TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
+	non_contiguous = true;
     }
   else
     {
@@ -13176,6 +13195,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
 {
   bool maybe_zero_len = false;
   unsigned int first_non_one = 0;
+  bool non_contiguous = false;
   auto_vec<tree, 10> types;
   tree *tp = &OMP_CLAUSE_DECL (c);
   if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND
@@ -13185,7 +13205,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
     tp = &TREE_VALUE (*tp);
   tree first = handle_omp_array_sections_1 (c, *tp, types,
 					    maybe_zero_len, first_non_one,
-					    ort);
+					    non_contiguous, ort);
   if (first == error_mark_node)
     return true;
   if (first == NULL_TREE)
@@ -13218,6 +13238,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
       unsigned int num = types.length (), i;
       tree t, side_effects = NULL_TREE, size = NULL_TREE;
       tree condition = NULL_TREE;
+      tree ncarray_dims = NULL_TREE;
 
       if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
 	maybe_zero_len = true;
@@ -13241,6 +13262,13 @@ handle_omp_array_sections (tree c, enum c_omp_regi
 	    length = fold_convert (sizetype, length);
 	  if (low_bound == NULL_TREE)
 	    low_bound = integer_zero_node;
+
+	  if (non_contiguous)
+	    {
+	      ncarray_dims = tree_cons (low_bound, length, ncarray_dims);
+	      continue;
+	    }
+
 	  if (!maybe_zero_len && i > first_non_one)
 	    {
 	      if (integer_nonzerop (low_bound))
@@ -13337,6 +13365,14 @@ handle_omp_array_sections (tree c, enum c_omp_regi
 		size = size_binop (MULT_EXPR, size, l);
 	    }
 	}
+      if (non_contiguous)
+	{
+	  int kind = OMP_CLAUSE_MAP_KIND (c);
+	  OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_NONCONTIG_ARRAY);
+	  OMP_CLAUSE_DECL (c) = t;
+	  OMP_CLAUSE_SIZE (c) = ncarray_dims;
+	  return false;
+	}
       if (side_effects)
 	size = build2 (COMPOUND_EXPR, sizetype, side_effects, size);
       if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
Index: gcc/cp/semantics.c
===================================================================
--- gcc/cp/semantics.c	(revision 274618)
+++ gcc/cp/semantics.c	(working copy)
@@ -4626,7 +4626,7 @@ omp_privatize_field (tree t, bool shared)
 static tree
 handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 			     bool &maybe_zero_len, unsigned int &first_non_one,
-			     enum c_omp_region_type ort)
+			     bool &non_contiguous, enum c_omp_region_type ort)
 {
   tree ret, low_bound, length, type;
   if (TREE_CODE (t) != TREE_LIST)
@@ -4711,7 +4711,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
       && TREE_CODE (TREE_CHAIN (t)) == FIELD_DECL)
     TREE_CHAIN (t) = omp_privatize_field (TREE_CHAIN (t), false);
   ret = handle_omp_array_sections_1 (c, TREE_CHAIN (t), types,
-				     maybe_zero_len, first_non_one, ort);
+				     maybe_zero_len, first_non_one,
+				     non_contiguous, ort);
   if (ret == error_mark_node || ret == NULL_TREE)
     return ret;
 
@@ -4889,6 +4890,21 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
 		    }
 		}
 	    }
+
+	  /* For OpenACC, if the low_bound/length suggest this is a subarray,
+	     and is referenced through a pointer, then mark this as
+	     non-contiguous.  */
+	  if (ort == C_ORT_ACC
+	      && types.length () > 0
+	      && (TREE_CODE (low_bound) != INTEGER_CST
+		  || integer_nonzerop (low_bound)
+		  || (length && (TREE_CODE (length) != INTEGER_CST
+				 || !tree_int_cst_equal (size, length)))))
+	    {
+	      tree x = types.last ();
+	      if (TREE_CODE (x) == POINTER_TYPE)
+		non_contiguous = true;
+	    }
 	}
       else if (length == NULL_TREE)
 	{
@@ -4932,7 +4948,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
       /* If there is a pointer type anywhere but in the very first
 	 array-section-subscript, the array section can't be contiguous.  */
       if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
-	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
+	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST
+	  && ort != C_ORT_ACC)
 	{
 	  error_at (OMP_CLAUSE_LOCATION (c),
 		    "array section is not contiguous in %qs clause",
@@ -4939,6 +4956,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
 		    omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
 	  return error_mark_node;
 	}
+      else if (TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
+	non_contiguous = true;
     }
   else
     {
@@ -4966,6 +4985,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
 {
   bool maybe_zero_len = false;
   unsigned int first_non_one = 0;
+  bool non_contiguous = false;
   auto_vec<tree, 10> types;
   tree *tp = &OMP_CLAUSE_DECL (c);
   if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND
@@ -4975,7 +4995,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
     tp = &TREE_VALUE (*tp);
   tree first = handle_omp_array_sections_1 (c, *tp, types,
 					    maybe_zero_len, first_non_one,
-					    ort);
+					    non_contiguous, ort);
   if (first == error_mark_node)
     return true;
   if (first == NULL_TREE)
@@ -5009,6 +5029,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
       unsigned int num = types.length (), i;
       tree t, side_effects = NULL_TREE, size = NULL_TREE;
       tree condition = NULL_TREE;
+      tree ncarray_dims = NULL_TREE;
 
       if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
 	maybe_zero_len = true;
@@ -5034,6 +5055,13 @@ handle_omp_array_sections (tree c, enum c_omp_regi
 	    length = fold_convert (sizetype, length);
 	  if (low_bound == NULL_TREE)
 	    low_bound = integer_zero_node;
+
+	  if (non_contiguous)
+	    {
+	      ncarray_dims = tree_cons (low_bound, length, ncarray_dims);
+	      continue;
+	    }
+
 	  if (!maybe_zero_len && i > first_non_one)
 	    {
 	      if (integer_nonzerop (low_bound))
@@ -5125,6 +5153,14 @@ handle_omp_array_sections (tree c, enum c_omp_regi
 	}
       if (!processing_template_decl)
 	{
+	  if (non_contiguous)
+	    {
+	      int kind = OMP_CLAUSE_MAP_KIND (c);
+	      OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_NONCONTIG_ARRAY);
+	      OMP_CLAUSE_DECL (c) = t;
+	      OMP_CLAUSE_SIZE (c) = ncarray_dims;
+	      return false;
+	    }
 	  if (side_effects)
 	    size = build2 (COMPOUND_EXPR, sizetype, side_effects, size);
 	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION

* [PATCH, OpenACC, 2/3] Non-contiguous array support for OpenACC data clauses (re-submission), compiler patches
@ 2019-08-20 12:01                     ` Chung-Lin Tang
  2019-08-20 12:16                       ` [PATCH, OpenACC, 3/3] Non-contiguous array support for OpenACC data clauses (re-submission), libgomp patches Chung-Lin Tang
  0 siblings, 1 reply; 24+ messages in thread
From: Chung-Lin Tang @ 2019-08-20 12:01 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 2227 bytes --]

These are the patches for gimplify, omp-low, and include/gomp-constants.h

One issue that Jakub raised in the last review email on the omp-low changes [1]
was the use of DECL_IGNORED_P. Because the descriptor variables are created with
create_tmp_var_raw(), they already have DECL_IGNORED_P set, so this shouldn't
be an issue here. The use of '$' in identifier names has also been removed.

[1] https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01297.html

Thanks,
Chung-Lin

	gcc/
	* gimplify.c (gimplify_scan_omp_clauses): For non-contiguous array map kinds,
	make sure the biases of all dimensions are put into firstprivate variables.

	* omp-low.c (struct omp_context):
	Add 'hash_map<tree_operand_hash, tree> *non_contiguous_arrays' field, also
	added include of "tree-hash-traits.h".
	(append_field_to_record_type): New function.
	(create_noncontig_array_descr_type): Likewise.
	(create_noncontig_array_descr_init_code): Likewise.
	(new_omp_context): Add initialization of non_contiguous_arrays field.
	(delete_omp_context): Add deletion of non_contiguous_arrays field.
	(scan_sharing_clauses): For non-contiguous array map kinds, check for
	supported dimension structure, and install non-contiguous array variable into
	current omp_context.
	(lower_omp_target): Add handling for non-contiguous array map kinds.
	(noncontig_array_lookup): New function.
	(noncontig_array_reference_start): Likewise.
	(scan_for_op): Likewise.
	(scan_for_reference): Likewise.
	(ncarray_create_bias): Likewise.
	(ncarray_dimension_peel): Likewise.
	(lower_omp_1): Add case to look for start of non-contiguous array reference,
	and handle bias adjustments for the code sequence.

	* tree-pretty-print.c (dump_omp_clause): Add cases for printing
	GOMP_MAP_NONCONTIG_ARRAY map kinds.
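
As a rough illustration of the "bias adjustments" mentioned in the lower_omp_1
entry: the device copy of a row holds only the elements named in the array
section, so a host-style index has to be shifted down by the section's low
bound. The sketch below is only a reading of that idea, not code from the
patch. All sk_* names are hypothetical. The patch folds a negated byte bias
into the pointer; the sketch shifts the index instead, which gives the same
element for in-range accesses.

```c
#include <assert.h>
#include <stddef.h>

/* Byte bias for one dimension: low bound times element size.  This mirrors
   the MULT_EXPR that ncarray_create_bias builds; the name is hypothetical.  */
static size_t
sk_bias_bytes (size_t low_bound, size_t elem_size)
{
  return low_bound * elem_size;
}

/* Hypothetical accessor: DEV_ROW holds only elements
   [LOW_BOUND, LOW_BOUND + LEN) of a host row.  Translate the host-style
   index J into an index into the device copy and load the element.  */
static int
sk_load (const int *dev_row, size_t low_bound, size_t len, size_t j)
{
  assert (j >= low_bound && j - low_bound < len);
  return dev_row[j - low_bound];
}
```

For a section [5:3] of a row, host index 6 refers to the second element of
the device copy, and the byte bias for that dimension is 5 * sizeof (int).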

	include/
	* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define.
	(enum gomp_map_kind): Add GOMP_MAP_NONCONTIG_ARRAY,
	GOMP_MAP_NONCONTIG_ARRAY_TO, GOMP_MAP_NONCONTIG_ARRAY_FROM,
	GOMP_MAP_NONCONTIG_ARRAY_TOFROM, GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO,
	GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM, GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM,
	GOMP_MAP_NONCONTIG_ARRAY_ALLOC, GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC,
	GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT.
	(GOMP_MAP_NONCONTIG_ARRAY_P): Define.
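
To show how the new map kinds compose, here is a minimal sketch of the flag
arithmetic. The SK_* and sk_* names are hypothetical stand-ins, not the real
gomp-constants.h identifiers. The value 1 << 5 for the new flag is taken from
the patch; the values for TO, FROM and FORCE are assumptions about the
existing header and should be checked against it.

```c
#include <assert.h>
#include <stdbool.h>

enum
{
  SK_MAP_ALLOC = 0,
  SK_MAP_TO = 1 << 0,			/* assumed value */
  SK_MAP_FROM = 1 << 1,			/* assumed value */
  SK_MAP_TOFROM = SK_MAP_TO | SK_MAP_FROM,
  SK_MAP_FLAG_SPECIAL_3 = 1 << 5,	/* value from the patch */
  SK_MAP_FLAG_FORCE = 1 << 7,		/* assumed value */
  SK_MAP_NONCONTIG_ARRAY = SK_MAP_FLAG_SPECIAL_3
};

/* Same test as GOMP_MAP_NONCONTIG_ARRAY_P: is the flag bit set?  */
static bool
sk_noncontig_array_p (unsigned kind)
{
  return (kind & SK_MAP_NONCONTIG_ARRAY) != 0;
}

/* What the front end does with 'kind | GOMP_MAP_NONCONTIG_ARRAY'.  */
static unsigned
sk_make_noncontig (unsigned kind)
{
  assert (!sk_noncontig_array_p (kind));
  return kind | SK_MAP_NONCONTIG_ARRAY;
}

/* Recover the underlying transfer kind by clearing the flag bit.  */
static unsigned
sk_base_kind (unsigned kind)
{
  return kind & ~(unsigned) SK_MAP_NONCONTIG_ARRAY;
}
```

Because the non-contiguous kinds are formed by OR-ing one bit onto an
existing kind, the predicate is a single mask test and the original transfer
direction stays recoverable.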


[-- Attachment #2: 02.openacc-noncontig_arrays.gcc.patch --]
[-- Type: text/plain, Size: 25422 bytes --]

Index: gcc/gimplify.c
===================================================================
--- gcc/gimplify.c	(revision 274618)
+++ gcc/gimplify.c	(working copy)
@@ -8563,9 +8563,29 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_se
 	  if (OMP_CLAUSE_SIZE (c) == NULL_TREE)
 	    OMP_CLAUSE_SIZE (c) = DECL_P (decl) ? DECL_SIZE_UNIT (decl)
 				  : TYPE_SIZE_UNIT (TREE_TYPE (decl));
-	  if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p,
-			     NULL, is_gimple_val, fb_rvalue) == GS_ERROR)
+	  if (OMP_CLAUSE_SIZE (c)
+	      && TREE_CODE (OMP_CLAUSE_SIZE (c)) == TREE_LIST
+	      && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
 	    {
+	      tree dims = OMP_CLAUSE_SIZE (c);
+	      for (tree t = dims; t; t = TREE_CHAIN (t))
+		{
+		  /* If a dimension bias isn't a constant, we have to ensure
+		     that the value gets transferred to the offload target.  */
+		  tree low_bound = TREE_PURPOSE (t);
+		  if (TREE_CODE (low_bound) != INTEGER_CST)
+		    {
+		      low_bound = get_initialized_tmp_var (low_bound, pre_p,
+							   NULL, false);
+		      omp_add_variable (ctx, low_bound,
+					GOVD_FIRSTPRIVATE | GOVD_SEEN);
+		      TREE_PURPOSE (t) = low_bound;
+		    }
+		}
+	    }
+	  else if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p,
+				  NULL, is_gimple_val, fb_rvalue) == GS_ERROR)
+	    {
 	      remove = true;
 	      break;
 	    }
Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c	(revision 274618)
+++ gcc/omp-low.c	(working copy)
@@ -60,6 +60,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "hsa-common.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "tree-hash-traits.h"
 
 /* Lowering of OMP parallel and workshare constructs proceeds in two
    phases.  The first phase scans the function looking for OMP statements
@@ -127,6 +128,9 @@ struct omp_context
      corresponding tracking loop iteration variables.  */
   hash_map<tree, tree> *lastprivate_conditional_map;
 
+  /* Hash map of non-contiguous arrays in this context.  */
+  hash_map<tree_operand_hash, tree> *non_contiguous_arrays;
+
   /* Nesting depth of this context.  Used to beautify error messages re
      invalid gotos.  The outermost ctx is depth 1, with depth 0 being
      reserved for the main body of the function.  */
@@ -885,6 +889,137 @@ omp_copy_decl (tree var, copy_body_data *cb)
   return error_mark_node;
 }
 
+/* Helper function for create_noncontig_array_descr_type(), to append a new field
+   to a record type.  */
+
+static void
+append_field_to_record_type (tree record_type, tree fld_ident, tree fld_type)
+{
+  tree *p, fld = build_decl (UNKNOWN_LOCATION, FIELD_DECL, fld_ident, fld_type);
+  DECL_CONTEXT (fld) = record_type;
+
+  for (p = &TYPE_FIELDS (record_type); *p; p = &DECL_CHAIN (*p))
+    ;
+  *p = fld;
+}
+
+/* Create type for non-contiguous array descriptor. Returns created type, and
+   returns the number of dimensions in *DIM_NUM.  */
+
+static tree
+create_noncontig_array_descr_type (tree decl, tree dims, int *dim_num)
+{
+  int n = 0;
+  tree array_descr_type, name, x;
+  gcc_assert (TREE_CODE (dims) == TREE_LIST);
+
+  array_descr_type = lang_hooks.types.make_type (RECORD_TYPE);
+  name = create_tmp_var_name (".omp_noncontig_array_descr_type");
+  name = build_decl (UNKNOWN_LOCATION, TYPE_DECL, name, array_descr_type);
+  DECL_ARTIFICIAL (name) = 1;
+  DECL_NAMELESS (name) = 1;
+  TYPE_NAME (array_descr_type) = name;
+  TYPE_ARTIFICIAL (array_descr_type) = 1;
+
+  /* Main starting pointer/array.  */
+  tree main_var_type = TREE_TYPE (decl);
+  if (TREE_CODE (main_var_type) == REFERENCE_TYPE)
+    main_var_type = TREE_TYPE (main_var_type);
+  append_field_to_record_type (array_descr_type, DECL_NAME (decl),
+			       (TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE
+				? main_var_type
+				: build_pointer_type (main_var_type)));
+  /* Number of dimensions.  */
+  append_field_to_record_type (array_descr_type, get_identifier ("__dim_num"),
+			       sizetype);
+
+  for (x = dims; x; x = TREE_CHAIN (x), n++)
+    {
+      char *fldname;
+      /* One for the start index.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_base", n);
+      append_field_to_record_type (array_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for the length.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_length", n);
+      append_field_to_record_type (array_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for the element size.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_elem_size", n);
+      append_field_to_record_type (array_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for is_array flag.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_is_array", n);
+      append_field_to_record_type (array_descr_type, get_identifier (fldname),
+				   sizetype);
+    }
+
+  layout_type (array_descr_type);
+  *dim_num = n;
+  return array_descr_type;
+}
+
+/* Generate code sequence for initializing non-contiguous array descriptor.  */
+
+static void
+create_noncontig_array_descr_init_code (tree array_descr, tree array_var,
+					tree dimensions, int dim_num,
+					gimple_seq *ilist)
+{
+  tree fld, fldref;
+  tree array_descr_type = TREE_TYPE (array_descr);
+  tree dim_type = TREE_TYPE (array_var);
+
+  fld = TYPE_FIELDS (array_descr_type);
+  fldref = omp_build_component_ref (array_descr, fld);
+  gimplify_assign (fldref, (TREE_CODE (dim_type) == ARRAY_TYPE
+			    ? build_fold_addr_expr (array_var) : array_var),
+		   ilist);
+
+  if (TREE_CODE (dim_type) == REFERENCE_TYPE)
+    dim_type = TREE_TYPE (dim_type);
+
+  fld = TREE_CHAIN (fld);
+  fldref = omp_build_component_ref (array_descr, fld);
+  gimplify_assign (fldref, build_int_cst (sizetype, dim_num), ilist);
+
+  while (dimensions)
+    {
+      tree dim_base = fold_convert (sizetype, TREE_PURPOSE (dimensions));
+      tree dim_length = fold_convert (sizetype, TREE_VALUE (dimensions));
+      tree dim_elem_size = TYPE_SIZE_UNIT (TREE_TYPE (dim_type));
+      tree dim_is_array = (TREE_CODE (dim_type) == ARRAY_TYPE
+			   ? integer_one_node : integer_zero_node);
+      /* Set base.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (array_descr, fld);
+      dim_base = fold_build2 (MULT_EXPR, sizetype, dim_base, dim_elem_size);
+      gimplify_assign (fldref, dim_base, ilist);
+
+      /* Set length.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (array_descr, fld);
+      dim_length = fold_build2 (MULT_EXPR, sizetype, dim_length, dim_elem_size);
+      gimplify_assign (fldref, dim_length, ilist);
+
+      /* Set elem_size.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (array_descr, fld);
+      dim_elem_size = fold_convert (sizetype, dim_elem_size);
+      gimplify_assign (fldref, dim_elem_size, ilist);
+
+      /* Set is_array flag.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (array_descr, fld);
+      dim_is_array = fold_convert (sizetype, dim_is_array);
+      gimplify_assign (fldref, dim_is_array, ilist);
+
+      dimensions = TREE_CHAIN (dimensions);
+      dim_type = TREE_TYPE (dim_type);
+    }
+  gcc_assert (TREE_CHAIN (fld) == NULL_TREE);
+}
+
 /* Create a new context, with OUTER_CTX being the surrounding context.  */
 
 static omp_context *
@@ -921,6 +1056,8 @@ new_omp_context (gimple *stmt, omp_context *outer_
 
   ctx->cb.decl_map = new hash_map<tree, tree>;
 
+  ctx->non_contiguous_arrays = new hash_map<tree_operand_hash, tree>;
+
   return ctx;
 }
 
@@ -1003,6 +1140,8 @@ delete_omp_context (splay_tree_value value)
 
   delete ctx->lastprivate_conditional_map;
 
+  delete ctx->non_contiguous_arrays;
+
   XDELETE (ctx);
 }
 
@@ -1353,6 +1492,42 @@ scan_sharing_clauses (tree clauses, omp_context *c
 	      install_var_local (decl, ctx);
 	      break;
 	    }
+
+	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	      && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+	    {
+	      tree array_decl = OMP_CLAUSE_DECL (c);
+	      tree array_dimensions = OMP_CLAUSE_SIZE (c);
+	      tree array_type = TREE_TYPE (array_decl);
+	      bool by_ref = (TREE_CODE (array_type) == ARRAY_TYPE
+			     ? true : false);
+
+	      /* Checking code to ensure we only have arrays at top dimension.
+		 This limitation might be lifted in the future.  */
+	      if (TREE_CODE (array_type) == REFERENCE_TYPE)
+		array_type = TREE_TYPE (array_type);
+	      tree t = array_type, prev_t = NULL_TREE;
+	      while (t)
+		{
+		  if (TREE_CODE (t) == ARRAY_TYPE && prev_t)
+		    {
+		      error_at (gimple_location (ctx->stmt), "array types are"
+				" only allowed at outermost dimension of"
+				" non-contiguous array");
+		      break;
+		    }
+		  prev_t = t;
+		  t = TREE_TYPE (t);
+		}
+
+	      install_var_field (array_decl, by_ref, 3, ctx);
+	      tree new_var = install_var_local (array_decl, ctx);
+
+	      bool existed = ctx->non_contiguous_arrays->put (new_var, array_dimensions);
+	      gcc_assert (!existed);
+	      break;
+	    }
+
 	  if (DECL_P (decl))
 	    {
 	      if (DECL_SIZE (decl)
@@ -2583,6 +2758,50 @@ scan_omp_single (gomp_single *stmt, omp_context *o
     layout_type (ctx->record_type);
 }
 
+/* Reorder clauses so that non-contiguous array map clauses are placed at the very
+   front of the chain.  */
+
+static void
+reorder_noncontig_array_clauses (tree *clauses_ptr)
+{
+  tree c, clauses = *clauses_ptr;
+  tree prev_clause = NULL_TREE, next_clause;
+  tree array_clauses = NULL_TREE, array_clauses_tail = NULL_TREE;
+
+  for (c = clauses; c; c = next_clause)
+    {
+      next_clause = OMP_CLAUSE_CHAIN (c);
+
+      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	  && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+	{
+	  /* Unchain c from clauses.  */
+	  if (c == clauses)
+	    clauses = next_clause;
+
+	  /* Link on to array_clauses.  */
+	  if (array_clauses_tail)
+	    OMP_CLAUSE_CHAIN (array_clauses_tail) = c;
+	  else
+	    array_clauses = c;
+	  array_clauses_tail = c;
+
+	  if (prev_clause)
+	    OMP_CLAUSE_CHAIN (prev_clause) = next_clause;
+	  continue;
+	}
+
+      prev_clause = c;
+    }  
+
+  /* Place non-contiguous array clauses at the start of the clause list.  */
+  if (array_clauses)
+    {
+      OMP_CLAUSE_CHAIN (array_clauses_tail) = clauses;
+      *clauses_ptr = array_clauses;
+    }
+}
+
 /* Scan a GIMPLE_OMP_TARGET.  */
 
 static void
@@ -2591,7 +2810,6 @@ scan_omp_target (gomp_target *stmt, omp_context *o
   omp_context *ctx;
   tree name;
   bool offloaded = is_gimple_omp_offloaded (stmt);
-  tree clauses = gimple_omp_target_clauses (stmt);
 
   ctx = new_omp_context (stmt, outer_ctx);
   ctx->field_map = splay_tree_new (splay_tree_compare_pointers, 0, 0);
@@ -2610,6 +2828,14 @@ scan_omp_target (gomp_target *stmt, omp_context *o
       gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
     }
 
+  /* If this is an OpenACC construct, put non-contiguous array clauses
+     (if any) in front of the clause chain.  The runtime can then test the
+     first to see if the additional map processing for them is required.  */
+  if (is_gimple_omp_oacc (stmt))
+    reorder_noncontig_array_clauses (gimple_omp_target_clauses_ptr (stmt));
+
+  tree clauses = gimple_omp_target_clauses (stmt);
+  
   scan_sharing_clauses (clauses, ctx);
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
 
@@ -11326,6 +11552,15 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
 	  case GOMP_MAP_FORCE_PRESENT:
 	  case GOMP_MAP_FORCE_DEVICEPTR:
 	  case GOMP_MAP_DEVICE_RESIDENT:
+	  case GOMP_MAP_NONCONTIG_ARRAY_TO:
+	  case GOMP_MAP_NONCONTIG_ARRAY_FROM:
+	  case GOMP_MAP_NONCONTIG_ARRAY_TOFROM:
+	  case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO:
+	  case GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM:
+	  case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM:
+	  case GOMP_MAP_NONCONTIG_ARRAY_ALLOC:
+	  case GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC:
+	  case GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT:
 	  case GOMP_MAP_LINK:
 	    gcc_assert (is_gimple_omp_oacc (stmt));
 	    break;
@@ -11388,7 +11623,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
 	if (offloaded && !(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 			   && OMP_CLAUSE_MAP_IN_REDUCTION (c)))
 	  {
-	    x = build_receiver_ref (var, true, ctx);
+	    tree var_type = TREE_TYPE (var);
+	    bool rcv_by_ref =
+	      (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	       && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))
+	       && TREE_CODE (var_type) != ARRAY_TYPE
+	       ? false : true);
+
+	    x = build_receiver_ref (var, rcv_by_ref, ctx);
 	    tree new_var = lookup_decl (var, ctx);
 
 	    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
@@ -11635,6 +11877,24 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
 		    avar = build_fold_addr_expr (avar);
 		    gimplify_assign (x, avar, &ilist);
 		  }
+		else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+			 && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+		  {
+		    int dim_num;
+		    tree dimensions = OMP_CLAUSE_SIZE (c);
+
+		    tree array_descr_type =
+		      create_noncontig_array_descr_type (OMP_CLAUSE_DECL (c),
+							 dimensions, &dim_num);
+		    tree array_descr =
+		      create_tmp_var_raw (array_descr_type, ".omp_noncontig_array_descr");
+		    gimple_add_tmp_var (array_descr);
+
+		    create_noncontig_array_descr_init_code
+		      (array_descr, ovar, dimensions, dim_num, &ilist);
+
+		    gimplify_assign (x, build_fold_addr_expr (array_descr), &ilist);
+		  }
 		else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE)
 		  {
 		    gcc_assert (is_gimple_omp_oacc (ctx->stmt));
@@ -11695,6 +11955,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
 		  s = TREE_TYPE (s);
 		s = TYPE_SIZE_UNIT (s);
 	      }
+	    else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+		     && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+	      s = NULL_TREE;
 	    else
 	      s = OMP_CLAUSE_SIZE (c);
 	    if (s == NULL_TREE)
@@ -12384,7 +12647,202 @@ lower_omp_grid_body (gimple_stmt_iterator *gsi_p,
 		       gimple_build_omp_return (false));
 }
 
+/* Helper to lookup non-contiguous arrays through nested omp contexts. Returns
+   TREE_LIST of dimensions, and the CTX where it was found in *CTX_P.  */
 
+static tree
+noncontig_array_lookup (tree t, omp_context **ctx_p)
+{
+  omp_context *c = *ctx_p;
+  while (c)
+    {
+      tree *dims = c->non_contiguous_arrays->get (t);
+      if (dims)
+	{
+	  *ctx_p = c;
+	  return *dims;
+	}
+      c = c->outer;
+    }
+  return NULL_TREE;
+}
+
+/* Tests if this gimple STMT is the start of a non-contiguous array access
+   sequence. Returns true if found, and also returns the gimple operand ptr
+   and dimensions tree list through *OUT_REF and *OUT_DIMS respectively.  */
+
+static bool
+noncontig_array_reference_start (gimple *stmt, omp_context **ctx_p,
+				 tree **out_ref, tree *out_dims)
+{
+  if (gimple_code (stmt) == GIMPLE_ASSIGN)
+    for (unsigned i = 1; i < gimple_num_ops (stmt); i++)
+      {
+	tree *op = gimple_op_ptr (stmt, i), dims;
+	if (TREE_CODE (*op) == ARRAY_REF)
+	  op = &TREE_OPERAND (*op, 0);
+	if (TREE_CODE (*op) == MEM_REF)
+	  op = &TREE_OPERAND (*op, 0);
+	if ((dims = noncontig_array_lookup (*op, ctx_p)) != NULL_TREE)
+	  {
+	    *out_ref = op;
+	    *out_dims = dims;
+	    return true;
+	  }
+      }
+  return false;
+}
+
+static tree
+scan_for_op (tree *tp, int *walk_subtrees, void *data)
+{
+  struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
+  tree t = *tp;
+  tree op = (tree) wi->info;
+  *walk_subtrees = 1;
+  if (operand_equal_p (t, op, 0))
+    {
+      wi->info = tp;
+      return t;
+    }
+  return NULL_TREE;
+}
+
+static tree *
+scan_for_reference (gimple *stmt, tree op)
+{
+  struct walk_stmt_info wi;
+  memset (&wi, 0, sizeof (wi));
+  wi.info = op;
+  if (walk_gimple_op (stmt, scan_for_op, &wi))
+    return (tree *) wi.info;
+  return NULL;
+}
+
+static tree
+ncarray_create_bias (tree orig_bias, tree unit_type)
+{
+  return build2 (MULT_EXPR, sizetype, fold_convert (sizetype, orig_bias),
+		 TYPE_SIZE_UNIT (unit_type));
+}
+
+/* Main worker for adjusting non-contiguous array accesses, handles the
+   adjustment of many cases of statement forms, and called multiple times
+   to 'peel' away each dimension.  */
+
+static gimple_stmt_iterator
+ncarray_dimension_peel (omp_context *ctx,
+			gimple_stmt_iterator gsi, tree orig_da,
+			tree *op_ptr, tree *type_ptr, tree *dims_ptr)
+{
+  gimple *stmt = gsi_stmt (gsi);
+  tree lhs = gimple_assign_lhs (stmt);
+  tree rhs = gimple_assign_rhs1 (stmt);
+
+  if (gimple_num_ops (stmt) == 2
+      && TREE_CODE (rhs) == MEM_REF
+      && operand_equal_p (*op_ptr, TREE_OPERAND (rhs, 0), 0)
+      && !operand_equal_p (orig_da, TREE_OPERAND (rhs, 0), 0)
+      && (TREE_OPERAND (rhs, 1) == NULL_TREE
+	  || integer_zerop (TREE_OPERAND (rhs, 1))))
+    {
+      gcc_assert (TREE_CODE (TREE_TYPE (*type_ptr)) == POINTER_TYPE);
+      *type_ptr = TREE_TYPE (*type_ptr);
+    }
+  else 
+    {
+      gimple *g;
+      gimple_seq ilist = NULL;
+      tree bias, t;
+      tree op = *op_ptr;
+      tree orig_type = *type_ptr;
+      tree orig_bias = TREE_PURPOSE (*dims_ptr);
+      bool by_ref = false;
+
+      if (TREE_CODE (orig_bias) != INTEGER_CST)
+	orig_bias = lookup_decl (orig_bias, ctx);
+
+      if (gimple_num_ops (stmt) == 2)
+	{
+	  if (TREE_CODE (rhs) == ADDR_EXPR)
+	    {
+	      rhs = TREE_OPERAND (rhs, 0);
+	      *dims_ptr = NULL_TREE;
+	    }
+
+	  if (TREE_CODE (rhs) == ARRAY_REF
+	      && TREE_CODE (TREE_OPERAND (rhs, 0)) == MEM_REF
+	      && operand_equal_p (TREE_OPERAND (TREE_OPERAND (rhs, 0), 0),
+				  *op_ptr, 0))
+	    {
+	      bias = ncarray_create_bias (orig_bias,
+					  TREE_TYPE (TREE_TYPE (orig_type)));
+	      *type_ptr = TREE_TYPE (TREE_TYPE (orig_type));
+	    }
+	  else if (TREE_CODE (rhs) == ARRAY_REF
+		   && TREE_CODE (TREE_OPERAND (rhs, 0)) == VAR_DECL
+		   && operand_equal_p (TREE_OPERAND (rhs, 0), *op_ptr, 0))
+	    {
+	      tree ptr_type = build_pointer_type (orig_type);
+	      op = create_tmp_var (ptr_type);
+	      gimplify_assign (op, build_fold_addr_expr (TREE_OPERAND (rhs, 0)),
+			       &ilist);
+	      bias = ncarray_create_bias (orig_bias, TREE_TYPE (orig_type));
+	      *type_ptr = TREE_TYPE (orig_type);
+	      orig_type = ptr_type;
+	      by_ref = true;
+	    }
+	  else if (TREE_CODE (rhs) == MEM_REF
+		   && operand_equal_p (*op_ptr, TREE_OPERAND (rhs, 0), 0)
+		   && TREE_OPERAND (rhs, 1) != NULL_TREE)
+	    {
+	      bias = ncarray_create_bias (orig_bias, TREE_TYPE (orig_type));
+	      *type_ptr = TREE_TYPE (orig_type);
+	    }
+	  else if (TREE_CODE (lhs) == MEM_REF
+		   && operand_equal_p (*op_ptr, TREE_OPERAND (lhs, 0), 0))
+	    {
+	      if (*dims_ptr != NULL_TREE)
+		{
+		  gcc_assert (TREE_CHAIN (*dims_ptr) == NULL_TREE);
+		  bias = ncarray_create_bias (orig_bias, TREE_TYPE (orig_type));
+		  *type_ptr = TREE_TYPE (orig_type);
+		}
+	      else
+		/* This should be the end of the non-contiguous array access
+		   sequence.  */
+		return gsi;
+	    }
+	  else
+	    gcc_unreachable ();
+	}
+      else if (gimple_num_ops (stmt) == 3
+	       && gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
+	       && operand_equal_p (*op_ptr, rhs, 0))
+	{
+	  bias = ncarray_create_bias (orig_bias, TREE_TYPE (orig_type));
+	}
+      else
+	gcc_unreachable ();
+
+      bias = fold_build1 (NEGATE_EXPR, sizetype, bias);
+      bias = fold_build2 (POINTER_PLUS_EXPR, orig_type, op, bias);
+
+      t = create_tmp_var (by_ref ? build_pointer_type (orig_type) : orig_type);
+
+      g = gimplify_assign (t, bias, &ilist);
+      gsi_insert_seq_before (&gsi, ilist, GSI_NEW_STMT);
+      *op_ptr = gimple_assign_lhs (g);
+
+      if (by_ref)
+	*op_ptr = build2 (MEM_REF, TREE_TYPE (orig_type), *op_ptr,
+			  build_int_cst (orig_type, 0));
+      *dims_ptr = TREE_CHAIN (*dims_ptr);
+    }
+
+  return gsi;
+}
+
 /* Callback for lower_omp_1.  Return non-NULL if *tp needs to be
    regimplified.  If DATA is non-NULL, lower_omp_1 is outside
    of OMP context, but with task_shared_vars set.  */
@@ -12709,6 +13167,48 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_cont
 
     default:
     regimplify:
+      /* If we detect the start of a non-contiguous array reference sequence,
+	 scan and do the needed adjustments.  */
+      tree dims, *op_ptr;
+      omp_context *ncarray_ctx = ctx;
+      if (ncarray_ctx
+	  && noncontig_array_reference_start (stmt, &ncarray_ctx, &op_ptr, &dims))
+	{
+	  bool started = false;
+	  tree orig_array_var = *op_ptr;
+	  tree curr_type = TREE_TYPE (orig_array_var);
+
+	  gimple_stmt_iterator gsi = *gsi_p, new_gsi;
+	  while (op_ptr)
+	    {
+	      if (!is_gimple_assign (gsi_stmt (gsi))
+		  || ((gimple_assign_single_p (gsi_stmt (gsi))
+		       || gimple_assign_cast_p (gsi_stmt (gsi)))
+		      && *op_ptr == gimple_assign_rhs1 (gsi_stmt (gsi))))
+		break;
+
+	      new_gsi = ncarray_dimension_peel (ncarray_ctx, gsi, orig_array_var,
+						op_ptr, &curr_type, &dims);
+	      if (!started)
+		{
+		  /* Point 'stmt' to the start of the newly added
+		     sequence.  */
+		  started = true;
+		  *gsi_p = new_gsi;
+		  stmt = gsi_stmt (*gsi_p);
+		}
+	      if (dims == NULL_TREE)
+		break;
+	      
+	      tree next_op = gimple_assign_lhs (gsi_stmt (gsi));
+	      do {
+		gsi_next (&gsi);
+		op_ptr = scan_for_reference (gsi_stmt (gsi), next_op);
+	      }
+	      while (!op_ptr);
+	    }
+	}
+
       if ((ctx || task_shared_vars)
 	  && walk_gimple_op (stmt, lower_omp_regimplify_p,
 			     ctx ? NULL : &wi))
Index: gcc/tree-pretty-print.c
===================================================================
--- gcc/tree-pretty-print.c	(revision 274618)
+++ gcc/tree-pretty-print.c	(working copy)
@@ -849,6 +849,33 @@ dump_omp_clause (pretty_printer *pp, tree clause,
 	case GOMP_MAP_LINK:
 	  pp_string (pp, "link");
 	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_TO:
+	  pp_string (pp, "to,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_FROM:
+	  pp_string (pp, "from,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_TOFROM:
+	  pp_string (pp, "tofrom,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO:
+	  pp_string (pp, "force_to,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM:
+	  pp_string (pp, "force_from,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM:
+	  pp_string (pp, "force_tofrom,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_ALLOC:
+	  pp_string (pp, "alloc,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC:
+	  pp_string (pp, "force_alloc,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT:
+	  pp_string (pp, "force_present,noncontig_array");
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
@@ -870,6 +897,10 @@ dump_omp_clause (pretty_printer *pp, tree clause,
 	    case GOMP_MAP_TO_PSET:
 	      pp_string (pp, " [pointer set, len: ");
 	      break;
+	    case GOMP_MAP_NONCONTIG_ARRAY:
+	      gcc_assert (TREE_CODE (OMP_CLAUSE_SIZE (clause)) == TREE_LIST);
+	      pp_string (pp, " [dimensions: ");
+	      break;
 	    default:
 	      pp_string (pp, " [len: ");
 	      break;
Index: include/gomp-constants.h
===================================================================
--- include/gomp-constants.h	(revision 274618)
+++ include/gomp-constants.h	(working copy)
@@ -40,6 +40,7 @@
 #define GOMP_MAP_FLAG_SPECIAL_0		(1 << 2)
 #define GOMP_MAP_FLAG_SPECIAL_1		(1 << 3)
 #define GOMP_MAP_FLAG_SPECIAL_2		(1 << 4)
+#define GOMP_MAP_FLAG_SPECIAL_3		(1 << 5)
 #define GOMP_MAP_FLAG_SPECIAL		(GOMP_MAP_FLAG_SPECIAL_1 \
 					 | GOMP_MAP_FLAG_SPECIAL_0)
 /* Flag to force a specific behavior (or else, trigger a run-time error).  */
@@ -127,6 +128,26 @@ enum gomp_map_kind
     /* Decrement usage count and deallocate if zero.  */
     GOMP_MAP_RELEASE =			(GOMP_MAP_FLAG_SPECIAL_2
 					 | GOMP_MAP_DELETE),
+    /* Mapping kinds for non-contiguous arrays.  */
+    GOMP_MAP_NONCONTIG_ARRAY =		(GOMP_MAP_FLAG_SPECIAL_3),
+    GOMP_MAP_NONCONTIG_ARRAY_TO =	(GOMP_MAP_NONCONTIG_ARRAY
+					 | GOMP_MAP_TO),
+    GOMP_MAP_NONCONTIG_ARRAY_FROM =	(GOMP_MAP_NONCONTIG_ARRAY
+					 | GOMP_MAP_FROM),
+    GOMP_MAP_NONCONTIG_ARRAY_TOFROM =	(GOMP_MAP_NONCONTIG_ARRAY
+					 | GOMP_MAP_TOFROM),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO =	(GOMP_MAP_NONCONTIG_ARRAY_TO
+					 | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM =	(GOMP_MAP_NONCONTIG_ARRAY_FROM
+						 | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM =	(GOMP_MAP_NONCONTIG_ARRAY_TOFROM
+						 | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_NONCONTIG_ARRAY_ALLOC =		(GOMP_MAP_NONCONTIG_ARRAY
+						 | GOMP_MAP_ALLOC),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC =	(GOMP_MAP_NONCONTIG_ARRAY
+						 | GOMP_MAP_FORCE_ALLOC),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT =	(GOMP_MAP_NONCONTIG_ARRAY
+						 | GOMP_MAP_FORCE_PRESENT),
 
     /* Internal to GCC, not used in libgomp.  */
     /* Do not map, but pointer assign a pointer instead.  */
@@ -155,6 +176,8 @@ enum gomp_map_kind
 #define GOMP_MAP_ALWAYS_P(X) \
   (GOMP_MAP_ALWAYS_TO_P (X) || ((X) == GOMP_MAP_ALWAYS_FROM))
 
+#define GOMP_MAP_NONCONTIG_ARRAY_P(X) \
+  ((X) & GOMP_MAP_NONCONTIG_ARRAY)
 
 /* Asynchronous behavior.  Keep in sync with
    libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_async_t.  */

* [PATCH, OpenACC, 3/3] Non-contiguous array support for OpenACC data clauses (re-submission), libgomp patches
@ 2019-08-20 12:16                       ` Chung-Lin Tang
  2019-10-07 13:58                         ` Thomas Schwinge
  2019-11-05 14:36                         ` [PATCH, OpenACC, v2] Non-contiguous array support for OpenACC data clauses Chung-Lin Tang
  0 siblings, 2 replies; 24+ messages in thread
From: Chung-Lin Tang @ 2019-08-20 12:16 UTC (permalink / raw)
  To: gcc-patches, Jakub Jelinek, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 990 bytes --]

These are the libgomp patches (including testcases). Not much has
changed from the last submission besides renaming to 'non-contiguous', etc.,
and rebasing.

Thanks,
Chung-Lin


	libgomp/
	* target.c (struct gomp_ncarray_dim): New struct declaration.
	(struct gomp_ncarray_descr_type): Likewise.
	(struct ncarray_info): Likewise.
	(gomp_noncontig_array_count_rows): New function.
	(gomp_noncontig_array_compute_info): Likewise.
	(gomp_noncontig_array_fill_rows_1): Likewise.
	(gomp_noncontig_array_fill_rows): Likewise.
	(gomp_noncontig_array_create_ptrblock): Likewise.
	(gomp_map_vars): Add code to handle non-contiguous array map kinds.

	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h: New test.
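
For orientation, the sketch below walks through the size bookkeeping that
gomp_noncontig_array_compute_info performs, using a two-dimensional 'int **'
section as the example. It is a simplified reading of the patch, not the
libgomp code itself: the sk_* names are hypothetical, and the descriptor is
passed as a plain array of dimensions instead of the flexible-array struct.
As in the compiler patch, base and length are stored in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* One dimension of the descriptor; fields follow struct gomp_ncarray_dim.  */
struct sk_dim { size_t base, length, elem_size, is_array; };

/* Subset of struct ncarray_info that the computation fills in.  */
struct sk_info { size_t ptrblock_size, data_row_num, data_row_size; };

/* Describe 'int **a' mapped as a[lo0:len0][lo1:len1].  The outer dimension
   holds row pointers, the inner one holds the int elements.  */
static void
sk_describe_int2d (struct sk_dim dims[2], size_t lo0, size_t len0,
		   size_t lo1, size_t len1)
{
  dims[0] = (struct sk_dim) { lo0 * sizeof (int *), len0 * sizeof (int *),
			      sizeof (int *), 0 };
  dims[1] = (struct sk_dim) { lo1 * sizeof (int), len1 * sizeof (int),
			      sizeof (int), 0 };
}

/* Every dimension but the last contributes pointer-block bytes (unless the
   next dimension is a true array); the last one gives the data row size.  */
static struct sk_info
sk_compute_info (const struct sk_dim *dims, size_t ndims)
{
  struct sk_info info = { 0, 1, 0 };
  size_t d, n = 1;

  assert (ndims >= 1);
  for (d = 0; d + 1 < ndims; d++)
    {
      size_t dim_count = dims[d].length / dims[d].elem_size;
      if (!dims[d + 1].is_array)
	info.ptrblock_size += dims[d].length * n;
      n *= dim_count;
    }
  info.data_row_num = n;
  info.data_row_size = dims[d].length;
  return info;
}
```

For a[0:100][3:7] this yields 100 data rows of 7 * sizeof (int) bytes each,
plus a pointer block of 100 * sizeof (int *) bytes.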

[-- Attachment #2: 03.openacc-noncontig_arrays.libgomp.patch --]
[-- Type: text/plain, Size: 21398 bytes --]

Index: libgomp/target.c
===================================================================
--- libgomp/target.c	(revision 274618)
+++ libgomp/target.c	(working copy)
@@ -510,6 +510,151 @@ gomp_map_val (struct target_mem_desc *tgt, void **
   return tgt->tgt_start + tgt->list[i].offset;
 }
 
+/* Definitions for data structures describing non-contiguous arrays
+   (Note: interfaces with compiler)
+
+   The compiler generates a descriptor for each such array, places the
+   descriptor on stack, and passes the address of the descriptor to the libgomp
+   runtime as a normal map argument. The runtime then processes the array
+   data structure setup, and replaces the argument with the new actual
+   array address for the child function.
+
+   Care must be taken such that the struct field and layout assumptions
+   of struct gomp_ncarray_dim, gomp_ncarray_descr_type inside the compiler
+   be consistent with the below declarations.  */
+
+struct gomp_ncarray_dim {
+  size_t base;
+  size_t length;
+  size_t elem_size;
+  size_t is_array;
+};
+
+struct gomp_ncarray_descr_type {
+  void *ptr;
+  size_t ndims;
+  struct gomp_ncarray_dim dims[];
+};
+
+/* Internal non-contiguous array info struct, used only here inside the runtime. */
+
+struct ncarray_info
+{
+  struct gomp_ncarray_descr_type *descr;
+  size_t map_index;
+  size_t ptrblock_size;
+  size_t data_row_num;
+  size_t data_row_size;
+};
+
+static size_t
+gomp_noncontig_array_count_rows (struct gomp_ncarray_descr_type *descr)
+{
+  size_t nrows = 1;
+  for (size_t d = 0; d < descr->ndims - 1; d++)
+    nrows *= descr->dims[d].length / sizeof (void *);
+  return nrows;
+}
+
+static void
+gomp_noncontig_array_compute_info (struct ncarray_info *nca)
+{
+  size_t d, n = 1;
+  struct gomp_ncarray_descr_type *descr = nca->descr;
+
+  nca->ptrblock_size = 0;
+  for (d = 0; d < descr->ndims - 1; d++)
+    {
+      size_t dim_count = descr->dims[d].length / descr->dims[d].elem_size;
+      size_t dim_ptrblock_size = (descr->dims[d + 1].is_array
+				  ? 0 : descr->dims[d].length * n);
+      nca->ptrblock_size += dim_ptrblock_size;
+      n *= dim_count;
+    }
+  nca->data_row_num = n;
+  nca->data_row_size = descr->dims[d].length;
+}
+
+static void
+gomp_noncontig_array_fill_rows_1 (struct gomp_ncarray_descr_type *descr, void *nca,
+				  size_t d, void ***row_ptr, size_t *count)
+{
+  if (d < descr->ndims - 1)
+    {
+      size_t elsize = descr->dims[d].elem_size;
+      size_t n = descr->dims[d].length / elsize;
+      void *p = nca + descr->dims[d].base;
+      for (size_t i = 0; i < n; i++)
+	{
+	  void *ptr = p + i * elsize;
+	  /* Deref if next dimension is not array.  */
+	  if (!descr->dims[d + 1].is_array)
+	    ptr = *((void **) ptr);
+	  gomp_noncontig_array_fill_rows_1 (descr, ptr, d + 1, row_ptr, count);
+	}
+    }
+  else
+    {
+      **row_ptr = nca + descr->dims[d].base;
+      *row_ptr += 1;
+      *count += 1;
+    }
+}
+
+static size_t
+gomp_noncontig_array_fill_rows (struct gomp_ncarray_descr_type *descr, void *rows[])
+{
+  size_t count = 0;
+  void **p = rows;
+  gomp_noncontig_array_fill_rows_1 (descr, descr->ptr, 0, &p, &count);
+  return count;
+}
+
+static void *
+gomp_noncontig_array_create_ptrblock (struct ncarray_info *nca,
+				      void *tgt_addr, void *tgt_data_rows[])
+{
+  struct gomp_ncarray_descr_type *descr = nca->descr;
+  void *ptrblock = gomp_malloc (nca->ptrblock_size);
+  void **curr_dim_ptrblock = (void **) ptrblock;
+  size_t n = 1;
+
+  for (size_t d = 0; d < descr->ndims - 1; d++)
+    {
+      int curr_dim_len = descr->dims[d].length;
+      int next_dim_len = descr->dims[d + 1].length;
+      int curr_dim_num = curr_dim_len / sizeof (void *);
+
+      void *next_dim_ptrblock
+	= (void *)(curr_dim_ptrblock + n * curr_dim_num);
+
+      for (int b = 0; b < n; b++)
+        for (int i = 0; i < curr_dim_num; i++)
+	  {
+	    if (d < descr->ndims - 2)
+	      {
+		void *ptr = (next_dim_ptrblock
+			     + b * curr_dim_num * next_dim_len
+			     + i * next_dim_len);
+		void *tgt_ptr = tgt_addr + (ptr - ptrblock);
+		curr_dim_ptrblock[b * curr_dim_num + i] = tgt_ptr;
+	      }
+	    else
+	      {
+		curr_dim_ptrblock[b * curr_dim_num + i]
+		  = tgt_data_rows[b * curr_dim_num + i];
+	      }
+	    void *addr = &curr_dim_ptrblock[b * curr_dim_num + i];
+	    assert (ptrblock <= addr && addr < ptrblock + nca->ptrblock_size);
+	  }
+
+      n *= curr_dim_num;
+      curr_dim_ptrblock = next_dim_ptrblock;
+    }
+  assert (n == nca->data_row_num);
+  return ptrblock;
+}
+
 static inline __attribute__((always_inline)) struct target_mem_desc *
 gomp_map_vars_internal (struct gomp_device_descr *devicep,
 			struct goacc_asyncqueue *aq, size_t mapnum,
@@ -523,9 +668,37 @@ gomp_map_vars_internal (struct gomp_device_descr *
   const int typemask = short_mapkind ? 0xff : 0x7;
   struct splay_tree_s *mem_map = &devicep->mem_map;
   struct splay_tree_key_s cur_node;
-  struct target_mem_desc *tgt
-    = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
-  tgt->list_count = mapnum;
+  struct target_mem_desc *tgt;
+
+  bool process_noncontig_arrays = false;
+  size_t nca_data_row_num = 0, row_start = 0;
+  size_t nca_info_num = 0, nca_index;
+  struct ncarray_info *nca_info = NULL;
+  struct target_var_desc *row_desc;
+  uintptr_t target_row_addr;
+  void **host_data_rows = NULL, **target_data_rows = NULL;
+  void *row;
+
+  if (mapnum > 0)
+    {
+      int kind = get_kind (short_mapkind, kinds, 0);
+      process_noncontig_arrays = GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask);
+    }
+
+  if (process_noncontig_arrays)
+    for (i = 0; i < mapnum; i++)
+      {
+	int kind = get_kind (short_mapkind, kinds, i);
+	if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
+	  {
+	    nca_data_row_num += gomp_noncontig_array_count_rows (hostaddrs[i]);
+	    nca_info_num += 1;
+	  }
+      }
+
+  tgt = gomp_malloc (sizeof (*tgt)
+		     + sizeof (tgt->list[0]) * (mapnum + nca_data_row_num));
+  tgt->list_count = mapnum + nca_data_row_num;
   tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
   tgt->device_descr = devicep;
   struct gomp_coalesce_buf cbuf, *cbufp = NULL;
@@ -537,6 +710,14 @@ gomp_map_vars_internal (struct gomp_device_descr *
       return tgt;
     }
 
+  if (nca_info_num)
+    nca_info = gomp_alloca (sizeof (struct ncarray_info) * nca_info_num);
+  if (nca_data_row_num)
+    {
+      host_data_rows = gomp_malloc (sizeof (void *) * nca_data_row_num);
+      target_data_rows = gomp_malloc (sizeof (void *) * nca_data_row_num);
+    }
+
   tgt_align = sizeof (void *);
   tgt_size = 0;
   cbuf.chunks = NULL;
@@ -568,7 +749,7 @@ gomp_map_vars_internal (struct gomp_device_descr *
       return NULL;
     }
 
-  for (i = 0; i < mapnum; i++)
+  for (i = 0, nca_index = 0; i < mapnum; i++)
     {
       int kind = get_kind (short_mapkind, kinds, i);
       if (hostaddrs[i] == NULL
@@ -633,6 +814,20 @@ gomp_map_vars_internal (struct gomp_device_descr *
 	  has_firstprivate = true;
 	  continue;
 	}
+      else if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
+	{
+	  /* Ignore non-contiguous arrays for now, we process them together
+	     later.  */
+	  tgt->list[i].key = NULL;
+	  tgt->list[i].offset = 0;
+	  not_found_cnt++;
+
+	  struct ncarray_info *nca = &nca_info[nca_index++];
+	  nca->descr = (struct gomp_ncarray_descr_type *) hostaddrs[i];
+	  nca->map_index = i;
+	  continue;
+	}
+
       cur_node.host_start = (uintptr_t) hostaddrs[i];
       if (!GOMP_MAP_POINTER_P (kind & typemask))
 	cur_node.host_end = cur_node.host_start + sizes[i];
@@ -701,6 +896,56 @@ gomp_map_vars_internal (struct gomp_device_descr *
 	}
     }
 
+  /* For non-contiguous arrays. Each data row is one target item, separated
+     from the normal map clause items, hence we order them after mapnum.  */
+  if (process_noncontig_arrays)
+    for (i = 0, nca_index = 0, row_start = 0; i < mapnum; i++)
+      {
+	int kind = get_kind (short_mapkind, kinds, i);
+	if (!GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
+	  continue;
+
+	struct ncarray_info *nca = &nca_info[nca_index++];
+	struct gomp_ncarray_descr_type *descr = nca->descr;
+	size_t nr;
+
+	gomp_noncontig_array_compute_info (nca);
+
+	/* We have allocated space in host/target_data_rows to place all the
+	   row data block pointers, now we can start filling them in.  */
+	nr = gomp_noncontig_array_fill_rows (descr, &host_data_rows[row_start]);
+	assert (nr == nca->data_row_num);
+
+	size_t align = (size_t) 1 << (kind >> rshift);
+	if (tgt_align < align)
+	  tgt_align = align;
+	tgt_size = (tgt_size + align - 1) & ~(align - 1);
+	tgt_size += nca->ptrblock_size;
+
+	for (size_t j = 0; j < nca->data_row_num; j++)
+	  {
+	    row = host_data_rows[row_start + j];
+	    row_desc = &tgt->list[mapnum + row_start + j];
+
+	    cur_node.host_start = (uintptr_t) row;
+	    cur_node.host_end = cur_node.host_start + nca->data_row_size;
+	    splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+	    if (n)
+	      {
+		assert (n->refcount != REFCOUNT_LINK);
+		gomp_map_vars_existing (devicep, aq, n, &cur_node, row_desc,
+					kind & typemask, /* TODO: cbuf? */ NULL);
+	      }
+	    else
+	      {
+		tgt_size = (tgt_size + align - 1) & ~(align - 1);
+		tgt_size += nca->data_row_size;
+		not_found_cnt++;
+	      }
+	  }
+	row_start += nca->data_row_num;
+      }
+
   if (devaddrs)
     {
       if (mapnum != 1)
@@ -861,6 +1106,15 @@ gomp_map_vars_internal (struct gomp_device_descr *
 	      default:
 		break;
 	      }
+
+	    if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
+	      {
+		tgt->list[i].key = &array->key;
+		tgt->list[i].key->tgt = tgt;
+		array++;
+		continue;
+	      }
+
 	    splay_tree_key k = &array->key;
 	    k->host_start = (uintptr_t) hostaddrs[i];
 	    if (!GOMP_MAP_POINTER_P (kind & typemask))
@@ -1010,8 +1264,115 @@ gomp_map_vars_internal (struct gomp_device_descr *
 		array++;
 	      }
 	  }
+
+      /* Processing of non-contiguous array rows.  */
+      if (process_noncontig_arrays)
+	{
+	  for (i = 0, nca_index = 0, row_start = 0; i < mapnum; i++)
+	    {
+	      int kind = get_kind (short_mapkind, kinds, i);
+	      if (!GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
+		continue;
+
+	      struct ncarray_info *nca = &nca_info[nca_index++];
+	      assert (nca->descr == hostaddrs[i]);
+
+	      /* The map for the non-contiguous array itself is never copied from
+		 during unmapping, it's the data rows that count. Set copy-from
+		 flags to false here.  */
+	      tgt->list[i].copy_from = false;
+	      tgt->list[i].always_copy_from = false;
+
+	      size_t align = (size_t) 1 << (kind >> rshift);
+	      tgt_size = (tgt_size + align - 1) & ~(align - 1);
+
+	      /* For the map of the non-contiguous array itself, adjust so that
+		 the passed device address points to the beginning of the
+		 ptrblock.  */
+	      tgt->list[i].key->tgt_offset = tgt_size;
+
+	      void *target_ptrblock = (void*) tgt->tgt_start + tgt_size;
+	      tgt_size += nca->ptrblock_size;
+
+	      /* Add splay key for each data row in current non-contiguous
+		 array.  */
+	      for (size_t j = 0; j < nca->data_row_num; j++)
+		{
+		  row = host_data_rows[row_start + j];
+		  row_desc = &tgt->list[mapnum + row_start + j];
+
+		  cur_node.host_start = (uintptr_t) row;
+		  cur_node.host_end = cur_node.host_start + nca->data_row_size;
+		  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+		  if (n)
+		    {
+		      assert (n->refcount != REFCOUNT_LINK);
+		      gomp_map_vars_existing (devicep, aq, n, &cur_node, row_desc,
+					      kind & typemask, cbufp);
+		      target_row_addr = n->tgt->tgt_start + n->tgt_offset;
+		    }
+		  else
+		    {
+		      tgt->refcount++;
+
+		      splay_tree_key k = &array->key;
+		      k->host_start = (uintptr_t) row;
+		      k->host_end = k->host_start + nca->data_row_size;
+
+		      k->tgt = tgt;
+		      k->refcount = 1;
+		      k->link_key = NULL;
+		      tgt_size = (tgt_size + align - 1) & ~(align - 1);
+		      target_row_addr = tgt->tgt_start + tgt_size;
+		      k->tgt_offset = tgt_size;
+		      tgt_size += nca->data_row_size;
+
+		      row_desc->key = k;
+		      row_desc->copy_from
+			= GOMP_MAP_COPY_FROM_P (kind & typemask);
+		      row_desc->always_copy_from
+			= GOMP_MAP_COPY_FROM_P (kind & typemask);
+		      row_desc->offset = 0;
+		      row_desc->length = nca->data_row_size;
+
+		      array->left = NULL;
+		      array->right = NULL;
+		      splay_tree_insert (mem_map, array);
+
+		      if (GOMP_MAP_COPY_TO_P (kind & typemask))
+			gomp_copy_host2dev (devicep, aq,
+					    (void *) tgt->tgt_start + k->tgt_offset,
+					    (void *) k->host_start,
+					    nca->data_row_size, cbufp);
+		      array++;
+		    }
+		  target_data_rows[row_start + j] = (void *) target_row_addr;
+		}
+
+	      /* Now we have the target memory allocated, and target offsets of all
+		 row blocks assigned and calculated, we can construct the
+		 accelerator side ptrblock and copy it in.  */
+	      if (nca->ptrblock_size)
+		{
+		  void *ptrblock = gomp_noncontig_array_create_ptrblock
+		    (nca, target_ptrblock, target_data_rows + row_start);
+		  gomp_copy_host2dev (devicep, aq, target_ptrblock, ptrblock,
+				      nca->ptrblock_size, cbufp);
+		  free (ptrblock);
+		}
+
+	      row_start += nca->data_row_num;
+	    }
+	  assert (row_start == nca_data_row_num && nca_index == nca_info_num);
+	}
     }
 
+  if (nca_data_row_num)
+    {
+      free (host_data_rows);
+      free (target_data_rows);
+    }
+
   if (pragma_kind == GOMP_MAP_VARS_TARGET)
     {
       for (i = 0; i < mapnum; i++)
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c	(nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c	(working copy)
@@ -0,0 +1,103 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <stdlib.h>
+#include <assert.h>
+
+#define n 100
+#define m 100
+
+int b[n][m];
+
+void
+test1 (void)
+{
+  int i, j, *a[100];
+
+  /* Array of pointers form test.  */
+  for (i = 0; i < n; i++)
+    {
+      a[i] = (int *)malloc (sizeof (int) * m);
+      for (j = 0; j < m; j++)
+	b[i][j] = j - i;
+    }
+
+  #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b)
+  for (i = 0; i < n; i++)
+    #pragma acc loop
+    for (j = 0; j < m; j++)
+      a[i][j] = b[i][j];
+
+  for (i = 0; i < n; i++)
+    {
+      for (j = 0; j < m; j++)
+	assert (a[i][j] == b[i][j]);
+      /* Clean up.  */
+      free (a[i]);
+    }
+}
+
+void
+test2 (void)
+{
+  int i, j, **a = (int **) malloc (sizeof (int *) * n);
+
+  /* Separately allocated blocks.  */
+  for (i = 0; i < n; i++)
+    {
+      a[i] = (int *)malloc (sizeof (int) * m);
+      for (j = 0; j < m; j++)
+	b[i][j] = j - i;
+    }
+
+  #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b)
+  for (i = 0; i < n; i++)
+    #pragma acc loop
+    for (j = 0; j < m; j++)
+      a[i][j] = b[i][j];
+
+  for (i = 0; i < n; i++)
+    {
+      for (j = 0; j < m; j++)
+	assert (a[i][j] == b[i][j]);
+      /* Clean up.  */
+      free (a[i]);
+    }
+  free (a);
+}
+
+void
+test3 (void)
+{
+  int i, j, **a = (int **) malloc (sizeof (int *) * n);
+  a[0] = (int *) malloc (sizeof (int) * n * m);
+
+  /* Rows allocated in one contiguous block.  */
+  for (i = 0; i < n; i++)
+    {
+      a[i] = *a + i * m;
+      for (j = 0; j < m; j++)
+	b[i][j] = j - i;
+    }
+
+  #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b)
+  for (i = 0; i < n; i++)
+    #pragma acc loop
+    for (j = 0; j < m; j++)
+      a[i][j] = b[i][j];
+
+  for (i = 0; i < n; i++)
+    for (j = 0; j < m; j++)
+      assert (a[i][j] == b[i][j]);
+
+  free (a[0]);
+  free (a);
+}
+
+int
+main (void)
+{
+  test1 ();
+  test2 ();
+  test3 ();
+  return 0;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c	(nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c	(working copy)
@@ -0,0 +1,37 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <assert.h>
+#include "noncontig_array-utils.h"
+
+int
+main (void)
+{
+  int n = 10;
+  int ***a = (int ***) create_ncarray (sizeof (int), n, 3);
+  int ***b = (int ***) create_ncarray (sizeof (int), n, 3);
+  int ***c = (int ***) create_ncarray (sizeof (int), n, 3);
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	{
+	  a[i][j][k] = i + j * k + k;
+	  b[i][j][k] = j + k * i + i * j;
+	  c[i][j][k] = a[i][j][k];
+	}
+
+  #pragma acc parallel copy (a[0:n][0:n][0:n]) copyin (b[0:n][0:n][0:n])
+  {
+    for (int i = 0; i < n; i++)
+      for (int j = 0; j < n; j++)
+	for (int k = 0; k < n; k++)
+	  a[i][j][k] += b[k][j][i] + i + j + k;
+  }
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	assert (a[i][j][k] == c[i][j][k] + b[k][j][i] + i + j + k);
+
+  return 0;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c	(nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c	(working copy)
@@ -0,0 +1,45 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <assert.h>
+#include "noncontig_array-utils.h"
+
+int main (void)
+{
+  int n = 20, x = 5, y = 12;
+  int *****a = (int *****) create_ncarray (sizeof (int), n, 5);
+
+  int sum1 = 0, sum2 = 0, sum3 = 0;
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	for (int l = 0; l < n; l++)
+	  for (int m = 0; m < n; m++)
+	    {
+	      a[i][j][k][l][m] = 1;
+	      sum1++;
+	    }
+
+  #pragma acc parallel copy (a[x:y][x:y][x:y][x:y][x:y]) copy(sum2)
+  {
+    for (int i = x; i < x + y; i++)
+      for (int j = x; j < x + y; j++)
+	for (int k = x; k < x + y; k++)
+	  for (int l = x; l < x + y; l++)
+	    for (int m = x; m < x + y; m++)
+	      {
+		a[i][j][k][l][m] = 0;
+		sum2++;
+	      }
+  }
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	for (int l = 0; l < n; l++)
+	  for (int m = 0; m < n; m++)
+	    sum3 += a[i][j][k][l][m];
+
+  assert (sum1 == sum2 + sum3);
+  return 0;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c	(nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c	(working copy)
@@ -0,0 +1,36 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <assert.h>
+#include "noncontig_array-utils.h"
+
+int main (void)
+{
+  int n = 128;
+  double ***a = (double ***) create_ncarray (sizeof (double), n, 3);
+  double ***b = (double ***) create_ncarray (sizeof (double), n, 3);
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	a[i][j][k] = i + j + k + i * j * k;
+
+  /* This test exercises async copyout of non-contiguous array rows.  */
+  #pragma acc parallel copyin(a[0:n][0:n][0:n]) copyout(b[0:n][0:n][0:n]) async(5)
+  {
+    #pragma acc loop gang
+    for (int i = 0; i < n; i++)
+      #pragma acc loop vector
+      for (int j = 0; j < n; j++)
+	for (int k = 0; k < n; k++)
+	  b[i][j][k] = a[i][j][k] * 2.0;
+  }
+
+  #pragma acc wait (5)
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	assert (b[i][j][k] == a[i][j][k] * 2.0);
+
+  return 0;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h	(nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h	(working copy)
@@ -0,0 +1,44 @@
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+#include <stdint.h>
+
+/* Allocate and create a pointer based NDIMS-dimensional array,
+   each dimension DIMLEN long, with ELSIZE sized data elements.  */
+void *
+create_ncarray (size_t elsize, int dimlen, int ndims)
+{
+  size_t blk_size = 0;
+  size_t n = 1;
+
+  for (int i = 0; i < ndims - 1; i++)
+    {
+      n *= dimlen;
+      blk_size += sizeof (void *) * n;
+    }
+  size_t data_rows_num = n;
+  size_t data_rows_offset = blk_size;
+  blk_size += elsize * n * dimlen;
+
+  void *blk = (void *) malloc (blk_size);
+  memset (blk, 0, blk_size);
+  void **curr_dim = (void **) blk;
+  n = 1;
+
+  for (int d = 0; d < ndims - 1; d++)
+    {
+      uintptr_t next_dim = (uintptr_t) (curr_dim + n * dimlen);
+      size_t next_dimlen = dimlen * (d < ndims - 2 ? sizeof (void *) : elsize);
+
+      for (int b = 0; b < n; b++)
+        for (int i = 0; i < dimlen; i++)
+	  if (d < ndims - 1)
+	    curr_dim[b * dimlen + i]
+	      = (void*) (next_dim + b * dimlen * next_dimlen + i * next_dimlen);
+
+      n *= dimlen;
+      curr_dim = (void**) next_dim;
+    }
+  assert (n == data_rows_num);
+  return blk;
+}

* Re: [PATCH, OpenACC, 1/3] Non-contiguous array support for OpenACC data clauses (re-submission), front-end patches
  2019-08-20 11:54                   ` [PATCH, OpenACC, 1/3] Non-contiguous array support for OpenACC data clauses (re-submission), front-end patches Chung-Lin Tang
  2019-08-20 12:01                     ` [PATCH, OpenACC, 2/3] Non-contiguous array support for OpenACC data clauses (re-submission), compiler patches Chung-Lin Tang
@ 2019-10-07 13:51                     ` Thomas Schwinge
  1 sibling, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2019-10-07 13:51 UTC (permalink / raw)
  To: cltang; +Cc: gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 3281 bytes --]

Hi Chung-Lin!

Thanks for your work on this.


Please reference PR76739 in your submission/ChangeLog updates.


We'll need Jakub to review the generic code changes, but let me provide
some first review remarks, too.


On 2019-08-20T19:36:24+0800, Chung-Lin Tang <chunglin_tang@mentor.com> wrote:
> The first patch here are the C/C++ front-end patches.

As far as I'm concerned, it doesn't make sense to artificially split up
patches like that, given that the individual three pieces can only be
considered all together.

And if posting split-up for other reasons, then please make sure that
the individual patch submission emails have a common "cover letter" email
so that they show up as one email thread.


> --- gcc/c/c-typeck.c	(revision 274618)
> +++ gcc/c/c-typeck.c	(working copy)

> @@ -13099,6 +13100,21 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
>  		    }
>  		}
>  	    }
> +
> +	  /* For OpenACC, if the low_bound/length suggest this is a subarray,
> +	     and is referenced through by a pointer, then mark this as
> +	     non-contiguous.  */

I don't directly understand this logic.  I'll have to think about it
more.

> +	  if (ort == C_ORT_ACC
> +	      && types.length () > 0
> +	      && (TREE_CODE (low_bound) != INTEGER_CST
> +		  || integer_nonzerop (low_bound)
> +		  || (length && (TREE_CODE (length) != INTEGER_CST
> +				 || !tree_int_cst_equal (size, length)))))
> +	    {
> +	      tree x = types.last ();
> +	      if (TREE_CODE (x) == POINTER_TYPE)
> +		non_contiguous = true;
> +	    }
>  	}
>        else if (length == NULL_TREE)
>  	{
> @@ -13142,7 +13158,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
>        /* If there is a pointer type anywhere but in the very first
>  	 array-section-subscript, the array section can't be contiguous.  */
>        if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
> -	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
> +	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST
> +	  && ort != C_ORT_ACC)
>  	{
>  	  error_at (OMP_CLAUSE_LOCATION (c),
>  		    "array section is not contiguous in %qs clause",
> @@ -13149,6 +13166,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
>  		    omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
>  	  return error_mark_node;
>  	}
> +      else if (TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
> +	non_contiguous = true;
>      }
>    else
>      {


> @@ -13337,6 +13365,14 @@ handle_omp_array_sections (tree c, enum c_omp_regi
>  		size = size_binop (MULT_EXPR, size, l);
>  	    }
>  	}
> +      if (non_contiguous)
> +	{
> +	  int kind = OMP_CLAUSE_MAP_KIND (c);
> +	  OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_NONCONTIG_ARRAY);
> +	  OMP_CLAUSE_DECL (c) = t;
> +	  OMP_CLAUSE_SIZE (c) = ncarray_dims;
> +	  return false;
> +	}

I'm expecting to see front end test cases (probably
'-fdump-tree-original' scanning?) for a good number of different data
clauses/array variants, whether that flag 'GOMP_MAP_NONCONTIG_ARRAY' has
been set or not.  (That would then also document the logic presented
above, and should thus help me understand that.)


> --- gcc/cp/semantics.c	(revision 274618)
> +++ gcc/cp/semantics.c	(working copy)

Likewise.


Grüße
 Thomas


* Re: [PATCH, OpenACC, 3/3] Non-contiguous array support for OpenACC data clauses (re-submission), libgomp patches
  2019-08-20 12:16                       ` [PATCH, OpenACC, 3/3] Non-contiguous array support for OpenACC data clauses (re-submission), libgomp patches Chung-Lin Tang
@ 2019-10-07 13:58                         ` Thomas Schwinge
  2019-11-05 14:36                         ` [PATCH, OpenACC, v2] Non-contiguous array support for OpenACC data clauses Chung-Lin Tang
  1 sibling, 0 replies; 24+ messages in thread
From: Thomas Schwinge @ 2019-10-07 13:58 UTC (permalink / raw)
  To: cltang; +Cc: gcc-patches, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 1547 bytes --]

Hi Chung-Lin!

On 2019-08-20T19:36:56+0800, Chung-Lin Tang <chunglin_tang@mentor.com> wrote:
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c	(nonexistent)
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c	(working copy)
> @@ -0,0 +1,103 @@
> +/* { dg-do run { target { ! openacc_host_selected } } } */

Curious about that restriction, I removed it, and see that these test
cases then fail (SIGSEGV) for host-fallback execution.  Same in presence
of 'if (false)' clauses, which do get used in real-world OpenACC code
(with proper conditionals, of course).

    Program received signal SIGSEGV, Segmentation fault.
    0x0000000000400fd0 in test1._omp_fn.0 () at source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c:26
    26            a[i][j] = b[i][j];
    (gdb) bt
    #0  0x0000000000400fd0 in test1._omp_fn.0 () at source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c:26
    #1  0x00007ffff7bbfdf9 in GOACC_parallel_keyed (flags_m=<optimized out>, fn=0x400ef1 <test1._omp_fn.0>, mapnum=2, hostaddrs=0x7fffffffc8c0, sizes=0x606290 <.omp_data_sizes.4>, kinds=0x6062a0 <.omp_data_kinds.5>) at [...]/source-gcc/libgomp/oacc-parallel.c:221
    #2  0x0000000000400a1c in test1 () at source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c:22
    #3  0x0000000000400ee0 in main () at source-gcc/libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c:97

What does it take to make that work?


Grüße
 Thomas


* [PATCH, OpenACC, v2] Non-contiguous array support for OpenACC data clauses
@ 2019-11-05 14:36                         ` Chung-Lin Tang
  2019-11-07  0:49                           ` Thomas Schwinge
  0 siblings, 1 reply; 24+ messages in thread
From: Chung-Lin Tang @ 2019-11-05 14:36 UTC (permalink / raw)
  To: gcc-patches, Thomas Schwinge

[-- Attachment #1: Type: text/plain, Size: 6001 bytes --]

Hi Thomas,
after your last round of review, I realized that the bulk of the compiler omp-low work was
simply a case of dumb over-engineering in the wrong direction :P
(although it did painstakingly function correctly)

Instead of making code changes for bias adjustment in the child function code in the omp-low
phase, this should simply be done by the libgomp runtime map preparation (similar to how the
current single-dimension array biases are handled)

So this updated patch (1) discards a large part of the last omp-low.c patch, and
(2) adjusts the libgomp/target.c patch to do the per-dimensional adjustments.

Also, the bit of C/C++ front-end logic you mentioned that was questionable was removed.
After looking closely, it wasn't needed; the relaxing of pointers for OpenACC was enough.
Some aspects of handling arrays inside the multi-dimension type still need
more work, e.g. see the checking in the omp-low.c part. A compiler dg-scan testcase
was also added.

However, the issue of ACC_DEVICE_TYPE=host not working (and hence "!openacc_host_selected"
in the testcases) actually is a bit more sophisticated than I thought:

It doesn't work for the host device because we use the map pointer (i.e.
a hostaddrs[] entry when passed into libgomp) to point to an array descriptor that
carries the whole array information, and rely on code inside gomp_map_vars_* to set
things up and place the final on-device address of the non-contig. array into
devaddrs[], therefore only using a single map entry (something I thought was quite
clever).

However, this broke down on the host and host-fallback devices, simply because there
we do NOT do any gomp_map_vars processing; our current code in GOACC_parallel_keyed
simply skips it and passes the offload function the original hostaddrs[] contents.
Lacking the processing to transform the descriptor pointer into a proper array ref,
things of course segfault.
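
To make the single-map-entry convention concrete, here is a minimal standalone
sketch of what such a descriptor might look like for an 'int **a' section
a[0:n0][0:n1], and how the number of data rows follows from it. The struct
names are hypothetical copies of the ones in the libgomp/target.c part of the
patch; the fixed-size dims[] array, the helper describe_2d_int, and the
reading of 'base' and 'length' as byte quantities are assumptions inferred
from the runtime code, not something the compiler side is shown to guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical standalone copy of the per-dimension descriptor entry.
   Assumption: base and length are byte offsets/sizes, as the runtime's
   pointer arithmetic on them suggests.  */
struct ncarray_dim
{
  size_t base;      /* byte offset of the section start in this dimension */
  size_t length;    /* byte length of the section in this dimension */
  size_t elem_size; /* size of one element of this dimension */
  size_t is_array;  /* nonzero if this dimension is a true array, not a pointer */
};

/* Hypothetical descriptor; the patch uses a flexible array member, a fixed
   size is used here only to keep the sketch easy to initialize.  */
struct ncarray_descr
{
  void *ptr;
  size_t ndims;
  struct ncarray_dim dims[3];
};

/* Number of separately mapped data rows: the product of the element counts
   of every dimension except the last one.  */
static size_t
count_rows (const struct ncarray_descr *descr)
{
  size_t nrows = 1;
  for (size_t d = 0; d + 1 < descr->ndims; d++)
    nrows *= descr->dims[d].length / sizeof (void *);
  return nrows;
}

/* Hypothetical helper: describe a[0:n0][0:n1] for 'int **a'.  */
static struct ncarray_descr
describe_2d_int (int **a, size_t n0, size_t n1)
{
  struct ncarray_descr descr
    = { a, 2, { { 0, n0 * sizeof (int *), sizeof (int *), 0 },
		{ 0, n1 * sizeof (int), sizeof (int), 0 } } };
  return descr;
}
```

The address of such a descriptor, not 'a' itself, is what ends up in
hostaddrs[], which is why the offload function cannot use the entry directly
when no map processing has run.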

So I think we have three options for this (which may have some interactions with say,
the "proper" host-side parallelization we eventually need to implement for OpenACC 2.7)

(1) The simplest solution: implement processing that searches for and reverts such
non-contiguous array map entries in GOACC_parallel_keyed.
(note: I have implemented this in the current attached "v2" patch)

(2) Make the GOACC_parallel_keyed code not take shortcuts for host modes;
i.e. still do the proper gomp_map_vars processing for all cases.

(3) Modify the non-contiguous array map conventions: a possible solution is to use
two maps placed together: one for the array pointer, another for the array descriptor (as
opposed to the current style of using only one map). This needs further, more elaborate
compiler/runtime work.
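
As a rough illustration of option (1), the revert step could look something
like the sketch below. This is not the revert_noncontig_array_map_pointers
function from the attached patch: the name revert_descr_pointers, the
SKETCH_NONCONTIG_FLAG bit, and the cut-down host_descr struct are all
hypothetical and only stand in for the real map-kind test and descriptor type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of the leading descriptor fields; the only thing the
   sketch relies on is that the original array pointer comes first.  */
struct host_descr
{
  void *ptr;
  size_t ndims;
};

/* Hypothetical stand-in for the GOMP_MAP_NONCONTIG_ARRAY_P test; the real
   encoding of the map kind is not reproduced here.  */
#define SKETCH_NONCONTIG_FLAG 0x80u

/* For host-mode execution: replace every descriptor pointer in HOSTADDRS
   with the array pointer stored in the descriptor, so the offload function
   sees an ordinary pointer again.  Other entries are left untouched.  */
static void
revert_descr_pointers (size_t mapnum, void **hostaddrs,
		       const unsigned short *kinds)
{
  for (size_t i = 0; i < mapnum; i++)
    if (kinds[i] & SKETCH_NONCONTIG_FLAG)
      hostaddrs[i] = ((struct host_descr *) hostaddrs[i])->ptr;
}
```

The cost is one extra pass over the map entries on every host-mode launch,
which is presumably the kind of host-mode pessimization referred to here.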

The first two options will pessimize host-mode performance somewhat. For the third I have
some WIP patches, but they're still buggy ATM. Seeking your opinion on what we should do.

Thanks,
Chung-Lin

	gcc/c/
	* c-typeck.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
	parameter, adjust recursive call site, add cases for allowing
	pointer based multi-dimensional arrays for OpenACC.
	(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
	handle non-contiguous case to create dynamic array map.

	gcc/cp/
	* semantics.c (handle_omp_array_sections_1): Add 'bool &non_contiguous'
	parameter, adjust recursive call site, add cases for allowing
	pointer based multi-dimensional arrays for OpenACC.
	(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
	handle non-contiguous case to create dynamic array map.

	gcc/
	* gimplify.c (gimplify_scan_omp_clauses): For non-contiguous array map kinds,
	make sure biases in each dimension are put into firstprivate variables.

	* omp-low.c (append_field_to_record_type): New function.
	(create_noncontig_array_descr_type): Likewise.
	(create_noncontig_array_descr_init_code): Likewise.
	(scan_sharing_clauses): For non-contiguous array map kinds, check for
	supported dimension structure, and install non-contiguous array variable into
	current omp_context.
	(reorder_noncontig_array_clauses): New function.
	(scan_omp_target): Call reorder_noncontig_array_clauses to place
	non-contiguous array map clauses at beginning of clause sequence.
	(lower_omp_target): Add handling for non-contiguous array map kinds.

	* tree-pretty-print.c (dump_omp_clauses): Add cases for printing
	GOMP_MAP_NONCONTIG_ARRAY map kinds.

	include/
	* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define.
	(enum gomp_map_kind): Add GOMP_MAP_NONCONTIG_ARRAY,
	GOMP_MAP_NONCONTIG_ARRAY_TO, GOMP_MAP_NONCONTIG_ARRAY_FROM,
	GOMP_MAP_NONCONTIG_ARRAY_TOFROM, GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO,
	GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM, GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM,
	GOMP_MAP_NONCONTIG_ARRAY_ALLOC, GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC,
	GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT.
	(GOMP_MAP_NONCONTIG_ARRAY_P): Define.

	gcc/testsuite/
	* c-c++-common/goacc/noncontig_array-1.c: New test.

	libgomp/
	* target.c (struct gomp_ncarray_dim): New struct declaration.
	(struct gomp_ncarray_descr_type): Likewise.
	(struct ncarray_info): Likewise.
	(gomp_noncontig_array_count_rows): New function.
	(gomp_noncontig_array_compute_info): Likewise.
	(gomp_noncontig_array_fill_rows_1): Likewise.
	(gomp_noncontig_array_fill_rows): Likewise.
	(gomp_noncontig_array_create_ptrblock): Likewise.
	(gomp_map_vars_internal): Add code to handle non-contiguous array map
	kinds.
	* oacc-parallel.c (revert_noncontig_array_map_pointers): New function.
	(GOACC_parallel_keyed): Call revert_noncontig_array_map_pointers
	when executing for host-modes.

	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h: Support
	header for new tests.
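
For orientation, the descriptor that the compiler builds on the stack and the
row count that libgomp derives from it can be sketched roughly as below. This
is a standalone illustration and not code from the patch: the names
ncarray_dim, ncarray_descr, count_rows and make_example_descr are hypothetical,
and the field values assume the base and length fields hold byte offsets (low
bound and length scaled by the element size), as the omp-low.c init code in the
attachment suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical mirror of the per-dimension record in the descriptor.  */
struct ncarray_dim
{
  size_t base;       /* Start offset in bytes: low bound * elem_size.  */
  size_t length;     /* Section length in bytes: length * elem_size.  */
  size_t elem_size;  /* Size in bytes of one element of this dimension.  */
  size_t is_array;   /* Nonzero if this dimension has array type.  */
};

/* Hypothetical mirror of the whole descriptor.  */
struct ncarray_descr
{
  void *ptr;                  /* Host pointer or array address.  */
  size_t ndims;               /* Number of dimensions.  */
  struct ncarray_dim dims[];  /* One entry per dimension.  */
};

/* Number of data rows: the product of the element counts of every
   dimension except the innermost one.  */
static size_t
count_rows (const struct ncarray_descr *descr)
{
  assert (descr->ndims >= 1);
  size_t nrows = 1;
  for (size_t d = 0; d + 1 < descr->ndims; d++)
    nrows *= descr->dims[d].length / descr->dims[d].elem_size;
  return nrows;
}

/* Build the descriptor for the section a[2:4][1:7] of 'int **a'.
   Returns NULL if allocation fails; the caller frees the result.  */
static struct ncarray_descr *
make_example_descr (int **a)
{
  struct ncarray_descr *descr
    = malloc (sizeof (*descr) + 2 * sizeof (struct ncarray_dim));
  if (descr == NULL)
    return NULL;
  descr->ptr = a;
  descr->ndims = 2;
  /* Outer dimension: four row pointers starting at index 2.  */
  descr->dims[0] = (struct ncarray_dim) { 2 * sizeof (int *),
					  4 * sizeof (int *),
					  sizeof (int *), 0 };
  /* Inner dimension: seven ints starting at index 1 in each row.  */
  descr->dims[1] = (struct ncarray_dim) { 1 * sizeof (int),
					  7 * sizeof (int),
					  sizeof (int), 0 };
  return descr;
}
```

With this layout, a[2:4][1:7] yields four data rows of 7 * sizeof (int) bytes
each, which is the number of extra row entries the runtime would need to
reserve in the target map list.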

[-- Attachment #2: openacc-noncontig-arrays-v2.patch --]
[-- Type: text/plain, Size: 48094 bytes --]

Index: gcc/c/c-typeck.c
===================================================================
--- gcc/c/c-typeck.c	(revision 277827)
+++ gcc/c/c-typeck.c	(working copy)
@@ -12868,7 +12868,7 @@ c_finish_omp_cancellation_point (location_t loc, t
 static tree
 handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 			     bool &maybe_zero_len, unsigned int &first_non_one,
-			     enum c_omp_region_type ort)
+			     bool &non_contiguous, enum c_omp_region_type ort)
 {
   tree ret, low_bound, length, type;
   if (TREE_CODE (t) != TREE_LIST)
@@ -12953,7 +12953,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
     }
 
   ret = handle_omp_array_sections_1 (c, TREE_CHAIN (t), types,
-				     maybe_zero_len, first_non_one, ort);
+				     maybe_zero_len, first_non_one,
+				     non_contiguous, ort);
   if (ret == error_mark_node || ret == NULL_TREE)
     return ret;
 
@@ -13160,14 +13161,21 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
 	  return error_mark_node;
 	}
       /* If there is a pointer type anywhere but in the very first
-	 array-section-subscript, the array section can't be contiguous.  */
+	 array-section-subscript, the array section can't be contiguous.
+	 Note that OpenACC does accept these kinds of non-contiguous pointer
+	 based arrays.  */
       if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
 	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
 	{
-	  error_at (OMP_CLAUSE_LOCATION (c),
-		    "array section is not contiguous in %qs clause",
-		    omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
-	  return error_mark_node;
+	  if (ort == C_ORT_ACC)
+	    non_contiguous = true;
+	  else
+	    {
+	      error_at (OMP_CLAUSE_LOCATION (c),
+			"array section is not contiguous in %qs clause",
+			omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
+	      return error_mark_node;
+	    }
 	}
     }
   else
@@ -13196,6 +13204,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
 {
   bool maybe_zero_len = false;
   unsigned int first_non_one = 0;
+  bool non_contiguous = false;
   auto_vec<tree, 10> types;
   tree *tp = &OMP_CLAUSE_DECL (c);
   if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND
@@ -13205,7 +13214,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
     tp = &TREE_VALUE (*tp);
   tree first = handle_omp_array_sections_1 (c, *tp, types,
 					    maybe_zero_len, first_non_one,
-					    ort);
+					    non_contiguous, ort);
   if (first == error_mark_node)
     return true;
   if (first == NULL_TREE)
@@ -13238,6 +13247,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
       unsigned int num = types.length (), i;
       tree t, side_effects = NULL_TREE, size = NULL_TREE;
       tree condition = NULL_TREE;
+      tree ncarray_dims = NULL_TREE;
 
       if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
 	maybe_zero_len = true;
@@ -13261,6 +13271,13 @@ handle_omp_array_sections (tree c, enum c_omp_regi
 	    length = fold_convert (sizetype, length);
 	  if (low_bound == NULL_TREE)
 	    low_bound = integer_zero_node;
+
+	  if (non_contiguous)
+	    {
+	      ncarray_dims = tree_cons (low_bound, length, ncarray_dims);
+	      continue;
+	    }
+
 	  if (!maybe_zero_len && i > first_non_one)
 	    {
 	      if (integer_nonzerop (low_bound))
@@ -13357,6 +13374,14 @@ handle_omp_array_sections (tree c, enum c_omp_regi
 		size = size_binop (MULT_EXPR, size, l);
 	    }
 	}
+      if (non_contiguous)
+	{
+	  int kind = OMP_CLAUSE_MAP_KIND (c);
+	  OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_NONCONTIG_ARRAY);
+	  OMP_CLAUSE_DECL (c) = t;
+	  OMP_CLAUSE_SIZE (c) = ncarray_dims;
+	  return false;
+	}
       if (side_effects)
 	size = build2 (COMPOUND_EXPR, sizetype, side_effects, size);
       if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
Index: gcc/cp/semantics.c
===================================================================
--- gcc/cp/semantics.c	(revision 277827)
+++ gcc/cp/semantics.c	(working copy)
@@ -4732,7 +4732,7 @@ omp_privatize_field (tree t, bool shared)
 static tree
 handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 			     bool &maybe_zero_len, unsigned int &first_non_one,
-			     enum c_omp_region_type ort)
+			     bool &non_contiguous, enum c_omp_region_type ort)
 {
   tree ret, low_bound, length, type;
   if (TREE_CODE (t) != TREE_LIST)
@@ -4817,7 +4817,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
       && TREE_CODE (TREE_CHAIN (t)) == FIELD_DECL)
     TREE_CHAIN (t) = omp_privatize_field (TREE_CHAIN (t), false);
   ret = handle_omp_array_sections_1 (c, TREE_CHAIN (t), types,
-				     maybe_zero_len, first_non_one, ort);
+				     maybe_zero_len, first_non_one,
+				     non_contiguous, ort);
   if (ret == error_mark_node || ret == NULL_TREE)
     return ret;
 
@@ -5036,14 +5037,21 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
 	  return error_mark_node;
 	}
       /* If there is a pointer type anywhere but in the very first
-	 array-section-subscript, the array section can't be contiguous.  */
+	 array-section-subscript, the array section can't be contiguous.
+	 Note that OpenACC does accept these kinds of non-contiguous pointer
+	 based arrays.  */
       if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
 	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
 	{
-	  error_at (OMP_CLAUSE_LOCATION (c),
-		    "array section is not contiguous in %qs clause",
-		    omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
-	  return error_mark_node;
+	  if (ort == C_ORT_ACC)
+	    non_contiguous = true;
+	  else
+	    {
+	      error_at (OMP_CLAUSE_LOCATION (c),
+			"array section is not contiguous in %qs clause",
+			omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
+	      return error_mark_node;
+	    }
 	}
     }
   else
@@ -5083,6 +5091,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
 {
   bool maybe_zero_len = false;
   unsigned int first_non_one = 0;
+  bool non_contiguous = false;
   auto_vec<tree, 10> types;
   tree *tp = &OMP_CLAUSE_DECL (c);
   if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND
@@ -5092,7 +5101,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
     tp = &TREE_VALUE (*tp);
   tree first = handle_omp_array_sections_1 (c, *tp, types,
 					    maybe_zero_len, first_non_one,
-					    ort);
+					    non_contiguous, ort);
   if (first == error_mark_node)
     return true;
   if (first == NULL_TREE)
@@ -5126,6 +5135,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
       unsigned int num = types.length (), i;
       tree t, side_effects = NULL_TREE, size = NULL_TREE;
       tree condition = NULL_TREE;
+      tree ncarray_dims = NULL_TREE;
 
       if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
 	maybe_zero_len = true;
@@ -5151,6 +5161,13 @@ handle_omp_array_sections (tree c, enum c_omp_regi
 	    length = fold_convert (sizetype, length);
 	  if (low_bound == NULL_TREE)
 	    low_bound = integer_zero_node;
+
+	  if (non_contiguous)
+	    {
+	      ncarray_dims = tree_cons (low_bound, length, ncarray_dims);
+	      continue;
+	    }
+
 	  if (!maybe_zero_len && i > first_non_one)
 	    {
 	      if (integer_nonzerop (low_bound))
@@ -5242,6 +5259,14 @@ handle_omp_array_sections (tree c, enum c_omp_regi
 	}
       if (!processing_template_decl)
 	{
+	  if (non_contiguous)
+	    {
+	      int kind = OMP_CLAUSE_MAP_KIND (c);
+	      OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_NONCONTIG_ARRAY);
+	      OMP_CLAUSE_DECL (c) = t;
+	      OMP_CLAUSE_SIZE (c) = ncarray_dims;
+	      return false;
+	    }
 	  if (side_effects)
 	    size = build2 (COMPOUND_EXPR, sizetype, side_effects, size);
 	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
Index: gcc/gimplify.c
===================================================================
--- gcc/gimplify.c	(revision 277827)
+++ gcc/gimplify.c	(working copy)
@@ -8622,9 +8622,17 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_se
 	  if (OMP_CLAUSE_SIZE (c) == NULL_TREE)
 	    OMP_CLAUSE_SIZE (c) = DECL_P (decl) ? DECL_SIZE_UNIT (decl)
 				  : TYPE_SIZE_UNIT (TREE_TYPE (decl));
-	  if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p,
-			     NULL, is_gimple_val, fb_rvalue) == GS_ERROR)
+	  if (OMP_CLAUSE_SIZE (c)
+	      && TREE_CODE (OMP_CLAUSE_SIZE (c)) == TREE_LIST
+	      && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
 	    {
+	      /* For non-contiguous array maps, OMP_CLAUSE_SIZE is a TREE_LIST
+		 of the individual array dimensions, which gimplify_expr doesn't
+		 handle, so skip the call to gimplify_expr here.  */
+	    }
+	  else if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p,
+				  NULL, is_gimple_val, fb_rvalue) == GS_ERROR)
+	    {
 	      remove = true;
 	      break;
 	    }
Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c	(revision 277827)
+++ gcc/omp-low.c	(working copy)
@@ -894,6 +894,137 @@ omp_copy_decl (tree var, copy_body_data *cb)
   return error_mark_node;
 }
 
+/* Helper function for create_noncontig_array_descr_type(), to append a new field
+   to a record type.  */
+
+static void
+append_field_to_record_type (tree record_type, tree fld_ident, tree fld_type)
+{
+  tree *p, fld = build_decl (UNKNOWN_LOCATION, FIELD_DECL, fld_ident, fld_type);
+  DECL_CONTEXT (fld) = record_type;
+
+  for (p = &TYPE_FIELDS (record_type); *p; p = &DECL_CHAIN (*p))
+    ;
+  *p = fld;
+}
+
+/* Create type for non-contiguous array descriptor. Returns created type, and
+   returns the number of dimensions in *DIM_NUM.  */
+
+static tree
+create_noncontig_array_descr_type (tree decl, tree dims, int *dim_num)
+{
+  int n = 0;
+  tree array_descr_type, name, x;
+  gcc_assert (TREE_CODE (dims) == TREE_LIST);
+
+  array_descr_type = lang_hooks.types.make_type (RECORD_TYPE);
+  name = create_tmp_var_name (".omp_noncontig_array_descr_type");
+  name = build_decl (UNKNOWN_LOCATION, TYPE_DECL, name, array_descr_type);
+  DECL_ARTIFICIAL (name) = 1;
+  DECL_NAMELESS (name) = 1;
+  TYPE_NAME (array_descr_type) = name;
+  TYPE_ARTIFICIAL (array_descr_type) = 1;
+
+  /* Main starting pointer/array.  */
+  tree main_var_type = TREE_TYPE (decl);
+  if (TREE_CODE (main_var_type) == REFERENCE_TYPE)
+    main_var_type = TREE_TYPE (main_var_type);
+  append_field_to_record_type (array_descr_type, DECL_NAME (decl),
+			       (TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE
+				? main_var_type
+				: build_pointer_type (main_var_type)));
+  /* Number of dimensions.  */
+  append_field_to_record_type (array_descr_type, get_identifier ("__dim_num"),
+			       sizetype);
+
+  for (x = dims; x; x = TREE_CHAIN (x), n++)
+    {
+      char *fldname;
+      /* One for the start index.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_base", n);
+      append_field_to_record_type (array_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for the length.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_length", n);
+      append_field_to_record_type (array_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for the element size.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_elem_size", n);
+      append_field_to_record_type (array_descr_type, get_identifier (fldname),
+				   sizetype);
+      /* One for is_array flag.  */
+      ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_is_array", n);
+      append_field_to_record_type (array_descr_type, get_identifier (fldname),
+				   sizetype);
+    }
+
+  layout_type (array_descr_type);
+  *dim_num = n;
+  return array_descr_type;
+}
+
+/* Generate code sequence for initializing non-contiguous array descriptor.  */
+
+static void
+create_noncontig_array_descr_init_code (tree array_descr, tree array_var,
+					tree dimensions, int dim_num,
+					gimple_seq *ilist)
+{
+  tree fld, fldref;
+  tree array_descr_type = TREE_TYPE (array_descr);
+  tree dim_type = TREE_TYPE (array_var);
+
+  fld = TYPE_FIELDS (array_descr_type);
+  fldref = omp_build_component_ref (array_descr, fld);
+  gimplify_assign (fldref, (TREE_CODE (dim_type) == ARRAY_TYPE
+			    ? build_fold_addr_expr (array_var) : array_var),
+		   ilist);
+
+  if (TREE_CODE (dim_type) == REFERENCE_TYPE)
+    dim_type = TREE_TYPE (dim_type);
+
+  fld = TREE_CHAIN (fld);
+  fldref = omp_build_component_ref (array_descr, fld);
+  gimplify_assign (fldref, build_int_cst (sizetype, dim_num), ilist);
+
+  while (dimensions)
+    {
+      tree dim_base = fold_convert (sizetype, TREE_PURPOSE (dimensions));
+      tree dim_length = fold_convert (sizetype, TREE_VALUE (dimensions));
+      tree dim_elem_size = TYPE_SIZE_UNIT (TREE_TYPE (dim_type));
+      tree dim_is_array = (TREE_CODE (dim_type) == ARRAY_TYPE
+			   ? integer_one_node : integer_zero_node);
+      /* Set base.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (array_descr, fld);
+      dim_base = fold_build2 (MULT_EXPR, sizetype, dim_base, dim_elem_size);
+      gimplify_assign (fldref, dim_base, ilist);
+
+      /* Set length.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (array_descr, fld);
+      dim_length = fold_build2 (MULT_EXPR, sizetype, dim_length, dim_elem_size);
+      gimplify_assign (fldref, dim_length, ilist);
+
+      /* Set elem_size.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (array_descr, fld);
+      dim_elem_size = fold_convert (sizetype, dim_elem_size);
+      gimplify_assign (fldref, dim_elem_size, ilist);
+
+      /* Set is_array flag.  */
+      fld = TREE_CHAIN (fld);
+      fldref = omp_build_component_ref (array_descr, fld);
+      dim_is_array = fold_convert (sizetype, dim_is_array);
+      gimplify_assign (fldref, dim_is_array, ilist);
+
+      dimensions = TREE_CHAIN (dimensions);
+      dim_type = TREE_TYPE (dim_type);
+    }
+  gcc_assert (TREE_CHAIN (fld) == NULL_TREE);
+}
+
 /* Create a new context, with OUTER_CTX being the surrounding context.  */
 
 static omp_context *
@@ -1367,6 +1498,38 @@ scan_sharing_clauses (tree clauses, omp_context *c
 	      install_var_local (decl, ctx);
 	      break;
 	    }
+
+	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	      && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+	    {
+	      tree array_decl = OMP_CLAUSE_DECL (c);
+	      tree array_type = TREE_TYPE (array_decl);
+	      bool by_ref = (TREE_CODE (array_type) == ARRAY_TYPE
+			     ? true : false);
+
+	      /* Checking code to ensure we only have arrays at top dimension.
+		 This limitation might be lifted in the future.  */
+	      if (TREE_CODE (array_type) == REFERENCE_TYPE)
+		array_type = TREE_TYPE (array_type);
+	      tree t = array_type, prev_t = NULL_TREE;
+	      while (t)
+		{
+		  if (TREE_CODE (t) == ARRAY_TYPE && prev_t)
+		    {
+		      error_at (gimple_location (ctx->stmt), "array types are"
+				" only allowed at outermost dimension of"
+				" non-contiguous array");
+		      break;
+		    }
+		  prev_t = t;
+		  t = TREE_TYPE (t);
+		}
+
+	      install_var_field (array_decl, by_ref, 3, ctx);
+	      install_var_local (array_decl, ctx);
+	      break;
+	    }
+
 	  if (DECL_P (decl))
 	    {
 	      if (DECL_SIZE (decl)
@@ -2597,6 +2760,50 @@ scan_omp_single (gomp_single *stmt, omp_context *o
     layout_type (ctx->record_type);
 }
 
+/* Reorder clauses so that non-contiguous array map clauses are placed at the very
+   front of the chain.  */
+
+static void
+reorder_noncontig_array_clauses (tree *clauses_ptr)
+{
+  tree c, clauses = *clauses_ptr;
+  tree prev_clause = NULL_TREE, next_clause;
+  tree array_clauses = NULL_TREE, array_clauses_tail = NULL_TREE;
+
+  for (c = clauses; c; c = next_clause)
+    {
+      next_clause = OMP_CLAUSE_CHAIN (c);
+
+      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	  && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+	{
+	  /* Unchain c from clauses.  */
+	  if (c == clauses)
+	    clauses = next_clause;
+
+	  /* Link on to array_clauses.  */
+	  if (array_clauses_tail)
+	    OMP_CLAUSE_CHAIN (array_clauses_tail) = c;
+	  else
+	    array_clauses = c;
+	  array_clauses_tail = c;
+
+	  if (prev_clause)
+	    OMP_CLAUSE_CHAIN (prev_clause) = next_clause;
+	  continue;
+	}
+
+      prev_clause = c;
+    }
+
+  /* Place non-contiguous array clauses at the start of the clause list.  */
+  if (array_clauses)
+    {
+      OMP_CLAUSE_CHAIN (array_clauses_tail) = clauses;
+      *clauses_ptr = array_clauses;
+    }
+}
+
 /* Scan a GIMPLE_OMP_TARGET.  */
 
 static void
@@ -2605,7 +2812,6 @@ scan_omp_target (gomp_target *stmt, omp_context *o
   omp_context *ctx;
   tree name;
   bool offloaded = is_gimple_omp_offloaded (stmt);
-  tree clauses = gimple_omp_target_clauses (stmt);
 
   ctx = new_omp_context (stmt, outer_ctx);
   ctx->field_map = splay_tree_new (splay_tree_compare_pointers, 0, 0);
@@ -2624,6 +2830,14 @@ scan_omp_target (gomp_target *stmt, omp_context *o
       gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
     }
 
+  /* If this is an OpenACC construct, put non-contiguous array clauses (if
+     any) at the front of the clause chain.  The runtime can then test the
+     first clause to see if the additional map processing is required.  */
+  if (is_gimple_omp_oacc (stmt))
+    reorder_noncontig_array_clauses (gimple_omp_target_clauses_ptr (stmt));
+
+  tree clauses = gimple_omp_target_clauses (stmt);
+
   scan_sharing_clauses (clauses, ctx);
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
 
@@ -11335,6 +11549,15 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
 	  case GOMP_MAP_FORCE_PRESENT:
 	  case GOMP_MAP_FORCE_DEVICEPTR:
 	  case GOMP_MAP_DEVICE_RESIDENT:
+	  case GOMP_MAP_NONCONTIG_ARRAY_TO:
+	  case GOMP_MAP_NONCONTIG_ARRAY_FROM:
+	  case GOMP_MAP_NONCONTIG_ARRAY_TOFROM:
+	  case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO:
+	  case GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM:
+	  case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM:
+	  case GOMP_MAP_NONCONTIG_ARRAY_ALLOC:
+	  case GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC:
+	  case GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT:
 	  case GOMP_MAP_LINK:
 	    gcc_assert (is_gimple_omp_oacc (stmt));
 	    break;
@@ -11397,7 +11620,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
 	if (offloaded && !(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 			   && OMP_CLAUSE_MAP_IN_REDUCTION (c)))
 	  {
-	    x = build_receiver_ref (var, true, ctx);
+	    tree var_type = TREE_TYPE (var);
+	    bool rcv_by_ref =
+	      (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+	       && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))
+	       && TREE_CODE (var_type) != ARRAY_TYPE
+	       ? false : true);
+
+	    x = build_receiver_ref (var, rcv_by_ref, ctx);
 	    tree new_var = lookup_decl (var, ctx);
 
 	    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
@@ -11647,6 +11877,24 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
 		    avar = build_fold_addr_expr (avar);
 		    gimplify_assign (x, avar, &ilist);
 		  }
+		else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+			 && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+		  {
+		    int dim_num;
+		    tree dimensions = OMP_CLAUSE_SIZE (c);
+
+		    tree array_descr_type =
+		      create_noncontig_array_descr_type (OMP_CLAUSE_DECL (c),
+							 dimensions, &dim_num);
+		    tree array_descr =
+		      create_tmp_var_raw (array_descr_type, ".omp_noncontig_array_descr");
+		    gimple_add_tmp_var (array_descr);
+
+		    create_noncontig_array_descr_init_code
+		      (array_descr, ovar, dimensions, dim_num, &ilist);
+
+		    gimplify_assign (x, build_fold_addr_expr (array_descr), &ilist);
+		  }
 		else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE)
 		  {
 		    gcc_assert (is_gimple_omp_oacc (ctx->stmt));
@@ -11718,6 +11966,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp
 		  s = TREE_TYPE (s);
 		s = TYPE_SIZE_UNIT (s);
 	      }
+	    else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+		     && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
+	      s = NULL_TREE;
 	    else
 	      s = OMP_CLAUSE_SIZE (c);
 	    if (s == NULL_TREE)
Index: gcc/testsuite/c-c++-common/goacc/noncontig_array-1.c
===================================================================
--- gcc/testsuite/c-c++-common/goacc/noncontig_array-1.c	(nonexistent)
+++ gcc/testsuite/c-c++-common/goacc/noncontig_array-1.c	(working copy)
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+
+void foo (void)
+{
+  int array_of_array[10][10];
+  int **ptr_to_ptr;
+  int *array_of_ptr[10];
+  int (*ptr_to_array)[10];
+ 
+  #pragma acc parallel copy (array_of_array[2:4][0:10])
+    array_of_array[5][5] = 1;
+
+  #pragma acc parallel copy (ptr_to_ptr[2:4][1:7])
+    ptr_to_ptr[5][5] = 1;
+
+  #pragma acc parallel copy (array_of_ptr[2:4][1:7])
+    array_of_ptr[5][5] = 1;
+
+  #pragma acc parallel copy (ptr_to_array[2:4][1:7]) /* { dg-error "array section is not contiguous in 'map' clause" } */
+    ptr_to_array[5][5] = 1;
+}
+/* { dg-final { scan-tree-dump-times {#pragma omp target oacc_parallel map\(tofrom:array_of_array} 1 gimple } } */
+/* { dg-final { scan-tree-dump-times {#pragma omp target oacc_parallel map\(tofrom,noncontig_array:ptr_to_ptr \[dimensions: 2 4, 1 7\]} 1 gimple } } */
+/* { dg-final { scan-tree-dump-times {#pragma omp target oacc_parallel map\(tofrom,noncontig_array:array_of_ptr \[dimensions: 2 4, 1 7\]} 1 gimple } } */
+/* { dg-final { scan-tree-dump-times {#pragma omp target oacc_parallel map\(tofrom,noncontig_array:ptr_to_array \[dimensions: 2 4, 1 7\]} 1 gimple { xfail *-*-* } } } */
Index: gcc/tree-pretty-print.c
===================================================================
--- gcc/tree-pretty-print.c	(revision 277827)
+++ gcc/tree-pretty-print.c	(working copy)
@@ -849,6 +849,33 @@ dump_omp_clause (pretty_printer *pp, tree clause,
 	case GOMP_MAP_LINK:
 	  pp_string (pp, "link");
 	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_TO:
+	  pp_string (pp, "to,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_FROM:
+	  pp_string (pp, "from,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_TOFROM:
+	  pp_string (pp, "tofrom,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO:
+	  pp_string (pp, "force_to,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM:
+	  pp_string (pp, "force_from,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM:
+	  pp_string (pp, "force_tofrom,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_ALLOC:
+	  pp_string (pp, "alloc,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC:
+	  pp_string (pp, "force_alloc,noncontig_array");
+	  break;
+	case GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT:
+	  pp_string (pp, "force_present,noncontig_array");
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
@@ -859,8 +886,15 @@ dump_omp_clause (pretty_printer *pp, tree clause,
       if (OMP_CLAUSE_SIZE (clause))
 	{
 	  switch (OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_MAP
-		  ? OMP_CLAUSE_MAP_KIND (clause) : GOMP_MAP_TO)
+		  ? (GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (clause))
+		     ? GOMP_MAP_NONCONTIG_ARRAY
+		     : OMP_CLAUSE_MAP_KIND (clause))
+		  : GOMP_MAP_TO)
 	    {
+	    case GOMP_MAP_NONCONTIG_ARRAY:
+	      gcc_assert (TREE_CODE (OMP_CLAUSE_SIZE (clause)) == TREE_LIST);
+	      pp_string (pp, " [dimensions: ");
+	      break;
 	    case GOMP_MAP_POINTER:
 	    case GOMP_MAP_FIRSTPRIVATE_POINTER:
 	    case GOMP_MAP_FIRSTPRIVATE_REFERENCE:
Index: include/gomp-constants.h
===================================================================
--- include/gomp-constants.h	(revision 277827)
+++ include/gomp-constants.h	(working copy)
@@ -40,6 +40,7 @@
 #define GOMP_MAP_FLAG_SPECIAL_0		(1 << 2)
 #define GOMP_MAP_FLAG_SPECIAL_1		(1 << 3)
 #define GOMP_MAP_FLAG_SPECIAL_2		(1 << 4)
+#define GOMP_MAP_FLAG_SPECIAL_3		(1 << 5)
 #define GOMP_MAP_FLAG_SPECIAL		(GOMP_MAP_FLAG_SPECIAL_1 \
 					 | GOMP_MAP_FLAG_SPECIAL_0)
 /* Flag to force a specific behavior (or else, trigger a run-time error).  */
@@ -127,6 +128,26 @@ enum gomp_map_kind
     /* Decrement usage count and deallocate if zero.  */
     GOMP_MAP_RELEASE =			(GOMP_MAP_FLAG_SPECIAL_2
 					 | GOMP_MAP_DELETE),
+    /* Mapping kinds for non-contiguous arrays.  */
+    GOMP_MAP_NONCONTIG_ARRAY =		(GOMP_MAP_FLAG_SPECIAL_3),
+    GOMP_MAP_NONCONTIG_ARRAY_TO =	(GOMP_MAP_NONCONTIG_ARRAY
+					 | GOMP_MAP_TO),
+    GOMP_MAP_NONCONTIG_ARRAY_FROM =	(GOMP_MAP_NONCONTIG_ARRAY
+					 | GOMP_MAP_FROM),
+    GOMP_MAP_NONCONTIG_ARRAY_TOFROM =	(GOMP_MAP_NONCONTIG_ARRAY
+					 | GOMP_MAP_TOFROM),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO =	(GOMP_MAP_NONCONTIG_ARRAY_TO
+					 | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM =	(GOMP_MAP_NONCONTIG_ARRAY_FROM
+						 | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM =	(GOMP_MAP_NONCONTIG_ARRAY_TOFROM
+						 | GOMP_MAP_FLAG_FORCE),
+    GOMP_MAP_NONCONTIG_ARRAY_ALLOC =		(GOMP_MAP_NONCONTIG_ARRAY
+						 | GOMP_MAP_ALLOC),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC =	(GOMP_MAP_NONCONTIG_ARRAY
+						 | GOMP_MAP_FORCE_ALLOC),
+    GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT =	(GOMP_MAP_NONCONTIG_ARRAY
+						 | GOMP_MAP_FORCE_PRESENT),
 
     /* Internal to GCC, not used in libgomp.  */
     /* Do not map, but pointer assign a pointer instead.  */
@@ -155,6 +176,8 @@ enum gomp_map_kind
 #define GOMP_MAP_ALWAYS_P(X) \
   (GOMP_MAP_ALWAYS_TO_P (X) || ((X) == GOMP_MAP_ALWAYS_FROM))
 
+#define GOMP_MAP_NONCONTIG_ARRAY_P(X) \
+  ((X) & GOMP_MAP_NONCONTIG_ARRAY)
 
 /* Asynchronous behavior.  Keep in sync with
    libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_async_t.  */
Index: libgomp/oacc-parallel.c
===================================================================
--- libgomp/oacc-parallel.c	(revision 277827)
+++ libgomp/oacc-parallel.c	(working copy)
@@ -111,6 +111,21 @@ handle_ftn_pointers (size_t mapnum, void **hostadd
     }
 }
 
+static inline void
+revert_noncontig_array_map_pointers (size_t mapnum, void **hostaddrs,
+				     unsigned short *kinds)
+{
+  for (size_t i = 0; i < mapnum; i++)
+    {
+      if (GOMP_MAP_NONCONTIG_ARRAY_P (kinds[i] & 0xff))
+	hostaddrs[i] = *((void **)hostaddrs[i]);
+      else
+	/* We assume all non-contiguous array map entries are placed at the
+	   start; first other map kind means we can exit.  */
+	break;
+    }
+}
+
 static void goacc_wait (int async, int num_waits, va_list *ap);
 
 
@@ -212,6 +227,7 @@ GOACC_parallel_keyed (int flags_m, void (*fn) (voi
       prof_info.device_type = acc_device_host;
       api_info.device_type = prof_info.device_type;
       goacc_save_and_set_bind (acc_device_host);
+      revert_noncontig_array_map_pointers (mapnum, hostaddrs, kinds);
       fn (hostaddrs);
       goacc_restore_bind ();
       goto out_prof;
@@ -218,6 +234,7 @@ GOACC_parallel_keyed (int flags_m, void (*fn) (voi
     }
   else if (acc_device_type (acc_dev->type) == acc_device_host)
     {
+      revert_noncontig_array_map_pointers (mapnum, hostaddrs, kinds);
       fn (hostaddrs);
       goto out_prof;
     }
Index: libgomp/target.c
===================================================================
--- libgomp/target.c	(revision 277827)
+++ libgomp/target.c	(working copy)
@@ -520,6 +520,152 @@ gomp_map_val (struct target_mem_desc *tgt, void **
     }
 }
 
+/* Definitions for data structures describing non-contiguous arrays
+   (Note: interfaces with compiler)
+
+   The compiler generates a descriptor for each such array, places the
+   descriptor on stack, and passes the address of the descriptor to the libgomp
+   runtime as a normal map argument. The runtime then processes the array
+   data structure setup, and replaces the argument with the new actual
+   array address for the child function.
+
+   Care must be taken that the field and layout assumptions the compiler
+   makes about struct gomp_ncarray_dim and struct gomp_ncarray_descr_type
+   are consistent with the declarations below.  */
+
+struct gomp_ncarray_dim {
+  size_t base;
+  size_t length;
+  size_t elem_size;
+  size_t is_array;
+};
+
+struct gomp_ncarray_descr_type {
+  void *ptr;
+  size_t ndims;
+  struct gomp_ncarray_dim dims[];
+};
+
+/* Internal non-contiguous array info struct, used only inside the runtime.  */
+
+struct ncarray_info
+{
+  struct gomp_ncarray_descr_type *descr;
+  size_t map_index;
+  size_t ptrblock_size;
+  size_t data_row_num;
+  size_t data_row_size;
+};
+
+static size_t
+gomp_noncontig_array_count_rows (struct gomp_ncarray_descr_type *descr)
+{
+  size_t nrows = 1;
+  for (size_t d = 0; d < descr->ndims - 1; d++)
+    nrows *= descr->dims[d].length / sizeof (void *);
+  return nrows;
+}
+
+static void
+gomp_noncontig_array_compute_info (struct ncarray_info *nca)
+{
+  size_t d, n = 1;
+  struct gomp_ncarray_descr_type *descr = nca->descr;
+
+  nca->ptrblock_size = 0;
+  for (d = 0; d < descr->ndims - 1; d++)
+    {
+      size_t dim_count = descr->dims[d].length / descr->dims[d].elem_size;
+      size_t dim_ptrblock_size = (descr->dims[d + 1].is_array
+				  ? 0 : descr->dims[d].length * n);
+      nca->ptrblock_size += dim_ptrblock_size;
+      n *= dim_count;
+    }
+  nca->data_row_num = n;
+  nca->data_row_size = descr->dims[d].length;
+}
+
+static void
+gomp_noncontig_array_fill_rows_1 (struct gomp_ncarray_descr_type *descr, void *nca,
+				  size_t d, void ***row_ptr, size_t *count)
+{
+  if (d < descr->ndims - 1)
+    {
+      size_t elsize = descr->dims[d].elem_size;
+      size_t n = descr->dims[d].length / elsize;
+      void *p = nca + descr->dims[d].base;
+      for (size_t i = 0; i < n; i++)
+	{
+	  void *ptr = p + i * elsize;
+	  /* Deref if next dimension is not array.  */
+	  if (!descr->dims[d + 1].is_array)
+	    ptr = *((void **) ptr);
+	  gomp_noncontig_array_fill_rows_1 (descr, ptr, d + 1, row_ptr, count);
+	}
+    }
+  else
+    {
+      **row_ptr = nca + descr->dims[d].base;
+      *row_ptr += 1;
+      *count += 1;
+    }
+}
+
+static size_t
+gomp_noncontig_array_fill_rows (struct gomp_ncarray_descr_type *descr, void *rows[])
+{
+  size_t count = 0;
+  void **p = rows;
+  gomp_noncontig_array_fill_rows_1 (descr, descr->ptr, 0, &p, &count);
+  return count;
+}
+
+static void *
+gomp_noncontig_array_create_ptrblock (struct ncarray_info *nca,
+				      void *tgt_addr, void *tgt_data_rows[])
+{
+  struct gomp_ncarray_descr_type *descr = nca->descr;
+  void *ptrblock = gomp_malloc (nca->ptrblock_size);
+  void **curr_dim_ptrblock = (void **) ptrblock;
+  size_t n = 1;
+
+  for (size_t d = 0; d < descr->ndims - 1; d++)
+    {
+      int curr_dim_len = descr->dims[d].length;
+      int next_dim_len = descr->dims[d + 1].length;
+      int curr_dim_num = curr_dim_len / sizeof (void *);
+      size_t next_dim_bias = descr->dims[d + 1].base;
+
+      void *next_dim_ptrblock
+	= (void *)(curr_dim_ptrblock + n * curr_dim_num);
+
+      for (int b = 0; b < n; b++)
+        for (int i = 0; i < curr_dim_num; i++)
+	  {
+	    if (d < descr->ndims - 2)
+	      {
+		void *ptr = (next_dim_ptrblock
+			     + b * curr_dim_num * next_dim_len
+			     + i * next_dim_len);
+		void *tgt_ptr = tgt_addr + (ptr - ptrblock) - next_dim_bias;
+		curr_dim_ptrblock[b * curr_dim_num + i] = tgt_ptr;
+	      }
+	    else
+	      {
+		curr_dim_ptrblock[b * curr_dim_num + i]
+		  = tgt_data_rows[b * curr_dim_num + i] - next_dim_bias;
+	      }
+	    void *addr = &curr_dim_ptrblock[b * curr_dim_num + i];
+	    assert (ptrblock <= addr && addr < ptrblock + nca->ptrblock_size);
+	  }
+
+      n *= curr_dim_num;
+      curr_dim_ptrblock = next_dim_ptrblock;
+    }
+  assert (n == nca->data_row_num);
+  return ptrblock;
+}
+
 static inline __attribute__((always_inline)) struct target_mem_desc *
 gomp_map_vars_internal (struct gomp_device_descr *devicep,
 			struct goacc_asyncqueue *aq, size_t mapnum,
@@ -533,9 +679,37 @@ gomp_map_vars_internal (struct gomp_device_descr *
   const int typemask = short_mapkind ? 0xff : 0x7;
   struct splay_tree_s *mem_map = &devicep->mem_map;
   struct splay_tree_key_s cur_node;
-  struct target_mem_desc *tgt
-    = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
-  tgt->list_count = mapnum;
+  struct target_mem_desc *tgt;
+
+  bool process_noncontig_arrays = false;
+  size_t nca_data_row_num = 0, row_start = 0;
+  size_t nca_info_num = 0, nca_index;
+  struct ncarray_info *nca_info = NULL;
+  struct target_var_desc *row_desc;
+  uintptr_t target_row_addr;
+  void **host_data_rows = NULL, **target_data_rows = NULL;
+  void *row;
+
+  if (mapnum > 0)
+    {
+      int kind = get_kind (short_mapkind, kinds, 0);
+      process_noncontig_arrays = GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask);
+    }
+
+  if (process_noncontig_arrays)
+    for (i = 0; i < mapnum; i++)
+      {
+	int kind = get_kind (short_mapkind, kinds, i);
+	if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
+	  {
+	    nca_data_row_num += gomp_noncontig_array_count_rows (hostaddrs[i]);
+	    nca_info_num += 1;
+	  }
+      }
+
+  tgt = gomp_malloc (sizeof (*tgt)
+		     + sizeof (tgt->list[0]) * (mapnum + nca_data_row_num));
+  tgt->list_count = mapnum + nca_data_row_num;
   tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
   tgt->device_descr = devicep;
   struct gomp_coalesce_buf cbuf, *cbufp = NULL;
@@ -547,6 +721,14 @@ gomp_map_vars_internal (struct gomp_device_descr *
       return tgt;
     }
 
+  if (nca_info_num)
+    nca_info = gomp_alloca (sizeof (struct ncarray_info) * nca_info_num);
+  if (nca_data_row_num)
+    {
+      host_data_rows = gomp_malloc (2 * sizeof (void *) * nca_data_row_num);
+      target_data_rows = &host_data_rows[nca_data_row_num];
+    }
+
   tgt_align = sizeof (void *);
   tgt_size = 0;
   cbuf.chunks = NULL;
@@ -578,7 +760,7 @@ gomp_map_vars_internal (struct gomp_device_descr *
       return NULL;
     }
 
-  for (i = 0; i < mapnum; i++)
+  for (i = 0, nca_index = 0; i < mapnum; i++)
     {
       int kind = get_kind (short_mapkind, kinds, i);
       if (hostaddrs[i] == NULL
@@ -667,6 +849,20 @@ gomp_map_vars_internal (struct gomp_device_descr *
 	  has_firstprivate = true;
 	  continue;
 	}
+      else if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
+	{
+	  /* Ignore non-contiguous arrays for now, we process them together
+	     later.  */
+	  tgt->list[i].key = NULL;
+	  tgt->list[i].offset = 0;
+	  not_found_cnt++;
+
+	  struct ncarray_info *nca = &nca_info[nca_index++];
+	  nca->descr = (struct gomp_ncarray_descr_type *) hostaddrs[i];
+	  nca->map_index = i;
+	  continue;
+	}
+
       cur_node.host_start = (uintptr_t) hostaddrs[i];
       if (!GOMP_MAP_POINTER_P (kind & typemask))
 	cur_node.host_end = cur_node.host_start + sizes[i];
@@ -735,6 +931,56 @@ gomp_map_vars_internal (struct gomp_device_descr *
 	}
     }
 
+  /* For non-contiguous arrays. Each data row is one target item, separated
+     from the normal map clause items, hence we order them after mapnum.  */
+  if (process_noncontig_arrays)
+    for (i = 0, nca_index = 0, row_start = 0; i < mapnum; i++)
+      {
+	int kind = get_kind (short_mapkind, kinds, i);
+	if (!GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
+	  continue;
+
+	struct ncarray_info *nca = &nca_info[nca_index++];
+	struct gomp_ncarray_descr_type *descr = nca->descr;
+	size_t nr;
+
+	gomp_noncontig_array_compute_info (nca);
+
+	/* We have allocated space in host/target_data_rows to place all the
+	   row data block pointers, now we can start filling them in.  */
+	nr = gomp_noncontig_array_fill_rows (descr, &host_data_rows[row_start]);
+	assert (nr == nca->data_row_num);
+
+	size_t align = (size_t) 1 << (kind >> rshift);
+	if (tgt_align < align)
+	  tgt_align = align;
+	tgt_size = (tgt_size + align - 1) & ~(align - 1);
+	tgt_size += nca->ptrblock_size;
+
+	for (size_t j = 0; j < nca->data_row_num; j++)
+	  {
+	    row = host_data_rows[row_start + j];
+	    row_desc = &tgt->list[mapnum + row_start + j];
+
+	    cur_node.host_start = (uintptr_t) row;
+	    cur_node.host_end = cur_node.host_start + nca->data_row_size;
+	    splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+	    if (n)
+	      {
+		assert (n->refcount != REFCOUNT_LINK);
+		gomp_map_vars_existing (devicep, aq, n, &cur_node, row_desc,
+					kind & typemask, /* TODO: cbuf? */ NULL);
+	      }
+	    else
+	      {
+		tgt_size = (tgt_size + align - 1) & ~(align - 1);
+		tgt_size += nca->data_row_size;
+		not_found_cnt++;
+	      }
+	  }
+	row_start += nca->data_row_num;
+      }
+
   if (devaddrs)
     {
       if (mapnum != 1)
@@ -895,6 +1141,15 @@ gomp_map_vars_internal (struct gomp_device_descr *
 	      default:
 		break;
 	      }
+
+	    if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
+	      {
+		tgt->list[i].key = &array->key;
+		tgt->list[i].key->tgt = tgt;
+		array++;
+		continue;
+	      }
+
 	    splay_tree_key k = &array->key;
 	    k->host_start = (uintptr_t) hostaddrs[i];
 	    if (!GOMP_MAP_POINTER_P (kind & typemask))
@@ -1044,8 +1299,112 @@ gomp_map_vars_internal (struct gomp_device_descr *
 		array++;
 	      }
 	  }
+
+      /* Processing of non-contiguous array rows.  */
+      if (process_noncontig_arrays)
+	{
+	  for (i = 0, nca_index = 0, row_start = 0; i < mapnum; i++)
+	    {
+	      int kind = get_kind (short_mapkind, kinds, i);
+	      if (!GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
+		continue;
+
+	      struct ncarray_info *nca = &nca_info[nca_index++];
+	      assert (nca->descr == hostaddrs[i]);
+
+	      /* The map for the non-contiguous array itself is never copied from
+		 during unmapping, its the data rows that count. Set copy-from
+		 flags to false here.  */
+	      tgt->list[i].copy_from = false;
+	      tgt->list[i].always_copy_from = false;
+
+	      size_t align = (size_t) 1 << (kind >> rshift);
+	      tgt_size = (tgt_size + align - 1) & ~(align - 1);
+
+	      /* For the map of the non-contiguous array itself, adjust so that
+		 the passed device address points to the beginning of the
+		 ptrblock. Remember to adjust the first-dimension's bias here.   */
+	      tgt->list[i].key->tgt_offset = tgt_size - nca->descr->dims[0].base;
+
+	      void *target_ptrblock = (void*) tgt->tgt_start + tgt_size;
+	      tgt_size += nca->ptrblock_size;
+
+	      /* Add splay key for each data row in current non-contiguous
+		 array.  */
+	      for (size_t j = 0; j < nca->data_row_num; j++)
+		{
+		  row = host_data_rows[row_start + j];
+		  row_desc = &tgt->list[mapnum + row_start + j];
+
+		  cur_node.host_start = (uintptr_t) row;
+		  cur_node.host_end = cur_node.host_start + nca->data_row_size;
+		  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+		  if (n)
+		    {
+		      assert (n->refcount != REFCOUNT_LINK);
+		      gomp_map_vars_existing (devicep, aq, n, &cur_node, row_desc,
+					      kind & typemask, cbufp);
+		      target_row_addr = n->tgt->tgt_start + n->tgt_offset;
+		    }
+		  else
+		    {
+		      tgt->refcount++;
+
+		      splay_tree_key k = &array->key;
+		      k->host_start = (uintptr_t) row;
+		      k->host_end = k->host_start + nca->data_row_size;
+
+		      k->tgt = tgt;
+		      k->refcount = 1;
+		      k->link_key = NULL;
+		      tgt_size = (tgt_size + align - 1) & ~(align - 1);
+		      target_row_addr = tgt->tgt_start + tgt_size;
+		      k->tgt_offset = tgt_size;
+		      tgt_size += nca->data_row_size;
+
+		      row_desc->key = k;
+		      row_desc->copy_from
+			= GOMP_MAP_COPY_FROM_P (kind & typemask);
+		      row_desc->always_copy_from
+			= GOMP_MAP_COPY_FROM_P (kind & typemask);
+		      row_desc->offset = 0;
+		      row_desc->length = nca->data_row_size;
+
+		      array->left = NULL;
+		      array->right = NULL;
+		      splay_tree_insert (mem_map, array);
+
+		      if (GOMP_MAP_COPY_TO_P (kind & typemask))
+			gomp_copy_host2dev (devicep, aq,
+					    (void *) tgt->tgt_start + k->tgt_offset,
+					    (void *) k->host_start,
+					    nca->data_row_size, cbufp);
+		      array++;
+		    }
+		  target_data_rows[row_start + j] = (void *) target_row_addr;
+		}
+
+	      /* Now we have the target memory allocated, and target offsets of all
+		 row blocks assigned and calculated, we can construct the
+		 accelerator side ptrblock and copy it in.  */
+	      if (nca->ptrblock_size)
+		{
+		  void *ptrblock = gomp_noncontig_array_create_ptrblock
+		    (nca, target_ptrblock, target_data_rows + row_start);
+		  gomp_copy_host2dev (devicep, aq, target_ptrblock, ptrblock,
+				      nca->ptrblock_size, cbufp);
+		  free (ptrblock);
+		}
+
+	      row_start += nca->data_row_num;
+	    }
+	  assert (row_start == nca_data_row_num && nca_index == nca_info_num);
+	}
     }
 
+  if (nca_data_row_num)
+    free (host_data_rows);
+
   if (pragma_kind == GOMP_MAP_VARS_TARGET)
     {
       for (i = 0; i < mapnum; i++)
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c	(nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c	(working copy)
@@ -0,0 +1,103 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+#include <assert.h>
+
+#define n 100
+#define m 100
+
+int b[n][m];
+
+void
+test1 (void)
+{
+  int i, j, *a[100];
+
+  /* Array of pointers form test.  */
+  for (i = 0; i < n; i++)
+    {
+      a[i] = (int *)malloc (sizeof (int) * m);
+      for (j = 0; j < m; j++)
+	b[i][j] = j - i;
+    }
+
+  #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b)
+  for (i = 0; i < n; i++)
+    #pragma acc loop
+    for (j = 0; j < m; j++)
+      a[i][j] = b[i][j];
+
+  for (i = 0; i < n; i++)
+    {
+      for (j = 0; j < m; j++)
+	assert (a[i][j] == b[i][j]);
+      /* Clean up.  */
+      free (a[i]);
+    }
+}
+
+void
+test2 (void)
+{
+  int i, j, **a = (int **) malloc (sizeof (int *) * n);
+
+  /* Separately allocated blocks.  */
+  for (i = 0; i < n; i++)
+    {
+      a[i] = (int *)malloc (sizeof (int) * m);
+      for (j = 0; j < m; j++)
+	b[i][j] = j - i;
+    }
+
+  #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b)
+  for (i = 0; i < n; i++)
+    #pragma acc loop
+    for (j = 0; j < m; j++)
+      a[i][j] = b[i][j];
+
+  for (i = 0; i < n; i++)
+    {
+      for (j = 0; j < m; j++)
+	assert (a[i][j] == b[i][j]);
+      /* Clean up.  */
+      free (a[i]);
+    }
+  free (a);
+}
+
+void
+test3 (void)
+{
+  int i, j, **a = (int **) malloc (sizeof (int *) * n);
+  a[0] = (int *) malloc (sizeof (int) * n * m);
+
+  /* Rows allocated in one contiguous block.  */
+  for (i = 0; i < n; i++)
+    {
+      a[i] = *a + i * m;
+      for (j = 0; j < m; j++)
+	b[i][j] = j - i;
+    }
+
+  #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b)
+  for (i = 0; i < n; i++)
+    #pragma acc loop
+    for (j = 0; j < m; j++)
+      a[i][j] = b[i][j];
+
+  for (i = 0; i < n; i++)
+    for (j = 0; j < m; j++)
+      assert (a[i][j] == b[i][j]);
+
+  free (a[0]);
+  free (a);
+}
+
+int
+main (void)
+{
+  test1 ();
+  test2 ();
+  test3 ();
+  return 0;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c	(nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c	(working copy)
@@ -0,0 +1,37 @@
+/* { dg-do run } */
+
+#include <assert.h>
+#include "noncontig_array-utils.h"
+
+int
+main (void)
+{
+  int n = 10;
+  int ***a = (int ***) create_ncarray (sizeof (int), n, 3);
+  int ***b = (int ***) create_ncarray (sizeof (int), n, 3);
+  int ***c = (int ***) create_ncarray (sizeof (int), n, 3);
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	{
+	  a[i][j][k] = i + j * k + k;
+	  b[i][j][k] = j + k * i + i * j;
+	  c[i][j][k] = a[i][j][k];
+	}
+
+  #pragma acc parallel copy (a[0:n][0:n][0:n]) copyin (b[0:n][0:n][0:n])
+  {
+    for (int i = 0; i < n; i++)
+      for (int j = 0; j < n; j++)
+	for (int k = 0; k < n; k++)
+	  a[i][j][k] += b[k][j][i] + i + j + k;
+  }
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	assert (a[i][j][k] == c[i][j][k] + b[k][j][i] + i + j + k);
+
+  return 0;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c	(nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c	(working copy)
@@ -0,0 +1,45 @@
+/* { dg-do run } */
+
+#include <assert.h>
+#include "noncontig_array-utils.h"
+
+int main (void)
+{
+  int n = 20, x = 5, y = 12;
+  int *****a = (int *****) create_ncarray (sizeof (int), n, 5);
+
+  int sum1 = 0, sum2 = 0, sum3 = 0;
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	for (int l = 0; l < n; l++)
+	  for (int m = 0; m < n; m++)
+	    {
+	      a[i][j][k][l][m] = 1;
+	      sum1++;
+	    }
+
+  #pragma acc parallel copy (a[x:y][x:y][x:y][x:y][x:y]) copy(sum2)
+  {
+    for (int i = x; i < x + y; i++)
+      for (int j = x; j < x + y; j++)
+	for (int k = x; k < x + y; k++)
+	  for (int l = x; l < x + y; l++)
+	    for (int m = x; m < x + y; m++)
+	      {
+		a[i][j][k][l][m] = 0;
+		sum2++;
+	      }
+  }
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	for (int l = 0; l < n; l++)
+	  for (int m = 0; m < n; m++)
+	    sum3 += a[i][j][k][l][m];
+
+  assert (sum1 == sum2 + sum3);
+  return 0;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c	(nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c	(working copy)
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+
+#include <assert.h>
+#include "noncontig_array-utils.h"
+
+int main (void)
+{
+  int n = 128;
+  double ***a = (double ***) create_ncarray (sizeof (double), n, 3);
+  double ***b = (double ***) create_ncarray (sizeof (double), n, 3);
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	a[i][j][k] = i + j + k + i * j * k;
+
+  /* This test exercises async copyout of non-contiguous array rows.  */
+  #pragma acc parallel copyin(a[0:n][0:n][0:n]) copyout(b[0:n][0:n][0:n]) async(5)
+  {
+    #pragma acc loop gang
+    for (int i = 0; i < n; i++)
+      #pragma acc loop vector
+      for (int j = 0; j < n; j++)
+	for (int k = 0; k < n; k++)
+	  b[i][j][k] = a[i][j][k] * 2.0;
+  }
+
+  #pragma acc wait (5)
+
+  for (int i = 0; i < n; i++)
+    for (int j = 0; j < n; j++)
+      for (int k = 0; k < n; k++)
+	assert (b[i][j][k] == a[i][j][k] * 2.0);
+
+  return 0;
+}
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h	(nonexistent)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h	(working copy)
@@ -0,0 +1,44 @@
+#include <stdlib.h>
+#include <string.h>
+#include <assert.h>
+#include <stdint.h>
+
+/* Allocate and create a pointer based NDIMS-dimensional array,
+   each dimension DIMLEN long, with ELSIZE sized data elements.  */
+void *
+create_ncarray (size_t elsize, int dimlen, int ndims)
+{
+  size_t blk_size = 0;
+  size_t n = 1;
+
+  for (int i = 0; i < ndims - 1; i++)
+    {
+      n *= dimlen;
+      blk_size += sizeof (void *) * n;
+    }
+  size_t data_rows_num = n;
+  size_t data_rows_offset = blk_size;
+  blk_size += elsize * n * dimlen;
+
+  void *blk = (void *) malloc (blk_size);
+  memset (blk, 0, blk_size);
+  void **curr_dim = (void **) blk;
+  n = 1;
+
+  for (int d = 0; d < ndims - 1; d++)
+    {
+      uintptr_t next_dim = (uintptr_t) (curr_dim + n * dimlen);
+      size_t next_dimlen = dimlen * (d < ndims - 2 ? sizeof (void *) : elsize);
+
+      for (int b = 0; b < n; b++)
+        for (int i = 0; i < dimlen; i++)
+	  if (d < ndims - 1)
+	    curr_dim[b * dimlen + i]
+	      = (void*) (next_dim + b * dimlen * next_dimlen + i * next_dimlen);
+
+      n *= dimlen;
+      curr_dim = (void**) next_dim;
+    }
+  assert (n == data_rows_num);
+  return blk;
+}
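
As a rough illustration of the layout that create_ncarray builds, the size
computation at the top of that function can be isolated as below.  This is
only a sketch: 'sketch_ncarray_layout' and its struct are hypothetical names
that are not part of the patch, and the arithmetic simply mirrors the first
loop of the helper.

```c
#include <stddef.h>

/* Hypothetical summary of the block that create_ncarray allocates.  */
struct sketch_ncarray_layout
{
  size_t ptr_bytes;   /* Bytes taken by all pointer levels.  */
  size_t data_rows;   /* Number of innermost data rows.  */
  size_t total_bytes; /* Pointer levels plus element data.  */
};

/* Mirror of the size computation in create_ncarray: every dimension
   except the last contributes one level of pointers, and the last
   level of pointers addresses the data rows.  */
static struct sketch_ncarray_layout
sketch_ncarray_layout (size_t elsize, int dimlen, int ndims)
{
  struct sketch_ncarray_layout l = { 0, 1, 0 };

  for (int i = 0; i < ndims - 1; i++)
    {
      l.data_rows *= dimlen;
      l.ptr_bytes += sizeof (void *) * l.data_rows;
    }
  l.total_bytes = l.ptr_bytes + elsize * l.data_rows * dimlen;
  return l;
}
```

For the 3-dimensional, 10-element case used in noncontig_array-2.c this gives
10 + 100 = 110 pointer slots and 100 data rows of 10 elements each.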

* Re: [PATCH, OpenACC, v2] Non-contiguous array support for OpenACC data clauses
  2019-11-05 14:36                         ` [PATCH, OpenACC, v2] Non-contiguous array support for OpenACC data clauses Chung-Lin Tang
@ 2019-11-07  0:49                           ` Thomas Schwinge
  2019-11-12 12:42                             ` Chung-Lin Tang
  0 siblings, 1 reply; 24+ messages in thread
From: Thomas Schwinge @ 2019-11-07  0:49 UTC (permalink / raw)
  To: Chung-Lin Tang; +Cc: gcc-patches, Jakub Jelinek

Hi Chung-Lin!

On 2019-11-05T22:35:43+0800, Chung-Lin Tang <chunglin_tang@mentor.com> wrote:
> Hi Thomas,
> after your last round of review, I realized that the bulk of the compiler omp-low work was
> simply a case of dumb over-engineering in the wrong direction :P
> (although it did painstakingly function correctly)

Hehe -- that happens.  ;-)

> However, the issue of ACC_DEVICE_TYPE=host not working (and hence "!openacc_host_selected"
> in the testcases)

Actually not just for that, but also generally for any shared-memory
models that may come into existence at some point, such as CUDA Unified
Memory, for example?

> actually is a bit more sophisticated than I thought:
>
> The reason it doesn't work for the host device, is because we use the map pointer (i.e.
> a hostaddrs[] entry when passed into libgomp) to point to an array descriptor to pass
> the whole array information, and rely on code inside gomp_map_vars_* to setup things,
> and place the final on-device address of the non-contig. array into devaddrs[], therefore
> only using a single map entry (something I thought was quite clever)
>
> However, this broke down on the host and host-fallback devices, simply because, there
> we do NOT do any gomp_map_vars processing; our current code in GOACC_parallel_keyed
> simply skips it and passes the offload function the original hostaddrs[] contents.
> Lacking the processing to transform the descriptor pointer into a proper array ref,
> things of course segfault.
>
> So I think we have three options for this (which may have some interactions with say,
> the "proper" host-side parallelization we eventually need to implement for OpenACC 2.7)
>
> (1) The simplest solution: implement a processing which searches and reverts such
> non-contiguous array map entries in GOACC_parallel_keyed.
> (note: I have implemented this in the current attached "v2" patch)
>
> (2) Make the GOACC_parallel_keyed code to not make short cuts for host-modes;
> i.e. still do the proper gomp_map_vars processing for all cases.
>
> (3) Modify the non-contiguous array map conventions: a possible solution is to use
> two maps placed together: one for the array pointer, another for the array descriptor (as
> opposed to the current style of using only one map). This needs further, more elaborate
> compiler/runtime work.
>
> The first two options will pessimize host-mode performance somewhat. The third I have
> some WIP patches, but it's still buggy ATM. Seeking your opinion on what we should do.

I'll have to think about it some more, but variant (1) doesn't seem so
bad actually, for a first take.  While it's not nice to pessimize in
particular directives with 'if (false)' clauses, at least it does work,
the run-time overhead should not be too bad (also compared to variant
(2), I suppose), and variant (3) can still be implemented later.


A few comments/questions:

Please reference PR76739 in your submission/ChangeLog updates.

> --- gcc/c/c-typeck.c	(revision 277827)
> +++ gcc/c/c-typeck.c	(working copy)
> @@ -12868,7 +12868,7 @@ c_finish_omp_cancellation_point (location_t loc, t
>  static tree
>  handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
>  			     bool &maybe_zero_len, unsigned int &first_non_one,
> -			     enum c_omp_region_type ort)
> +			     bool &non_contiguous, enum c_omp_region_type ort)
>  {
>    tree ret, low_bound, length, type;
>    if (TREE_CODE (t) != TREE_LIST)

> @@ -13160,14 +13161,21 @@ handle_omp_array_sections_1 (tree c, tree t, vec<t
>  	  return error_mark_node;
>  	}
>        /* If there is a pointer type anywhere but in the very first
> -	 array-section-subscript, the array section can't be contiguous.  */
> +	 array-section-subscript, the array section can't be contiguous.
> +	 Note that OpenACC does accept these kinds of non-contiguous pointer
> +	 based arrays.  */

That comment update should instead be moved to the function comment
before the 'handle_omp_array_sections_1' function definition, and should
then also explain the new 'non_contiguous' out variable.  The latter
needs to be done anyway, and the former (no comment here) is easy enough
to tell from the code:

>        if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
>  	  && TREE_CODE (TREE_CHAIN (t)) == TREE_LIST)
>  	{
> -	  error_at (OMP_CLAUSE_LOCATION (c),
> -		    "array section is not contiguous in %qs clause",
> -		    omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
> -	  return error_mark_node;
> +	  if (ort == C_ORT_ACC)
> +	    non_contiguous = true;
> +	  else
> +	    {
> +	      error_at (OMP_CLAUSE_LOCATION (c),
> +			"array section is not contiguous in %qs clause",
> +			omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
> +	      return error_mark_node;
> +	    }
>  	}

> @@ -13238,6 +13247,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
>        unsigned int num = types.length (), i;
>        tree t, side_effects = NULL_TREE, size = NULL_TREE;
>        tree condition = NULL_TREE;
> +      tree ncarray_dims = NULL_TREE;
>  
>        if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
>  	maybe_zero_len = true;
> @@ -13261,6 +13271,13 @@ handle_omp_array_sections (tree c, enum c_omp_regi
>  	    length = fold_convert (sizetype, length);
>  	  if (low_bound == NULL_TREE)
>  	    low_bound = integer_zero_node;
> +
> +	  if (non_contiguous)
> +	    {
> +	      ncarray_dims = tree_cons (low_bound, length, ncarray_dims);
> +	      continue;
> +	    }
> +
>  	  if (!maybe_zero_len && i > first_non_one)
>  	    {
>  	      if (integer_nonzerop (low_bound))

I'm not at all familiar with this array sections code, so I will trust
your understanding that we don't need any of the processing that you're
skipping here ('continue'): 'TREE_SIDE_EFFECTS' handling for the length
expressions, and other things.

> @@ -13357,6 +13374,14 @@ handle_omp_array_sections (tree c, enum c_omp_regi
>  		size = size_binop (MULT_EXPR, size, l);
>  	    }
>  	}
> +      if (non_contiguous)
> +	{
> +	  int kind = OMP_CLAUSE_MAP_KIND (c);
> +	  OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_NONCONTIG_ARRAY);
> +	  OMP_CLAUSE_DECL (c) = t;
> +	  OMP_CLAUSE_SIZE (c) = ncarray_dims;
> +	  return false;
> +	}
>        if (side_effects)
>  	size = build2 (COMPOUND_EXPR, sizetype, side_effects, size);
>        if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION

Likewise for all the code being skipped here ('return false').

> --- gcc/cp/semantics.c	(revision 277827)
> +++ gcc/cp/semantics.c	(working copy)

Analogous to the C front end.

> --- gcc/gimplify.c	(revision 277827)
> +++ gcc/gimplify.c	(working copy)
> @@ -8622,9 +8622,17 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_se
>  	  if (OMP_CLAUSE_SIZE (c) == NULL_TREE)
>  	    OMP_CLAUSE_SIZE (c) = DECL_P (decl) ? DECL_SIZE_UNIT (decl)
>  				  : TYPE_SIZE_UNIT (TREE_TYPE (decl));
> +	  if (OMP_CLAUSE_SIZE (c)
> +	      && TREE_CODE (OMP_CLAUSE_SIZE (c)) == TREE_LIST
> +	      && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))

Per the code above, 'OMP_CLAUSE_SIZE (c)' will always be set to
something, so no point in checking that here?

Isn't the 'GOMP_MAP_NONCONTIG_ARRAY_P' check alone sufficient already?
And then maybe 'assert (TREE_CODE (OMP_CLAUSE_SIZE (c)) == TREE_LIST)' in
here:

>  	    {
> +	      /* For non-contiguous array maps, OMP_CLAUSE_SIZE is a TREE_LIST
> +		 of the individual array dimensions, which gimplify_expr doesn't
> +		 handle, so skip the call to gimplify_expr here.  */
> +	    }

> -	  if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p,
> -			     NULL, is_gimple_val, fb_rvalue) == GS_ERROR)
> +	  else if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p,
> +				  NULL, is_gimple_val, fb_rvalue) == GS_ERROR)
> +	    {
>  	      remove = true;
>  	      break;
>  	    }

Again, that means we're skipping other code here; I don't understand that yet.

Your ChangeLog update says:

> 	* gimplify.c (gimplify_scan_omp_clauses): For non-contiguous array map kinds,
> 	make sure bias in each dimension are put into firstprivate variables.

I'm not yet seeing how that's happening.

Ah, I see that ChangeLog comment is probably just a remnant from the
previous version.

> --- gcc/omp-low.c	(revision 277827)
> +++ gcc/omp-low.c	(working copy)

Have not yet reviewed in detail.

> @@ -1367,6 +1498,38 @@ scan_sharing_clauses (tree clauses, omp_context *c
>  	      install_var_local (decl, ctx);
>  	      break;
>  	    }
> +
> +	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
> +	      && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)))
> +	    {
> +	      tree array_decl = OMP_CLAUSE_DECL (c);
> +	      tree array_type = TREE_TYPE (array_decl);
> +	      bool by_ref = (TREE_CODE (array_type) == ARRAY_TYPE
> +			     ? true : false);
> +
> +	      /* Checking code to ensure we only have arrays at top dimension.
> +		 This limitation might be lifted in the future.  */

Please reference PR76739 here, and in PR76739 also add a comment about
this limitation.  (As well as any other limitations, of course.)

> +	      if (TREE_CODE (array_type) == REFERENCE_TYPE)
> +		array_type = TREE_TYPE (array_type);
> +	      tree t = array_type, prev_t = NULL_TREE;
> +	      while (t)
> +		{
> +		  if (TREE_CODE (t) == ARRAY_TYPE && prev_t)
> +		    {
> +		      error_at (gimple_location (ctx->stmt), "array types are"
> +				" only allowed at outermost dimension of"
> +				" non-contiguous array");
> +		      break;
> +		    }
> +		  prev_t = t;
> +		  t = TREE_TYPE (t);
> +		}
> +
> +	      install_var_field (array_decl, by_ref, 3, ctx);
> +	      install_var_local (array_decl, ctx);
> +	      break;
> +	    }
> +

Assuming this intentionally means to skip ('break' just above) the
following 'if (DECL_P (decl))' and its 'else' branch, then maybe remove
the 'break' just above, and instead do 'else if (DECL_P (decl))'?

>  	  if (DECL_P (decl))
>  	    {
>  	      if (DECL_SIZE (decl)

> @@ -2624,6 +2830,14 @@ scan_omp_target (gomp_target *stmt, omp_context *o
>        gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
>      }
> 
> +  /* If is OpenACC construct, put non-contiguous array clauses (if any)
> +     in front of clause chain. The runtime can then test the first to see
> +     if the additional map processing for them is required.  */
> +  if (is_gimple_omp_oacc (stmt))
> +    reorder_noncontig_array_clauses (gimple_omp_target_clauses_ptr (stmt));

Should that be deemed unsuitable for any reason, then add a new
'GOACC_FLAG_*' flag to indicate existence of non-contiguous arrays.

> --- include/gomp-constants.h	(revision 277827)
> +++ include/gomp-constants.h	(working copy)
> @@ -40,6 +40,7 @@
>  #define GOMP_MAP_FLAG_SPECIAL_0		(1 << 2)
>  #define GOMP_MAP_FLAG_SPECIAL_1		(1 << 3)
>  #define GOMP_MAP_FLAG_SPECIAL_2		(1 << 4)
> +#define GOMP_MAP_FLAG_SPECIAL_3		(1 << 5)
>  #define GOMP_MAP_FLAG_SPECIAL		(GOMP_MAP_FLAG_SPECIAL_1 \
>  					 | GOMP_MAP_FLAG_SPECIAL_0)
>  /* Flag to force a specific behavior (or else, trigger a run-time error).  */
> @@ -127,6 +128,26 @@ enum gomp_map_kind
>      /* Decrement usage count and deallocate if zero.  */
>      GOMP_MAP_RELEASE =			(GOMP_MAP_FLAG_SPECIAL_2
>  					 | GOMP_MAP_DELETE),
> +    /* Mapping kinds for non-contiguous arrays.  */
> +    GOMP_MAP_NONCONTIG_ARRAY =		(GOMP_MAP_FLAG_SPECIAL_3),
> +    GOMP_MAP_NONCONTIG_ARRAY_TO =	(GOMP_MAP_NONCONTIG_ARRAY
> +					 | GOMP_MAP_TO),
> +    GOMP_MAP_NONCONTIG_ARRAY_FROM =	(GOMP_MAP_NONCONTIG_ARRAY
> +					 | GOMP_MAP_FROM),
> +    GOMP_MAP_NONCONTIG_ARRAY_TOFROM =	(GOMP_MAP_NONCONTIG_ARRAY
> +					 | GOMP_MAP_TOFROM),
> +    GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO =	(GOMP_MAP_NONCONTIG_ARRAY_TO
> +					 | GOMP_MAP_FLAG_FORCE),
> +    GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM =	(GOMP_MAP_NONCONTIG_ARRAY_FROM
> +						 | GOMP_MAP_FLAG_FORCE),
> +    GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM =	(GOMP_MAP_NONCONTIG_ARRAY_TOFROM
> +						 | GOMP_MAP_FLAG_FORCE),
> +    GOMP_MAP_NONCONTIG_ARRAY_ALLOC =		(GOMP_MAP_NONCONTIG_ARRAY
> +						 | GOMP_MAP_ALLOC),
> +    GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC =	(GOMP_MAP_NONCONTIG_ARRAY
> +						 | GOMP_MAP_FORCE_ALLOC),
> +    GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT =	(GOMP_MAP_NONCONTIG_ARRAY
> +						 | GOMP_MAP_FORCE_PRESENT),

Just an idea: instead of this long list, would it maybe be better (if
feasible at all?) to have a single "lead-in" mapping
'GOMP_MAP_NONCONTIG_ARRAY_MODE', which specifies how many of the
following (normal) mappings belong to that "non-contiguous array mode".
(Roughly similar to what 'GOMP_MAP_TO_PSET' is doing with any
'GOMP_MAP_POINTER's following it.)  Might that make some things simpler,
or even more complicated (more internal state to keep)?
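
For comparison, the flag-based encoding in the quoted enum can be sketched as
follows.  The bit positions are the ones from gomp-constants.h, but the
'SKETCH_' names are illustrative stand-ins, and the shape of the predicate
(a plain test of the flag bit) is an assumption, since the definition of
'GOMP_MAP_NONCONTIG_ARRAY_P' is not part of the quoted hunks.

```c
#include <stdbool.h>

/* Bit positions as in gomp-constants.h; the SKETCH_ names are
   illustrative stand-ins, not the real macros.  */
#define SKETCH_FLAG_TO          (1 << 0)
#define SKETCH_FLAG_FROM        (1 << 1)
#define SKETCH_NONCONTIG_ARRAY  (1 << 5)  /* GOMP_MAP_FLAG_SPECIAL_3 */

/* Assumed shape of the predicate: a plain test of the flag bit.  */
static bool
sketch_noncontig_array_p (int kind)
{
  return (kind & SKETCH_NONCONTIG_ARRAY) != 0;
}

/* Apply the same typemask selection as gomp_map_vars_internal before
   testing the flag.  */
static bool
sketch_noncontig_after_mask (int kind, bool short_mapkind)
{
  const int typemask = short_mapkind ? 0xff : 0x7;
  return sketch_noncontig_array_p (kind & typemask);
}
```

With the 0x7 mask, bit 5 is cleared before the test, so in this sketch the
flag is only visible when short map kinds are in use.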

> --- libgomp/oacc-parallel.c	(revision 277827)
> +++ libgomp/oacc-parallel.c	(working copy)

> +static inline void
> +revert_noncontig_array_map_pointers (size_t mapnum, void **hostaddrs,
> +				     unsigned short *kinds)
> +{
> +  for (int i = 0; i < mapnum; i++)
> +    {
> +      if (GOMP_MAP_NONCONTIG_ARRAY_P (kinds[i] & 0xff))
> +	hostaddrs[i] = *((void **)hostaddrs[i]);

Can we be (or, do we make) sure that 'hostaddrs' will never be in
read-only memory?

And, it's permissible to alter 'hostaddrs'?

Ah, other code (including 'libgomp/target.c') is doing such things, too,
so it must be fine.

> +      else
> +	/* We assume all non-contiguous array map entries are placed at the
> +	   start; first other map kind means we can exit.  */
> +	break;
> +    }
> +}
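
A standalone sketch of the same revert step, for experimenting outside
libgomp.  Everything named 'sketch_*' or 'SKETCH_*' is hypothetical; the
sketch assumes that the first word of the descriptor holds the original
array pointer (which is what the dereference in the quoted code relies on),
and it uses a 'size_t' index to avoid the signed/unsigned comparison with
'mapnum'.

```c
#include <stddef.h>

#define SKETCH_NONCONTIG_ARRAY (1 << 5)

/* Hypothetical descriptor: only the leading pointer matters here.  */
struct sketch_descr
{
  void *ptr;     /* Original host array pointer.  */
  size_t ndims;  /* Remaining fields are irrelevant to the revert.  */
};

/* Replace each leading descriptor pointer in HOSTADDRS by the array
   pointer stored at the start of the descriptor.  Stops at the first
   entry of any other kind.  Returns the number of entries reverted.  */
static size_t
sketch_revert_noncontig (size_t mapnum, void **hostaddrs,
			 const unsigned short *kinds)
{
  size_t i;

  for (i = 0; i < mapnum; i++)
    {
      if (!(kinds[i] & 0xff & SKETCH_NONCONTIG_ARRAY))
	break;
      hostaddrs[i] = *(void **) hostaddrs[i];
    }
  return i;
}
```

The early exit is only correct as long as the compiler keeps placing all
non-contiguous array entries at the front of the map list.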

> --- libgomp/target.c	(revision 277827)
> +++ libgomp/target.c	(working copy)

Have not yet reviewed in detail.

> @@ -533,9 +679,37 @@ gomp_map_vars_internal (struct gomp_device_descr *
>    const int typemask = short_mapkind ? 0xff : 0x7;
>    struct splay_tree_s *mem_map = &devicep->mem_map;
>    struct splay_tree_key_s cur_node;
> -  struct target_mem_desc *tgt
> -    = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
> -  tgt->list_count = mapnum;
> +  struct target_mem_desc *tgt;
> +
> +  bool process_noncontig_arrays = false;
> +  size_t nca_data_row_num = 0, row_start = 0;
> +  size_t nca_info_num = 0, nca_index;
> +  struct ncarray_info *nca_info = NULL;
> +  struct target_var_desc *row_desc;
> +  uintptr_t target_row_addr;
> +  void **host_data_rows = NULL, **target_data_rows = NULL;
> +  void *row;
> +
> +  if (mapnum > 0)
> +    {

Also add such a comment here: "We assume all non-contiguous array map
entries are placed at the start".

> +      int kind = get_kind (short_mapkind, kinds, 0);
> +      process_noncontig_arrays = GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask);
> +    }
> +
> +  if (process_noncontig_arrays)
> +    for (i = 0; i < mapnum; i++)
> +      {
> +	int kind = get_kind (short_mapkind, kinds, i);
> +	if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
> +	  {
> +	    nca_data_row_num += gomp_noncontig_array_count_rows (hostaddrs[i]);
> +	    nca_info_num += 1;
> +	  }
> +      }

Or, actually, can the 'if (mapnum > 0)' above and the 'for' loop here
again be simplified to just one loop with 'break', like you've done in
'libgomp/oacc-parallel.c:revert_noncontig_array_map_pointers'?
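
A minimal sketch of what such a single loop could look like, assuming (as the
patch does) that all non-contiguous array entries are placed at the start of
the map list. The names 'SKETCH_NCA_FLAG', 'SKETCH_NCA_P' and
'count_leading_nca' are hypothetical stand-ins, not libgomp identifiers, and
the predicate is assumed to test only the new flag bit:

```c
#include <stddef.h>

/* Hypothetical stand-in for the flag bit added by the patch
   (GOMP_MAP_FLAG_SPECIAL_3 is defined there as 1 << 5).  */
#define SKETCH_NCA_FLAG (1 << 5)
/* Assumed shape of the predicate: true if the flag bit is set.  */
#define SKETCH_NCA_P(kind) (((kind) & SKETCH_NCA_FLAG) != 0)

/* Count the leading non-contiguous array entries.  Because such entries
   are assumed to come first, the first other map kind ends the scan; an
   empty map list simply yields zero, so no separate 'mapnum > 0' test
   is needed.  */
static size_t
count_leading_nca (size_t mapnum, const unsigned short *kinds)
{
  size_t n = 0;
  for (size_t i = 0; i < mapnum; i++)
    {
      if (!SKETCH_NCA_P (kinds[i] & 0xff))
	break;
      n++;
    }
  return n;
}
```

In the real code the loop body would also accumulate the row count for each
descriptor; that part is left out here because it depends on the descriptor
layout.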

> +
> +  tgt = gomp_malloc (sizeof (*tgt)
> +		     + sizeof (tgt->list[0]) * (mapnum + nca_data_row_num));
> +  tgt->list_count = mapnum + nca_data_row_num;
>    tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
>    tgt->device_descr = devicep;
>    struct gomp_coalesce_buf cbuf, *cbufp = NULL;

> @@ -735,6 +931,56 @@ gomp_map_vars_internal (struct gomp_device_descr *
>  	}
>      }
>  
> +  /* For non-contiguous arrays. Each data row is one target item, separated
> +     from the normal map clause items, hence we order them after mapnum.  */
> +  if (process_noncontig_arrays)
> +    for (i = 0, nca_index = 0, row_start = 0; i < mapnum; i++)
> +      {
> +	int kind = get_kind (short_mapkind, kinds, i);
> +	if (!GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
> +	  continue;

Can instead 'break' again?

> @@ -1044,8 +1299,112 @@ gomp_map_vars_internal (struct gomp_device_descr *
>  		array++;
>  	      }
>  	  }
> +
> +      /* Processing of non-contiguous array rows.  */
> +      if (process_noncontig_arrays)
> +	{
> +	  for (i = 0, nca_index = 0, row_start = 0; i < mapnum; i++)
> +	    {
> +	      int kind = get_kind (short_mapkind, kinds, i);
> +	      if (!GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask))
> +		continue;

Likewise?


It's now gotten too late; more review to follow later.


Grüße
 Thomas


* Re: [PATCH, OpenACC, v2] Non-contiguous array support for OpenACC data clauses
  2019-11-07  0:49                           ` Thomas Schwinge
@ 2019-11-12 12:42                             ` Chung-Lin Tang
  0 siblings, 0 replies; 24+ messages in thread
From: Chung-Lin Tang @ 2019-11-12 12:42 UTC (permalink / raw)
  To: Thomas Schwinge, Chung-Lin Tang; +Cc: gcc-patches, Jakub Jelinek

Hi Thomas,
thanks for the first review. I'm still working on another revision,
but wanted to respond to some of the issues you raised first:

On 2019/11/7 8:48 AM, Thomas Schwinge wrote:
>> (1) The simplest solution: implement a processing which searches and reverts such
>> non-contiguous array map entries in GOACC_parallel_keyed.
>> (note: I have implemented this in the current attached "v2" patch)
>>
>> (2) Make the GOACC_parallel_keyed code to not make short cuts for host-modes;
>> i.e. still do the proper gomp_map_vars processing for all cases.
>>
>> (3) Modify the non-contiguous array map conventions: a possible solution is to use
>> two maps placed together: one for the array pointer, another for the array descriptor (as
>> opposed to the current style of using only one map) This needs more further elaborate
>> compiler/runtime work.
>>
>> The first two options will pessimize host-mode performance somewhat. The third I have
>> some WIP patches, but it's still buggy ATM. Seeking your opinion on what we should do.
> I'll have to think about it some more, but variant (1) doesn't seem so
> bad actually, for a first take.  While it's not nice to pessimize in
> particular directives with 'if (false)' clauses, at least it does work,
> the run-time overhead should not be too bad (also compared to variant
> (2), I suppose), and variant (3) can still be implemented later.

The issue is that (1),(2) vs (3) have different binary interfaces, so a decision has to be
made first, lest we again have compatibility issues later.

Also, (1) and (2) may behave somewhat differently due to the memory copying effects of
gomp_map_vars() (a possible semantic difference versus the usual shared memory expectations?)

I'm currently working on another way of implementing something similar to (3),
but using the variadic arguments of GOACC_parallel_keyed instead of maps, WDYT?

>> @@ -13238,6 +13247,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi
>>         unsigned int num = types.length (), i;
>>         tree t, side_effects = NULL_TREE, size = NULL_TREE;
>>         tree condition = NULL_TREE;
>> +      tree ncarray_dims = NULL_TREE;
>>   
>>         if (int_size_in_bytes (TREE_TYPE (first)) <= 0)
>>   	maybe_zero_len = true;
>> @@ -13261,6 +13271,13 @@ handle_omp_array_sections (tree c, enum c_omp_regi
>>   	    length = fold_convert (sizetype, length);
>>   	  if (low_bound == NULL_TREE)
>>   	    low_bound = integer_zero_node;
>> +
>> +	  if (non_contiguous)
>> +	    {
>> +	      ncarray_dims = tree_cons (low_bound, length, ncarray_dims);
>> +	      continue;
>> +	    }
>> +
>>   	  if (!maybe_zero_len && i > first_non_one)
>>   	    {
>>   	      if (integer_nonzerop (low_bound))
> I'm not at all familiar with this array sections code, will trust your
> understanding that we don't need any of the processing that you're
> skipping here ('continue'): 'TREE_SIDE_EFFECTS' handling for the length
> expressions, and other things.

I will re-check on this.

Ditto for the other minor issues you raised.

>>   	  if (DECL_P (decl))
>>   	    {
>>   	      if (DECL_SIZE (decl)
>> @@ -2624,6 +2830,14 @@ scan_omp_target (gomp_target *stmt, omp_context *o
>>         gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn);
>>       }
>>
>> +  /* If is OpenACC construct, put non-contiguous array clauses (if any)
>> +     in front of clause chain. The runtime can then test the first to see
>> +     if the additional map processing for them is required.  */
>> +  if (is_gimple_omp_oacc (stmt))
>> +    reorder_noncontig_array_clauses (gimple_omp_target_clauses_ptr (stmt));
> Should that be deemed unsuitable for any reason, then add a new
> 'GOACC_FLAG_*' flag to indicate existance of non-contiguous arrays.

I'm considering using that convention unconditionally. I'm not sure it's faster,
though, since it means we can't do the 'early breaking' you mentioned when
scanning through maps looking for GOMP_MAP_NONCONTIG_ARRAY_P.

>> --- include/gomp-constants.h	(revision 277827)
>> +++ include/gomp-constants.h	(working copy)
>> @@ -40,6 +40,7 @@
>>   #define GOMP_MAP_FLAG_SPECIAL_0		(1 << 2)
>>   #define GOMP_MAP_FLAG_SPECIAL_1		(1 << 3)
>>   #define GOMP_MAP_FLAG_SPECIAL_2		(1 << 4)
>> +#define GOMP_MAP_FLAG_SPECIAL_3		(1 << 5)
>>   #define GOMP_MAP_FLAG_SPECIAL		(GOMP_MAP_FLAG_SPECIAL_1 \
>>   					 | GOMP_MAP_FLAG_SPECIAL_0)
>>   /* Flag to force a specific behavior (or else, trigger a run-time error).  */
>> @@ -127,6 +128,26 @@ enum gomp_map_kind
>>       /* Decrement usage count and deallocate if zero.  */
>>       GOMP_MAP_RELEASE =			(GOMP_MAP_FLAG_SPECIAL_2
>>   					 | GOMP_MAP_DELETE),
>> +    /* Mapping kinds for non-contiguous arrays.  */
>> +    GOMP_MAP_NONCONTIG_ARRAY =		(GOMP_MAP_FLAG_SPECIAL_3),
>> +    GOMP_MAP_NONCONTIG_ARRAY_TO =	(GOMP_MAP_NONCONTIG_ARRAY
>> +					 | GOMP_MAP_TO),
>> +    GOMP_MAP_NONCONTIG_ARRAY_FROM =	(GOMP_MAP_NONCONTIG_ARRAY
>> +					 | GOMP_MAP_FROM),
>> +    GOMP_MAP_NONCONTIG_ARRAY_TOFROM =	(GOMP_MAP_NONCONTIG_ARRAY
>> +					 | GOMP_MAP_TOFROM),
>> +    GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO =	(GOMP_MAP_NONCONTIG_ARRAY_TO
>> +					 | GOMP_MAP_FLAG_FORCE),
>> +    GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM =	(GOMP_MAP_NONCONTIG_ARRAY_FROM
>> +						 | GOMP_MAP_FLAG_FORCE),
>> +    GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM =	(GOMP_MAP_NONCONTIG_ARRAY_TOFROM
>> +						 | GOMP_MAP_FLAG_FORCE),
>> +    GOMP_MAP_NONCONTIG_ARRAY_ALLOC =		(GOMP_MAP_NONCONTIG_ARRAY
>> +						 | GOMP_MAP_ALLOC),
>> +    GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC =	(GOMP_MAP_NONCONTIG_ARRAY
>> +						 | GOMP_MAP_FORCE_ALLOC),
>> +    GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT =	(GOMP_MAP_NONCONTIG_ARRAY
>> +						 | GOMP_MAP_FORCE_PRESENT),
> Just an idea: instead of this long list, would it maybe be better (if
> feasible at all?) to have a single "lead-in" mapping
> 'GOMP_MAP_NONCONTIG_ARRAY_MODE', which specifies how many of the
> following (normal) mappings belong to that "non-contiguous array mode".
> (Roughly similar to what 'GOMP_MAP_TO_PSET' is doing with any
> 'GOMP_MAP_POINTER's following it.)  Might that make some things simpler,
> or even more complicated (more internal state to keep)?

I'd prefer not to; wrangling with multiple-map sequences in the complex gomp_map_vars code
is proving to be a tedious task. My now-abandoned version of method (3) above tried using
two map kinds (an 'array' and an 'array descriptor'), and I did not get it to work properly.

Also, a non-contiguous array is just a data clause specification feature, and should support
all modes (copy/in/out, present, alloc, etc.). Using a whole GOMP_MAP_FLAG_SPECIAL_3 bit that
combines independently with the other flags should be warranted.
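
To illustrate the point, here is a rough sketch of how a dedicated flag bit
composes with the ordinary map kinds. The 'DEMO_*' names and the helper
functions are made up for this example; only the bit position (1 << 5) is
taken from the patch, and the base kind values are illustrative:

```c
/* Illustrative base map kinds; not the libgomp definitions.  */
enum demo_base_kind
{
  DEMO_ALLOC = 0,
  DEMO_TO = 1,
  DEMO_FROM = 2,
  DEMO_TOFROM = 3
};

/* Hypothetical stand-in for GOMP_MAP_FLAG_SPECIAL_3 from the patch.  */
#define DEMO_NCA_FLAG (1 << 5)

/* True if the map kind describes a non-contiguous array.  */
static int
demo_nca_p (int kind)
{
  return (kind & DEMO_NCA_FLAG) != 0;
}

/* Strip the non-contiguous flag to recover the ordinary data mode, so the
   existing handling for copy/in/out, alloc, etc. can be reused.  */
static int
demo_base_kind (int kind)
{
  return kind & ~DEMO_NCA_FLAG;
}
```

With this layout any existing mode can be marked as non-contiguous by OR-ing
in one bit, which is what the GOMP_MAP_NONCONTIG_ARRAY_* enumerators in the
patch spell out one by one.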


>> --- libgomp/oacc-parallel.c	(revision 277827)
>> +++ libgomp/oacc-parallel.c	(working copy)
>> +static inline void
>> +revert_noncontig_array_map_pointers (size_t mapnum, void **hostaddrs,
>> +				     unsigned short *kinds)
>> +{
>> +  for (int i = 0; i < mapnum; i++)
>> +    {
>> +      if (GOMP_MAP_NONCONTIG_ARRAY_P (kinds[i] & 0xff))
>> +	hostaddrs[i] = *((void **)hostaddrs[i]);
> Can we be (or, do we make) sure that 'hostaddrs' will never be in
> read-only memory?
> 
> And, it's permissible to alter 'hostaddrs'?
> 
> Ah, other code (including 'libgomp/target.c') is doing such things, too,
> so it must be fine.

The hostaddrs[] array is the 'receiver' record built on stack by omp-low,
so it should always be safe to modify, I think.
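
For reference, a small sketch of the host-mode revert step under discussion.
It assumes the descriptor's first field holds the original array pointer,
which is what the 'hostaddrs[i] = *((void **)hostaddrs[i])' line in the patch
implies; 'struct revert_descr', 'REVERT_NCA_FLAG' and 'revert_sketch' are
hypothetical names, and the real descriptor layout may differ:

```c
#include <stddef.h>

/* Hypothetical stand-in for the non-contiguous array flag bit.  */
#define REVERT_NCA_FLAG (1 << 5)

/* Assumed descriptor shape: the original array pointer comes first.  */
struct revert_descr
{
  void *array;
  size_t ndims;
};

/* Replace each leading descriptor pointer in the (writable, on-stack)
   hostaddrs[] array with the array pointer stored in the descriptor.
   Non-contiguous entries are assumed to be placed first, so the first
   other map kind ends the loop.  */
static void
revert_sketch (size_t mapnum, void **hostaddrs, const unsigned short *kinds)
{
  for (size_t i = 0; i < mapnum; i++)
    {
      if (((kinds[i] & 0xff) & REVERT_NCA_FLAG) == 0)
	break;
      hostaddrs[i] = *(void **) hostaddrs[i];
    }
}
```

Entries after the first ordinary map kind are left untouched, so host
fallback code sees the same pointers it would have seen without the
descriptor.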

Thanks again for the review!
Chung-Lin


end of thread, other threads:[~2019-11-12 12:35 UTC | newest]

Thread overview: 24+ messages
2017-01-10  8:27 [gomp4] Support multi-dimensional pointer based arrays in OpenACC data clauses Chung-Lin Tang
2018-10-16 12:56 ` [PATCH, OpenACC, 0/8] Multi-dimensional dynamic array support for " Chung-Lin Tang
2018-10-16 12:56   ` [PATCH, OpenACC, 1/8] Multi-dimensional dynamic array support for OpenACC data clauses, gomp-constants.h additions Chung-Lin Tang
2018-10-16 12:57     ` [PATCH, OpenACC, 2/8] Multi-dimensional dynamic array support for OpenACC data clauses, C/C++ front-end parts Chung-Lin Tang
2018-10-16 12:57       ` [PATCH, OpenACC, 3/8] Multi-dimensional dynamic array support for OpenACC data clauses, gimplify patch Chung-Lin Tang
2018-10-16 13:13         ` [PATCH, OpenACC, 4/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: dynamic array descriptor creation Chung-Lin Tang
2018-10-16 13:54           ` [PATCH, OpenACC, 5/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: bias scanning/adjustment during omp-lowering Chung-Lin Tang
2018-10-16 14:11             ` [PATCH, OpenACC, 6/8] Multi-dimensional dynamic array support for OpenACC data clauses, tree pretty-printing additions Chung-Lin Tang
2018-10-16 14:20               ` [PATCH, OpenACC, 7/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp support Chung-Lin Tang
2018-10-16 14:28                 ` [PATCH, OpenACC, 8/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp testsuite additions Chung-Lin Tang
2019-08-20 11:54                   ` [PATCH, OpenACC, 1/3] Non-contiguous array support for OpenACC data clauses (re-submission), front-end patches Chung-Lin Tang
2019-08-20 12:01                     ` [PATCH, OpenACC, 2/3] Non-contiguous array support for OpenACC data clauses (re-submission), compiler patches Chung-Lin Tang
2019-08-20 12:16                       ` [PATCH, OpenACC, 3/3] Non-contiguous array support for OpenACC data clauses (re-submission), libgomp patches Chung-Lin Tang
2019-10-07 13:58                         ` Thomas Schwinge
2019-11-05 14:36                         ` [PATCH, OpenACC, v2] Non-contiguous array support for OpenACC data clauses Chung-Lin Tang
2019-11-07  0:49                           ` Thomas Schwinge
2019-11-12 12:42                             ` Chung-Lin Tang
2019-10-07 13:51                     ` [PATCH, OpenACC, 1/3] Non-contiguous array support for OpenACC data clauses (re-submission), front-end patches Thomas Schwinge
2018-10-16 14:49                 ` [PATCH, OpenACC, 7/8] Multi-dimensional dynamic array support for OpenACC data clauses, libgomp support Jakub Jelinek
2018-12-06 14:20                   ` Chung-Lin Tang
2018-12-06 14:43                     ` Jakub Jelinek
2018-12-13 14:52                       ` Chung-Lin Tang
2018-12-13 14:52           ` [PATCH, OpenACC, 4/8] Multi-dimensional dynamic array support for OpenACC data clauses, omp-low: dynamic array descriptor creation Chung-Lin Tang
2018-12-18 12:51             ` Jakub Jelinek
