From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <burnus@sourceware.org>
Received: by sourceware.org (Postfix, from userid 1534)
	id 74EFF3858421; Thu,  8 Sep 2022 12:25:29 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 74EFF3858421
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1662639929;
	bh=tCYTACjlGpnqUueR6jEQsiryJDst9Mq96/JsA9lkP0g=;
	h=From:To:Subject:Date:From;
	b=Gv4rISH2bTT/9PEdG3EFOhZNrt6oDAr9rPiASa8VFDfC+fMLexF8qKvkjHgpYYAKQ
	 IFdx7c3uHD+4OPVpwgre3u8UdS4D839lyOHFfDt0c4/S2Q12r1SUAvE2SZa+MWsAdF
	 EEJLs2SfHX8ayofWxTQ/B6CdrSbP20FUYqkgnA34=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Tobias Burnus <burnus@gcc.gnu.org>
To: gcc-cvs@gcc.gnu.org
Subject: [gcc/devel/omp/gcc-12] openmp: Implement doacross(sink:
 omp_cur_iteration - 1)
X-Act-Checkin: gcc
X-Git-Author: Jakub Jelinek <jakub@redhat.com>
X-Git-Refname: refs/heads/devel/omp/gcc-12
X-Git-Oldrev: b79c973bddadb6ac8103c73a85025c997576415d
X-Git-Newrev: b52d3705e27a5ca62ffaeda867392cbf417b0082
Message-Id: <20220908122529.74EFF3858421@sourceware.org>
Date: Thu,  8 Sep 2022 12:25:29 +0000 (GMT)
List-Id: <gcc-cvs.sourceware.org>

https://gcc.gnu.org/g:b52d3705e27a5ca62ffaeda867392cbf417b0082

commit b52d3705e27a5ca62ffaeda867392cbf417b0082
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Sep 8 14:06:38 2022 +0200

    openmp: Implement doacross(sink: omp_cur_iteration - 1)
    
    This patch implements doacross(sink: omp_cur_iteration - 1) that the
    previous patchset emitted a sorry on during omp expansion.
    It can be implemented with existing library functions.
    
    To recap, depend(source)/doacross(source:)/doacross(source:omp_cur_iteration)
    is implemented calling GOMP_doacross_post or GOMP_doacross_ull_post,
    called with an array of long or unsigned long long elements, one for
    all collapsed loops together and one for each further ordered loop if any.
    We initialize that array in each thread when grabbing further set of iterations
    and update it at the end of loops, so that it represents the current iteration
    (as 0 based counters).  When the worksharing loop is created, we tell the
    library through another similar array the counts (the loop needs to be
    rectangular) in each dimension, first element is count of all logical iterations
    in the collapsed loops.
    
    depend(sink:v1 op N1, v2 op N2, ...) is then implemented by conditionally calling
    GOMP_doacross_wait/GOMP_doacross_ull_wait.  For N? of 0 there is no check,
    otherwise if it wants to wait in a particular dimension for a previous iteration,
    we check that the corresponding iterator isn't the first one (or first few),
    where the previous iterator in that dimension would be out of range, and similarly
    for checking of next iteration in a dimension that it isn't the last one (or last few)
    where it would be similarly out of bounds.  Then the collapsed loop counters are
    folded into a single 0 based counter (first argument) and then other 0 based
    iterations counters on what iteration it should wait for.
    
    Now, doacross(sink: omp_cur_iteration - 1) is supposed to wait for the previous
    logical iteration in the combined iteration space of all ordered loops.
    For the very first iteration in that combined iteration space it does nothing,
    there is no previous iteration.  And similarly it does nothing if there
    are more ordered loops than collapsed loop and it isn't the first logical
    iteration of the combined loops inside of the collapsed loops, because as implemented
    we know the previous iteration in that case is always executed by the same thread
    as the current one.
    In the implementation, we use the same value as is stored in the first element
    of the array for GOMP_doacross_post/GOMP_doacross_ull_post, if that value is 0,
    we do nothing.  The rest is different based on if ordered argument is equal to
    collapse or not.  If it is, then we otherwise call
    GOMP_doacross_wait/GOMP_doacross_ull_wait with a single argument, one less than
    that counter we compare against 0.
    If ordered argument is bigger than collapse, we add a per-thread boolean variable
    .first.N, which we set to true at the start of the outermost ordered loop inside
    of the collapsed set of loops and set to false at the end of the innermost
    ordered loop.  If .first.N is false, we don't do anything (we know the previous
    iteration was handled by the current thread and by my reading of the spec we don't
    need to emit even a memory barrier in that case, because it is just synchronization
    with the same thread), otherwise we call GOMP_doacross_wait/GOMP_doacross_ull_wait
    with the first argument one less than the counter we compare against 0, and then
    one less than 2nd and following counts if iterations we pass to the workshare
    initialization.  If say .counts.N passed to the workshare initialization is
    { 256, 13, 5, 2 } for collapse(3) ordered(6) loop, then
    GOMP_doacross_post/GOMP_doacross_ull_post is called with arguments equal to
    .ordereda.N[0] - 1, 12, 4, 1.
    
    2022-09-08  Jakub Jelinek  <jakub@redhat.com>
    
    gcc/
            * omp-expand.cc (expand_omp_ordered_sink): Add CONT_BB argument.
            Add doacross(sink:omp_cur_iteration-1) support.
            (expand_omp_ordered_source_sink): Clear counts[fd->ordered + 1].
            Adjust expand_omp_ordered_sink caller.
            (expand_omp_for_ordered_loops): If counts[fd->ordered + 1] is
            non-NULL, set that variable to true at the start of outermost
            non-collapsed loop and set it to false at the end of innermost
            ordered loop.
            (expand_omp_for_generic): If fd->ordered, allocate
            1 + (fd->ordered - fd->collapse) further elements in counts array.
            Copy to counts + 2 + fd->ordered the counts of fd->collapse ..
            fd->ordered - 1 loop if any.
    gcc/testsuite/
            * c-c++-common/gomp/doacross-7.c: New test.
    libgomp/
            * libgomp.texi (OpenMP 5.2): Mention that omp_cur_iteration is now
            fully supported.
            * testsuite/libgomp.c/doacross-4.c: New test.
            * testsuite/libgomp.c/doacross-5.c: New test.
            * testsuite/libgomp.c/doacross-6.c: New test.
            * testsuite/libgomp.c/doacross-7.c: New test.
    
    (cherry picked from commit f25a6767ec28780a3e3b6d38f5e54e1122a26fd6)

Diff:
---
 gcc/ChangeLog.omp                            |  20 ++-
 gcc/omp-expand.cc                            | 117 +++++++++++++-
 gcc/testsuite/ChangeLog.omp                  |   7 +
 gcc/testsuite/c-c++-common/gomp/doacross-7.c |  78 +++++++++
 libgomp/ChangeLog.omp                        |  12 ++
 libgomp/libgomp.texi                         |   3 +-
 libgomp/testsuite/libgomp.c/doacross-4.c     | 228 ++++++++++++++++++++++++++
 libgomp/testsuite/libgomp.c/doacross-5.c     | 198 +++++++++++++++++++++++
 libgomp/testsuite/libgomp.c/doacross-6.c     | 231 +++++++++++++++++++++++++++
 libgomp/testsuite/libgomp.c/doacross-7.c     | 231 +++++++++++++++++++++++++++
 10 files changed, 1116 insertions(+), 9 deletions(-)

diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
index a2dbe630bab..f7a0a77780e 100644
--- a/gcc/ChangeLog.omp
+++ b/gcc/ChangeLog.omp
@@ -1,4 +1,22 @@
-2022-09-07  Tobias Burnus  <tobias@codesourcery.com>
+2022-09-08  Tobias Burnus  <tobias@codesourcery.com>
+
+	Backport from mainline:
+	2022-09-08  Jakub Jelinek  <jakub@redhat.com>
+
+	* omp-expand.cc (expand_omp_ordered_sink): Add CONT_BB argument.
+	Add doacross(sink:omp_cur_iteration-1) support.
+	(expand_omp_ordered_source_sink): Clear counts[fd->ordered + 1].
+	Adjust expand_omp_ordered_sink caller.
+	(expand_omp_for_ordered_loops): If counts[fd->ordered + 1] is
+	non-NULL, set that variable to true at the start of outermost
+	non-collapsed loop and set it to false at the end of innermost
+	ordered loop.
+	(expand_omp_for_generic): If fd->ordered, allocate
+	1 + (fd->ordered - fd->collapse) further elements in counts array.
+	Copy to counts + 2 + fd->ordered the counts of fd->collapse ..
+	fd->ordered - 1 loop if any.
+
+2022-09-08  Tobias Burnus  <tobias@codesourcery.com>
 
 	Backport from mainline:
 	2022-09-06  Jakub Jelinek  <jakub@redhat.com>
diff --git a/gcc/omp-expand.cc b/gcc/omp-expand.cc
index a178ef7126f..fa7d3649c83 100644
--- a/gcc/omp-expand.cc
+++ b/gcc/omp-expand.cc
@@ -3295,7 +3295,8 @@ expand_omp_ordered_source (gimple_stmt_iterator *gsi, struct omp_for_data *fd,
 
 static void
 expand_omp_ordered_sink (gimple_stmt_iterator *gsi, struct omp_for_data *fd,
-			 tree *counts, tree c, location_t loc)
+			 tree *counts, tree c, location_t loc,
+			 basic_block cont_bb)
 {
   auto_vec<tree, 10> args;
   enum built_in_function sink_ix
@@ -3308,7 +3309,93 @@ expand_omp_ordered_sink (gimple_stmt_iterator *gsi, struct omp_for_data *fd,
 
   if (deps == NULL)
     {
-      sorry_at (loc, "%<doacross(sink:omp_cur_iteration-1)%> not supported yet");
+      /* Handle doacross(sink: omp_cur_iteration - 1).  */
+      gsi_prev (&gsi2);
+      edge e1 = split_block (gsi_bb (gsi2), gsi_stmt (gsi2));
+      edge e2 = split_block_after_labels (e1->dest);
+      gsi2 = gsi_after_labels (e1->dest);
+      *gsi = gsi_last_bb (e1->src);
+      gimple_stmt_iterator gsi3 = *gsi;
+
+      if (counts[fd->collapse - 1])
+	{
+	  gcc_assert (fd->collapse == 1);
+	  t = counts[fd->collapse - 1];
+	}
+      else if (fd->collapse > 1)
+	t = fd->loop.v;
+      else
+	{
+	  t = fold_build2 (MINUS_EXPR, TREE_TYPE (fd->loops[0].v),
+			   fd->loops[0].v, fd->loops[0].n1);
+	  t = fold_convert (fd->iter_type, t);
+	}
+
+      t = force_gimple_operand_gsi (gsi, t, true, NULL_TREE,
+				    false, GSI_CONTINUE_LINKING);
+      gsi_insert_after (gsi, gimple_build_cond (NE_EXPR, t,
+						build_zero_cst (TREE_TYPE (t)),
+						NULL_TREE, NULL_TREE),
+			GSI_NEW_STMT);
+
+      t = fold_build2 (PLUS_EXPR, TREE_TYPE (t), t,
+		       build_minus_one_cst (TREE_TYPE (t)));
+      t = force_gimple_operand_gsi (&gsi2, t, true, NULL_TREE,
+				    true, GSI_SAME_STMT);
+      args.safe_push (t);
+      for (i = fd->collapse; i < fd->ordered; i++)
+	{
+	  t = counts[fd->ordered + 2 + (i - fd->collapse)];
+	  t = fold_build2 (PLUS_EXPR, TREE_TYPE (t), t,
+			   build_minus_one_cst (TREE_TYPE (t)));
+	  t = fold_convert (fd->iter_type, t);
+	  t = force_gimple_operand_gsi (&gsi2, t, true, NULL_TREE,
+					true, GSI_SAME_STMT);
+	  args.safe_push (t);
+	}
+
+      gimple *g = gimple_build_call_vec (builtin_decl_explicit (sink_ix),
+					 args);
+      gimple_set_location (g, loc);
+      gsi_insert_before (&gsi2, g, GSI_SAME_STMT);
+
+      edge e3 = make_edge (e1->src, e2->dest, EDGE_FALSE_VALUE);
+      e3->probability = profile_probability::guessed_always ().apply_scale (1, 8);
+      e1->probability = e3->probability.invert ();
+      e1->flags = EDGE_TRUE_VALUE;
+      set_immediate_dominator (CDI_DOMINATORS, e2->dest, e1->src);
+
+      if (fd->ordered > fd->collapse && cont_bb)
+	{
+	  if (counts[fd->ordered + 1] == NULL_TREE)
+	    counts[fd->ordered + 1]
+	      = create_tmp_var (boolean_type_node, ".first");
+
+	  edge e4;
+	  if (gsi_end_p (gsi3))
+	    e4 = split_block_after_labels (e1->src);
+	  else
+	    {
+	      gsi_prev (&gsi3);
+	      e4 = split_block (gsi_bb (gsi3), gsi_stmt (gsi3));
+	    }
+	  gsi3 = gsi_last_bb (e4->src);
+
+	  gsi_insert_after (&gsi3,
+			    gimple_build_cond (NE_EXPR,
+					       counts[fd->ordered + 1],
+					       boolean_false_node,
+					       NULL_TREE, NULL_TREE),
+			    GSI_NEW_STMT);
+
+	  edge e5 = make_edge (e4->src, e2->dest, EDGE_FALSE_VALUE);
+	  e4->probability = profile_probability::guessed_always ().apply_scale (1, 8);
+	  e5->probability = e4->probability.invert ();
+	  e4->flags = EDGE_TRUE_VALUE;
+	  set_immediate_dominator (CDI_DOMINATORS, e2->dest, e4->src);
+	}
+
+      *gsi = gsi_after_labels (e2->dest);
       return;
     }
   for (i = 0; i < fd->ordered; i++)
@@ -3566,6 +3653,7 @@ expand_omp_ordered_source_sink (struct omp_region *region,
     = build_array_type_nelts (fd->iter_type, fd->ordered - fd->collapse + 1);
   counts[fd->ordered] = create_tmp_var (atype, ".orditera");
   TREE_ADDRESSABLE (counts[fd->ordered]) = 1;
+  counts[fd->ordered + 1] = NULL_TREE;
 
   for (inner = region->inner; inner; inner = inner->next)
     if (inner->type == GIMPLE_OMP_ORDERED)
@@ -3583,7 +3671,7 @@ expand_omp_ordered_source_sink (struct omp_region *region,
 	for (c = gimple_omp_ordered_clauses (ord_stmt);
 	     c; c = OMP_CLAUSE_CHAIN (c))
 	  if (OMP_CLAUSE_DOACROSS_KIND (c) == OMP_CLAUSE_DOACROSS_SINK)
-	    expand_omp_ordered_sink (&gsi, fd, counts, c, loc);
+	    expand_omp_ordered_sink (&gsi, fd, counts, c, loc, cont_bb);
 	gsi_remove (&gsi, true);
       }
 }
@@ -3619,6 +3707,9 @@ expand_omp_for_ordered_loops (struct omp_for_data *fd, tree *counts,
     {
       tree t, type = TREE_TYPE (fd->loops[i].v);
       gimple_stmt_iterator gsi = gsi_after_labels (body_bb);
+      if (counts[fd->ordered + 1] && i == fd->collapse)
+	expand_omp_build_assign (&gsi, counts[fd->ordered + 1],
+				 boolean_true_node);
       expand_omp_build_assign (&gsi, fd->loops[i].v,
 			       fold_convert (type, fd->loops[i].n1));
       if (counts[i])
@@ -3666,6 +3757,9 @@ expand_omp_for_ordered_loops (struct omp_for_data *fd, tree *counts,
 			 size_int (i - fd->collapse + 1),
 			 NULL_TREE, NULL_TREE);
 	  expand_omp_build_assign (&gsi, aref, t);
+	  if (counts[fd->ordered + 1] && i == fd->ordered - 1)
+	    expand_omp_build_assign (&gsi, counts[fd->ordered + 1],
+				     boolean_false_node);
 	  gsi_prev (&gsi);
 	  e2 = split_block (cont_bb, gsi_stmt (gsi));
 	  new_header = e2->dest;
@@ -3924,7 +4018,10 @@ expand_omp_for_generic (struct omp_region *region,
       int first_zero_iter1 = -1, first_zero_iter2 = -1;
       basic_block zero_iter1_bb = NULL, zero_iter2_bb = NULL, l2_dom_bb = NULL;
 
-      counts = XALLOCAVEC (tree, fd->ordered ? fd->ordered + 1 : fd->collapse);
+      counts = XALLOCAVEC (tree, fd->ordered
+				 ? fd->ordered + 2
+				   + (fd->ordered - fd->collapse)
+				 : fd->collapse);
       expand_omp_for_init_counts (fd, &gsi, entry_bb, counts,
 				  zero_iter1_bb, first_zero_iter1,
 				  zero_iter2_bb, first_zero_iter2, l2_dom_bb);
@@ -4383,13 +4480,21 @@ expand_omp_for_generic (struct omp_region *region,
   if (fd->ordered)
     {
       /* Until now, counts array contained number of iterations or
-	 variable containing it for ith loop.  From now on, we need
+	 variable containing it for ith loop.  From now on, we usually need
 	 those counts only for collapsed loops, and only for the 2nd
 	 till the last collapsed one.  Move those one element earlier,
 	 we'll use counts[fd->collapse - 1] for the first source/sink
 	 iteration counter and so on and counts[fd->ordered]
 	 as the array holding the current counter values for
-	 depend(source).  */
+	 depend(source).  For doacross(sink:omp_cur_iteration - 1) we need
+	 the counts from fd->collapse to fd->ordered - 1; make a copy of
+	 those to counts[fd->ordered + 2] and onwards.
+	 counts[fd->ordered + 1] can be a flag whether it is the first
+	 iteration with a new collapsed counter (used only if
+	 fd->ordered > fd->collapse).  */
+      if (fd->ordered > fd->collapse)
+	memcpy (counts + fd->ordered + 2, counts + fd->collapse,
+		(fd->ordered - fd->collapse) * sizeof (counts[0]));
       if (fd->collapse > 1)
 	memmove (counts, counts + 1, (fd->collapse - 1) * sizeof (counts[0]));
       if (broken_loop)
diff --git a/gcc/testsuite/ChangeLog.omp b/gcc/testsuite/ChangeLog.omp
index ed42b20b099..5fccb81aed8 100644
--- a/gcc/testsuite/ChangeLog.omp
+++ b/gcc/testsuite/ChangeLog.omp
@@ -1,3 +1,10 @@
+2022-09-08  Tobias Burnus  <tobias@codesourcery.com>
+
+	Backport from mainline:
+	2022-09-08  Jakub Jelinek  <jakub@redhat.com>
+
+	* c-c++-common/gomp/doacross-7.c: New test.
+
 2022-09-07  Tobias Burnus  <tobias@codesourcery.com>
 
 	Backport from mainline:
diff --git a/gcc/testsuite/c-c++-common/gomp/doacross-7.c b/gcc/testsuite/c-c++-common/gomp/doacross-7.c
new file mode 100644
index 00000000000..8ead1679522
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/doacross-7.c
@@ -0,0 +1,78 @@
+void
+foo (int l)
+{
+  int i, j, k;
+  #pragma omp parallel
+  {
+    #pragma omp for schedule(static) ordered (3)
+    for (i = 2; i < 256 / 16 - 1; i++)
+      for (j = 0; j < 8; j += 2)
+	for (k = 1; k <= 3; k++)
+	  {
+	    #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+	    #pragma omp ordered doacross(source:)
+	  }
+    #pragma omp for schedule(static) ordered (3) collapse(2)
+    for (i = 2; i < 256 / 16 - 1; i++)
+      for (j = 0; j < 8; j += 2)
+	for (k = 1; k <= 3; k++)
+	  {
+	    #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+	    #pragma omp ordered doacross(source:)
+	  }
+    #pragma omp for schedule(static) ordered (3) collapse(3)
+    for (i = 2; i < 256 / 16 - 1; i++)
+      for (j = 0; j < 8; j += 2)
+	for (k = 1; k <= 3; k++)
+	  {
+	    #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+	    #pragma omp ordered doacross(source: omp_cur_iteration)
+	  }
+    #pragma omp for schedule(static) ordered (1) nowait
+    for (i = 2; i < 256 / 16 - 1; i += l)
+      {
+	#pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+	#pragma omp ordered doacross(source:)
+      }
+  }
+}
+
+void
+bar (int l, int m, int n, int o)
+{
+  int i, j, k;
+  #pragma omp for schedule(static) ordered (3)
+  for (i = 2; i < 256 / 16 - 1; i++)
+    for (j = 0; j < m; j += n)
+      for (k = o; k <= 3; k++)
+	{
+	  foo (l);
+	  #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+	  #pragma omp ordered doacross(source:omp_cur_iteration)
+	}
+  #pragma omp for schedule(static) ordered (3) collapse(2)
+  for (i = 2; i < 256 / 16 - m; i += n)
+    for (j = 0; j < 8; j += o)
+      for (k = 1; k <= 3; k++)
+	{
+	  foo (l);
+	  #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+	  #pragma omp ordered doacross(source : omp_cur_iteration)
+	}
+  #pragma omp for schedule(static) ordered (3) collapse(3)
+  for (i = m; i < 256 / 16 - 1; i++)
+    for (j = 0; j < n; j += 2)
+      for (k = 1; k <= o; k++)
+	{
+	  foo (l);
+	  #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+	  #pragma omp ordered doacross(source :)
+	}
+  #pragma omp for schedule(static) ordered
+  for (i = m; i < n / 16 - 1; i += l)
+    {
+      foo (l);
+      #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+      #pragma omp ordered doacross(source: omp_cur_iteration)
+    }
+}
diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index b5075dfec57..725142ca718 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,3 +1,15 @@
+2022-09-08  Tobias Burnus  <tobias@codesourcery.com>
+
+	Backport from mainline:
+	2022-09-08  Jakub Jelinek  <jakub@redhat.com>
+
+	* libgomp.texi (OpenMP 5.2): Mention that omp_cur_iteration is now
+	fully supported.
+	* testsuite/libgomp.c/doacross-4.c: New test.
+	* testsuite/libgomp.c/doacross-5.c: New test.
+	* testsuite/libgomp.c/doacross-6.c: New test.
+	* testsuite/libgomp.c/doacross-7.c: New test.
+
 2022-09-08  Tobias Burnus  <tobias@codesourcery.com>
 
 	Backport from mainline:
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 452eb112bd6..5e7ec253f10 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -397,8 +397,7 @@ to address of matching mapped list item per 5.1, Sect. 2.21.7.2 @tab N @tab
       @code{source}/@code{sink} modifier @tab Y @tab
 @item Deprecation of @code{depend} with @code{source}/@code{sink} modifier
       @tab N @tab
-@item @code{omp_cur_iteration} keyword @tab P
-      @tab @code{sink: omp_cur_iteration - 1} unsupported
+@item @code{omp_cur_iteration} keyword @tab Y @tab
 @end multitable
 
 @unnumberedsubsec Other new OpenMP 5.2 features
diff --git a/libgomp/testsuite/libgomp.c/doacross-4.c b/libgomp/testsuite/libgomp.c/doacross-4.c
new file mode 100644
index 00000000000..4bc451d5b69
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/doacross-4.c
@@ -0,0 +1,228 @@
+extern void abort (void);
+
+#define N 256
+int a[N], b[N / 16][8][4], c[N / 32][8][8], g[N / 16][8][6];
+volatile int d, e;
+volatile unsigned long long f;
+
+int
+main ()
+{
+  unsigned long long i;
+  int j, k, l, m;
+  #pragma omp parallel private (l)
+  {
+    #pragma omp for schedule(static, 1) ordered nowait
+    for (i = 1; i < N + f; i++)
+      {
+	#pragma omp atomic write
+	a[i] = 1;
+	#pragma omp ordered doacross(sink: i - 1)
+	if (i > 1)
+	  {
+	    #pragma omp atomic read
+	    l = a[i - 1];
+	    if (l < 2)
+	      abort ();
+	  }
+	#pragma omp atomic write
+	a[i] = 2;
+	if (i < N - 1)
+	  {
+	    #pragma omp atomic read
+	    l = a[i + 1];
+	    if (l == 3)
+	      abort ();
+	  }
+	#pragma omp ordered doacross(source : omp_cur_iteration)
+	#pragma omp atomic write
+	a[i] = 3;
+      }
+    #pragma omp for schedule(static) ordered (3) nowait
+    for (i = 3; i < N / 16 - 1 + f; i++)
+      for (j = 0; j < 8; j += 2)
+	for (k = 1; k <= 3; k++)
+	  {
+	    #pragma omp atomic write
+	    b[i][j][k] = 1;
+	    #pragma omp ordered doacross(sink: i, j - 2, k - 1) \
+				doacross(sink: i - 2, j - 2, k + 1)
+	    #pragma omp ordered doacross(sink: i - 3, j + 2, k - 2)
+	    if (j >= 2 && k > 1)
+	      {
+		#pragma omp atomic read
+		l = b[i][j - 2][k - 1];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp atomic write
+	    b[i][j][k] = 2;
+	    if (i >= 5 && j >= 2 && k < 3)
+	      {
+		#pragma omp atomic read
+		l = b[i - 2][j - 2][k + 1];
+		if (l < 2)
+		  abort ();
+	      }
+	    if (i >= 6 && j < N / 16 - 3 && k == 3)
+	      {
+		#pragma omp atomic read
+		l = b[i - 3][j + 2][k - 2];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp ordered doacross(source : )
+	    #pragma omp atomic write
+	    b[i][j][k] = 3;
+	  }
+#define A(n) int n;
+#define B(n) A(n##0) A(n##1) A(n##2) A(n##3)
+#define C(n) B(n##0) B(n##1) B(n##2) B(n##3)
+#define D(n) C(n##0) C(n##1) C(n##2) C(n##3)
+    D(m)
+#undef A
+    #pragma omp for collapse (2) ordered(61) schedule(dynamic, 15)
+    for (i = 2; i < N / 32 + f; i++)
+      for (j = 7; j > 1; j--)
+	for (k = 6; k >= 0; k -= 2)
+#define A(n) for (n = 4; n < 5; n++)
+	  D(m)
+#undef A
+	    {
+	      #pragma omp atomic write
+	      c[i][j][k] = 1;
+#define A(n) ,n
+#define E(n) C(n##0) C(n##1) C(n##2) B(n##30) B(n##31) A(n##320) A(n##321)
+	      #pragma omp ordered doacross (sink: i, j, k + 2 E(m)) \
+				  doacross (sink:i - 2, j + 1, k - 4 E(m)) \
+				  doacross(sink: i - 1, j - 2, k - 2 E(m))
+	      if (k <= 4)
+		{
+		  #pragma omp atomic read
+		  l = c[i][j][k + 2];
+		  if (l < 2)
+		    abort ();
+		}
+	      #pragma omp atomic write
+	      c[i][j][k] = 2;
+	      if (i >= 4 && j < 7 && k >= 4)
+		{
+		  #pragma omp atomic read
+		  l = c[i - 2][j + 1][k - 4];
+		  if (l < 2)
+		    abort ();
+		}
+	      if (i >= 3 && j >= 4 && k >= 2)
+		{
+		  #pragma omp atomic read
+		  l = c[i - 1][j - 2][k - 2];
+		  if (l < 2)
+		    abort ();
+		}
+	      #pragma omp ordered doacross (source : omp_cur_iteration)
+	      #pragma omp atomic write
+	      c[i][j][k] = 3;
+	    }
+    #pragma omp for schedule(static) ordered (3) nowait
+    for (j = 0; j < N / 16 - 1; j++)
+      for (k = 0; k < 8; k += 2)
+	for (i = 3; i <= 5 + f; i++)
+	  {
+	    #pragma omp atomic write
+	    g[j][k][i] = 1;
+	    #pragma omp ordered doacross(sink: j, k - 2, i - 1) \
+				doacross(sink: j - 2, k - 2, i + 1)
+	    #pragma omp ordered doacross(sink: j - 3, k + 2, i - 2)
+	    if (k >= 2 && i > 3)
+	      {
+		#pragma omp atomic read
+		l = g[j][k - 2][i - 1];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp atomic write
+	    g[j][k][i] = 2;
+	    if (j >= 2 && k >= 2 && i < 5)
+	      {
+		#pragma omp atomic read
+		l = g[j - 2][k - 2][i + 1];
+		if (l < 2)
+		  abort ();
+	      }
+	    if (j >= 3 && k < N / 16 - 3 && i == 5)
+	      {
+		#pragma omp atomic read
+		l = g[j - 3][k + 2][i - 2];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp ordered doacross(source :)
+	    #pragma omp atomic write
+	    g[j][k][i] = 3;
+	  }
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k)
+    for (i = 2; i < f + 3; i++)
+      for (j = d + 1; j >= 0; j--)
+	for (k = 0; k < d; k++)
+	  for (l = 0; l < d + 2; l++)
+	    {
+	      #pragma omp ordered doacross (source : omp_cur_iteration)
+	      #pragma omp ordered doacross (sink:i - 2, j + 2, k - 2, l)
+	      if (!e)
+		abort ();
+	    }
+    #pragma omp single
+    {
+      if (i != 3 || j != -1 || k != 0)
+	abort ();
+      i = 8; j = 9; k = 10;
+    }
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k, m)
+    for (i = 2; i < f + 3; i++)
+      for (j = d + 1; j >= 0; j--)
+	for (k = 0; k < d + 2; k++)
+	  for (m = 0; m < d; m++)
+	    {
+	      #pragma omp ordered doacross (source:)
+	      #pragma omp ordered doacross (sink:i - 2, j + 2, k - 2, m)
+	      abort ();
+	    }
+    #pragma omp single
+    if (i != 3 || j != -1 || k != 2 || m != 0)
+      abort ();
+    #pragma omp for collapse(2) ordered(4) nowait
+    for (i = 2; i < f + 3; i++)
+      for (j = d; j > 0; j--)
+	for (k = 0; k < d + 2; k++)
+	  for (l = 0; l < d + 4; l++)
+	    {
+	      #pragma omp ordered doacross (source : omp_cur_iteration)
+	      #pragma omp ordered doacross (sink:i - 2, j + 2, k - 2, l)
+	      if (!e)
+		abort ();
+	    }
+    #pragma omp for nowait
+    for (i = 0; i < N; i++)
+      if (a[i] != 3 * (i >= 1))
+	abort ();
+    #pragma omp for collapse(2) private(k) nowait
+    for (i = 0; i < N / 16; i++)
+      for (j = 0; j < 8; j++)
+	for (k = 0; k < 4; k++)
+	  if (b[i][j][k] != 3 * (i >= 3 && i < N / 16 - 1 && (j & 1) == 0 && k >= 1))
+	    abort ();
+    #pragma omp for collapse(3) nowait
+    for (i = 0; i < N / 32; i++)
+      for (j = 0; j < 8; j++)
+	for (k = 0; k < 8; k++)
+	  if (c[i][j][k] != 3 * (i >= 2 && j >= 2 && (k & 1) == 0))
+	    abort ();
+    #pragma omp for collapse(2) private(k) nowait
+    for (i = 0; i < N / 16; i++)
+      for (j = 0; j < 8; j++)
+	for (k = 0; k < 6; k++)
+	  if (g[i][j][k] != 3 * (i < N / 16 - 1 && (j & 1) == 0 && k >= 3))
+	    abort ();
+  }
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/doacross-5.c b/libgomp/testsuite/libgomp.c/doacross-5.c
new file mode 100644
index 00000000000..7bb1993647c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/doacross-5.c
@@ -0,0 +1,198 @@
+extern void abort (void);
+
+#define N 256
+int a[N], b[N / 16][8][4], c[N / 32][8][8];
+volatile int d, e;
+
+int
+main ()
+{
+  int i, j, k, l, m;
+  #pragma omp parallel private (l)
+  {
+    #pragma omp for schedule(static, 1) ordered (1) nowait
+    for (i = 0; i < N; i++)
+      {
+	#pragma omp atomic write
+	a[i] = 1;
+	#pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+	if (i)
+	  {
+	    #pragma omp atomic read
+	    l = a[i - 1];
+	    if (l < 2)
+	      abort ();
+	  }
+	#pragma omp atomic write
+	a[i] = 2;
+	if (i < N - 1)
+	  {
+	    #pragma omp atomic read
+	    l = a[i + 1];
+	    if (l == 3)
+	      abort ();
+	  }
+	#pragma omp ordered doacross(source :)
+	#pragma omp atomic write
+	a[i] = 3;
+      }
+    #pragma omp for schedule(static) ordered (3) nowait
+    for (i = 2; i < N / 16 - 1; i++)
+      for (j = 0; j < 8; j += 2)
+	for (k = 1; k <= 3; k++)
+	  {
+	    #pragma omp atomic write
+	    b[i][j][k] = 1;
+	    #pragma omp ordered doacross(sink: omp_cur_iteration - 1) \
+				doacross(sink: i - 2, j - 2, k + 1)
+	    #pragma omp ordered doacross(sink: i - 3, j + 2, k - 2)
+	    if (i != 2 || j || k != 1)
+	      {
+		if (k != 1)
+		  #pragma omp atomic read
+		  l = b[i][j][k - 1];
+		else if (j)
+		  #pragma omp atomic read
+		  l = b[i][j - 2][3];
+		else
+		  #pragma omp atomic read
+		  l = b[i - 1][6][3];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp atomic write
+	    b[i][j][k] = 2;
+	    if (i >= 4 && j >= 2 && k < 3)
+	      {
+		#pragma omp atomic read
+		l = b[i - 2][j - 2][k + 1];
+		if (l < 2)
+		  abort ();
+	      }
+	    if (i >= 5 && j < N / 16 - 3 && k == 3)
+	      {
+		#pragma omp atomic read
+		l = b[i - 3][j + 2][k - 2];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp ordered doacross(source : omp_cur_iteration)
+	    #pragma omp atomic write
+	    b[i][j][k] = 3;
+	  }
+#define A(n) int n;
+#define B(n) A(n##0) A(n##1) A(n##2) A(n##3)
+#define C(n) B(n##0) B(n##1) B(n##2) B(n##3)
+#define D(n) C(n##0) C(n##1) C(n##2) C(n##3)
+    D(m)
+#undef A
+    #pragma omp for collapse (2) ordered(61) schedule(dynamic, 15)
+    for (i = 0; i < N / 32; i++)
+      for (j = 7; j > 1; j--)
+	for (k = 6; k >= 0; k -= 2)
+#define A(n) for (n = 4; n < 5; n++)
+	  D(m)
+#undef A
+	    {
+	      #pragma omp atomic write
+	      c[i][j][k] = 1;
+#define A(n) ,n
+#define E(n) C(n##0) C(n##1) C(n##2) B(n##30) B(n##31) A(n##320) A(n##321)
+	      #pragma omp ordered doacross (sink: i, j, k + 2 E(m)) \
+				  doacross (sink:omp_cur_iteration - 1) \
+				  doacross(sink: i - 1, j - 2, k - 2 E(m))
+	      if (k <= 4)
+		{
+		  #pragma omp atomic read
+		  l = c[i][j][k + 2];
+		  if (l < 2)
+		    abort ();
+		}
+	      #pragma omp atomic write
+	      c[i][j][k] = 2;
+	      if (i || j != 7 && k != 6)
+		{
+		  if (k != 6)
+		    #pragma omp atomic read
+		    l = c[i][j][k + 2];
+		  else if (j != 7)
+		    #pragma omp atomic read
+		    l = c[i][j + 1][0];
+		  else
+		    #pragma omp atomic read
+		    l = c[i - 1][2][0];
+		  if (l < 2)
+		    abort ();
+		}
+	      if (i >= 1 && j >= 4 && k >= 2)
+		{
+		  #pragma omp atomic read
+		  l = c[i - 1][j - 2][k - 2];
+		  if (l < 2)
+		    abort ();
+		}
+	      #pragma omp ordered doacross (source: )
+	      #pragma omp atomic write
+	      c[i][j][k] = 3;
+	    }
+
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k)
+    for (i = 0; i < d + 1; i++)
+      for (j = d + 1; j >= 0; j--)
+	for (k = 0; k < d; k++)
+	  for (l = 0; l < d + 2; l++)
+	    {
+	      #pragma omp ordered doacross (source : omp_cur_iteration)
+	      #pragma omp ordered doacross (sink: omp_cur_iteration - 1)
+	      if (!e)
+		abort ();
+	    }
+    #pragma omp single
+    {
+      if (i != 1 || j != -1 || k != 0)
+	abort ();
+      i = 8; j = 9; k = 10;
+    }
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k, m)
+    for (i = 0; i < d + 1; i++)
+      for (j = d + 1; j >= 0; j--)
+	for (k = 0; k < d + 2; k++)
+	  for (m = 0; m < d; m++)
+	    {
+	      #pragma omp ordered doacross (source : )
+	      #pragma omp ordered doacross (sink:omp_cur_iteration - 1)
+	      abort ();
+	    }
+    #pragma omp single
+    if (i != 1 || j != -1 || k != 2 || m != 0)
+      abort ();
+    #pragma omp for collapse(2) ordered(4) nowait
+    for (i = 0; i < d + 1; i++)
+      for (j = d; j > 0; j--)
+	for (k = 0; k < d + 2; k++)
+	  for (l = 0; l < d + 4; l++)
+	    {
+	      #pragma omp ordered doacross (source : omp_cur_iteration)
+	      #pragma omp ordered doacross (sink:omp_cur_iteration - 1)
+	      if (!e)
+		abort ();
+	    }
+    #pragma omp for nowait
+    for (i = 0; i < N; i++)
+      if (a[i] != 3)
+	abort ();
+    #pragma omp for collapse(2) private(k) nowait
+    for (i = 0; i < N / 16; i++)
+      for (j = 0; j < 8; j++)
+	for (k = 0; k < 4; k++)
+	  if (b[i][j][k] != 3 * (i >= 2 && i < N / 16 - 1 && (j & 1) == 0 && k >= 1))
+	    abort ();
+    #pragma omp for collapse(3) nowait
+    for (i = 0; i < N / 32; i++)
+      for (j = 0; j < 8; j++)
+	for (k = 0; k < 8; k++)
+	  if (c[i][j][k] != 3 * (j >= 2 && (k & 1) == 0))
+	    abort ();
+  }
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/doacross-6.c b/libgomp/testsuite/libgomp.c/doacross-6.c
new file mode 100644
index 00000000000..de5dee6f607
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/doacross-6.c
@@ -0,0 +1,231 @@
+extern void abort (void);
+
+#define N 256
+int a[N], b[N / 16][8][4], c[N / 32][8][8], g[N / 16][8][6];
+volatile int d, e;
+volatile unsigned long long f;
+
+int
+main ()
+{
+  unsigned long long i;
+  int j, k, l, m;
+  #pragma omp parallel private (l)
+  {
+    #pragma omp for schedule(static, 1) ordered (1) nowait
+    for (i = 1; i < N + f; i++)
+      {
+	#pragma omp atomic write
+	a[i] = 1;
+	#pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+	if (i > 1)
+	  {
+	    #pragma omp atomic read
+	    l = a[i - 1];
+	    if (l < 2)
+	      abort ();
+	  }
+	#pragma omp atomic write
+	a[i] = 2;
+	if (i < N - 1)
+	  {
+	    #pragma omp atomic read
+	    l = a[i + 1];
+	    if (l == 3)
+	      abort ();
+	  }
+	#pragma omp ordered doacross(source : omp_cur_iteration)
+	#pragma omp atomic write
+	a[i] = 3;
+      }
+    #pragma omp for schedule(static) ordered (3) nowait
+    for (i = 3; i < N / 16 - 1 + f; i++)
+      for (j = 0; j < 8; j += 2)
+	for (k = 1; k <= 3; k++)
+	  {
+	    #pragma omp atomic write
+	    b[i][j][k] = 1;
+	    #pragma omp ordered doacross(sink: i, j - 2, k - 1) \
+				doacross(sink: i - 2, j - 2, k + 1)
+	    #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+	    if (j >= 2 && k > 1)
+	      {
+		#pragma omp atomic read
+		l = b[i][j - 2][k - 1];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp atomic write
+	    b[i][j][k] = 2;
+	    if (i >= 5 && j >= 2 && k < 3)
+	      {
+		#pragma omp atomic read
+		l = b[i - 2][j - 2][k + 1];
+		if (l < 2)
+		  abort ();
+	      }
+	    if (i != 3 || j || k != 1)
+	      {
+		if (k != 1)
+		  #pragma omp atomic read
+		  l = b[i][j][k - 1];
+		else if (j)
+		  #pragma omp atomic read
+		  l = b[i][j - 2][3];
+		else
+		  #pragma omp atomic read
+		  l = b[i - 1][6][3];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp ordered doacross(source:)
+	    #pragma omp atomic write
+	    b[i][j][k] = 3;
+	  }
+#define A(n) int n;
+#define B(n) A(n##0) A(n##1) A(n##2) A(n##3)
+#define C(n) B(n##0) B(n##1) B(n##2) B(n##3)
+#define D(n) C(n##0) C(n##1) C(n##2) C(n##3)
+    D(m)
+#undef A
+    #pragma omp for collapse (2) ordered(61) schedule(dynamic, 15)
+    for (i = 2; i < N / 32 + f; i++)
+      for (j = 7; j > 1; j--)
+	for (k = 6; k >= 0; k -= 2)
+#define A(n) for (n = 4; n < 5; n++)
+	  D(m)
+#undef A
+	    {
+	      #pragma omp atomic write
+	      c[i][j][k] = 1;
+	      #pragma omp ordered doacross (sink: omp_cur_iteration - 1)
+	      if (i != 2 || j != 7 || k != 6)
+		{
+		  if (k != 6)
+		    #pragma omp atomic read
+		    l = c[i][j][k + 2];
+		  else if (j != 7)
+		    #pragma omp atomic read
+		    l = c[i][j + 1][0];
+		  else
+		    #pragma omp atomic read
+		    l = c[i - 1][2][0];
+		  if (l < 2)
+		    abort ();
+		}
+	      #pragma omp atomic write
+	      c[i][j][k] = 2;
+	      #pragma omp ordered doacross (source:)
+	      #pragma omp atomic write
+	      c[i][j][k] = 3;
+	    }
+    #pragma omp for schedule(static) ordered (3) nowait
+    for (j = 0; j < N / 16 - 1; j++)
+      for (k = 0; k < 8; k += 2)
+	for (i = 3; i <= 5 + f; i++)
+	  {
+	    #pragma omp atomic write
+	    g[j][k][i] = 1;
+	    #pragma omp ordered doacross(sink: j, k - 2, i - 1) \
+				doacross(sink: omp_cur_iteration - 1)
+	    #pragma omp ordered doacross(sink: j - 3, k + 2, i - 2)
+	    if (k >= 2 && i > 3)
+	      {
+		#pragma omp atomic read
+		l = g[j][k - 2][i - 1];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp atomic write
+	    g[j][k][i] = 2;
+	    if (j || k || i != 3)
+	      {
+		if (i != 3)
+		  #pragma omp atomic read
+		  l = g[j][k][i - 1];
+		else if (k)
+		  #pragma omp atomic read
+		  l = g[j][k - 2][5 + f];
+		else
+		  #pragma omp atomic read
+		  l = g[j - 1][6][5 + f];
+		if (l < 2)
+		  abort ();
+	      }
+	    if (j >= 3 && k < N / 16 - 3 && i == 5)
+	      {
+		#pragma omp atomic read
+		l = g[j - 3][k + 2][i - 2];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp ordered doacross(source : omp_cur_iteration)
+	    #pragma omp atomic write
+	    g[j][k][i] = 3;
+	  }
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k)
+    for (i = 2; i < f + 3; i++)
+      for (j = d + 1; j >= 0; j--)
+	for (k = 0; k < d; k++)
+	  for (l = 0; l < d + 2; l++)
+	    {
+	      #pragma omp ordered doacross (source : omp_cur_iteration)
+	      #pragma omp ordered doacross (sink:omp_cur_iteration - 1)
+	      if (!e)
+		abort ();
+	    }
+    #pragma omp single
+    {
+      if (i != 3 || j != -1 || k != 0)
+	abort ();
+      i = 8; j = 9; k = 10;
+    }
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k, m)
+    for (i = 2; i < f + 3; i++)
+      for (j = d + 1; j >= 0; j--)
+	for (k = 0; k < d + 2; k++)
+	  for (m = 0; m < d; m++)
+	    {
+	      #pragma omp ordered doacross (source : omp_cur_iteration)
+	      #pragma omp ordered doacross (sink:omp_cur_iteration - 1)
+	      abort ();
+	    }
+    #pragma omp single
+    if (i != 3 || j != -1 || k != 2 || m != 0)
+      abort ();
+    #pragma omp for collapse(2) ordered(4) nowait
+    for (i = 2; i < f + 3; i++)
+      for (j = d; j > 0; j--)
+	for (k = 0; k < d + 2; k++)
+	  for (l = 0; l < d + 4; l++)
+	    {
+	      #pragma omp ordered doacross (source:)
+	      #pragma omp ordered doacross (sink:omp_cur_iteration-1)
+	      if (!e)
+		abort ();
+	    }
+    #pragma omp for nowait
+    for (i = 0; i < N; i++)
+      if (a[i] != 3 * (i >= 1))
+	abort ();
+    #pragma omp for collapse(2) private(k) nowait
+    for (i = 0; i < N / 16; i++)
+      for (j = 0; j < 8; j++)
+	for (k = 0; k < 4; k++)
+	  if (b[i][j][k] != 3 * (i >= 3 && i < N / 16 - 1 && (j & 1) == 0 && k >= 1))
+	    abort ();
+    #pragma omp for collapse(3) nowait
+    for (i = 0; i < N / 32; i++)
+      for (j = 0; j < 8; j++)
+	for (k = 0; k < 8; k++)
+	  if (c[i][j][k] != 3 * (i >= 2 && j >= 2 && (k & 1) == 0))
+	    abort ();
+    #pragma omp for collapse(2) private(k) nowait
+    for (i = 0; i < N / 16; i++)
+      for (j = 0; j < 8; j++)
+	for (k = 0; k < 6; k++)
+	  if (g[i][j][k] != 3 * (i < N / 16 - 1 && (j & 1) == 0 && k >= 3))
+	    abort ();
+  }
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/doacross-7.c b/libgomp/testsuite/libgomp.c/doacross-7.c
new file mode 100644
index 00000000000..f0ad38849dd
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/doacross-7.c
@@ -0,0 +1,231 @@
+extern void abort (void);
+
+#define N 256
+int a[N], b[N / 16][8][4], c[N / 32][8][8], g[N / 16][8][6];
+volatile int d, e;
+volatile unsigned long long f;
+
+int
+main ()
+{
+  unsigned long long i;
+  int j, k, l, m;
+  #pragma omp parallel private (l)
+  {
+    #pragma omp for schedule(guided, 3) ordered (1) nowait
+    for (i = 1; i < N + f; i++)
+      {
+	#pragma omp atomic write
+	a[i] = 1;
+	#pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+	if (i > 1)
+	  {
+	    #pragma omp atomic read
+	    l = a[i - 1];
+	    if (l < 2)
+	      abort ();
+	  }
+	#pragma omp atomic write
+	a[i] = 2;
+	if (i < N - 1)
+	  {
+	    #pragma omp atomic read
+	    l = a[i + 1];
+	    if (l == 3)
+	      abort ();
+	  }
+	#pragma omp ordered doacross(source : omp_cur_iteration)
+	#pragma omp atomic write
+	a[i] = 3;
+      }
+    #pragma omp for schedule(guided) ordered (3) nowait
+    for (i = 3; i < N / 16 - 1 + f; i++)
+      for (j = 0; j < 8; j += 2)
+	for (k = 1; k <= 3; k++)
+	  {
+	    #pragma omp atomic write
+	    b[i][j][k] = 1;
+	    #pragma omp ordered doacross(sink: i, j - 2, k - 1) \
+				doacross(sink: i - 2, j - 2, k + 1)
+	    #pragma omp ordered doacross(sink: omp_cur_iteration - 1)
+	    if (j >= 2 && k > 1)
+	      {
+		#pragma omp atomic read
+		l = b[i][j - 2][k - 1];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp atomic write
+	    b[i][j][k] = 2;
+	    if (i >= 5 && j >= 2 && k < 3)
+	      {
+		#pragma omp atomic read
+		l = b[i - 2][j - 2][k + 1];
+		if (l < 2)
+		  abort ();
+	      }
+	    if (i != 3 || j || k != 1)
+	      {
+		if (k != 1)
+		  #pragma omp atomic read
+		  l = b[i][j][k - 1];
+		else if (j)
+		  #pragma omp atomic read
+		  l = b[i][j - 2][3];
+		else
+		  #pragma omp atomic read
+		  l = b[i - 1][6][3];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp ordered doacross(source:)
+	    #pragma omp atomic write
+	    b[i][j][k] = 3;
+	  }
+#define A(n) int n;
+#define B(n) A(n##0) A(n##1) A(n##2) A(n##3)
+#define C(n) B(n##0) B(n##1) B(n##2) B(n##3)
+#define D(n) C(n##0) C(n##1) C(n##2) C(n##3)
+    D(m)
+#undef A
+    #pragma omp for collapse (2) ordered(61) schedule(guided, 15)
+    for (i = 2; i < N / 32 + f; i++)
+      for (j = 7; j > 1; j--)
+	for (k = 6; k >= 0; k -= 2)
+#define A(n) for (n = 4; n < 5; n++)
+	  D(m)
+#undef A
+	    {
+	      #pragma omp atomic write
+	      c[i][j][k] = 1;
+	      #pragma omp ordered doacross (sink: omp_cur_iteration - 1)
+	      if (i != 2 || j != 7 || k != 6)
+		{
+		  if (k != 6)
+		    #pragma omp atomic read
+		    l = c[i][j][k + 2];
+		  else if (j != 7)
+		    #pragma omp atomic read
+		    l = c[i][j + 1][0];
+		  else
+		    #pragma omp atomic read
+		    l = c[i - 1][2][0];
+		  if (l < 2)
+		    abort ();
+		}
+	      #pragma omp atomic write
+	      c[i][j][k] = 2;
+	      #pragma omp ordered doacross (source:)
+	      #pragma omp atomic write
+	      c[i][j][k] = 3;
+	    }
+    #pragma omp for schedule(guided, 5) ordered (3) nowait
+    for (j = 0; j < N / 16 - 1; j++)
+      for (k = 0; k < 8; k += 2)
+	for (i = 3; i <= 5 + f; i++)
+	  {
+	    #pragma omp atomic write
+	    g[j][k][i] = 1;
+	    #pragma omp ordered doacross(sink: j, k - 2, i - 1) \
+				doacross(sink: omp_cur_iteration - 1)
+	    #pragma omp ordered doacross(sink: j - 3, k + 2, i - 2)
+	    if (k >= 2 && i > 3)
+	      {
+		#pragma omp atomic read
+		l = g[j][k - 2][i - 1];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp atomic write
+	    g[j][k][i] = 2;
+	    if (j || k || i != 3)
+	      {
+		if (i != 3)
+		  #pragma omp atomic read
+		  l = g[j][k][i - 1];
+		else if (k)
+		  #pragma omp atomic read
+		  l = g[j][k - 2][5 + f];
+		else
+		  #pragma omp atomic read
+		  l = g[j - 1][6][5 + f];
+		if (l < 2)
+		  abort ();
+	      }
+	    if (j >= 3 && k < N / 16 - 3 && i == 5)
+	      {
+		#pragma omp atomic read
+		l = g[j - 3][k + 2][i - 2];
+		if (l < 2)
+		  abort ();
+	      }
+	    #pragma omp ordered doacross(source : omp_cur_iteration)
+	    #pragma omp atomic write
+	    g[j][k][i] = 3;
+	  }
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k)
+    for (i = 2; i < f + 3; i++)
+      for (j = d + 1; j >= 0; j--)
+	for (k = 0; k < d; k++)
+	  for (l = 0; l < d + 2; l++)
+	    {
+	      #pragma omp ordered doacross (source : omp_cur_iteration)
+	      #pragma omp ordered doacross (sink:omp_cur_iteration - 1)
+	      if (!e)
+		abort ();
+	    }
+    #pragma omp single
+    {
+      if (i != 3 || j != -1 || k != 0)
+	abort ();
+      i = 8; j = 9; k = 10;
+    }
+    #pragma omp for collapse(2) ordered(4) lastprivate (i, j, k, m)
+    for (i = 2; i < f + 3; i++)
+      for (j = d + 1; j >= 0; j--)
+	for (k = 0; k < d + 2; k++)
+	  for (m = 0; m < d; m++)
+	    {
+	      #pragma omp ordered doacross (source : omp_cur_iteration)
+	      #pragma omp ordered doacross (sink:omp_cur_iteration - 1)
+	      abort ();
+	    }
+    #pragma omp single
+    if (i != 3 || j != -1 || k != 2 || m != 0)
+      abort ();
+    #pragma omp for collapse(2) ordered(4) nowait
+    for (i = 2; i < f + 3; i++)
+      for (j = d; j > 0; j--)
+	for (k = 0; k < d + 2; k++)
+	  for (l = 0; l < d + 4; l++)
+	    {
+	      #pragma omp ordered doacross (source:)
+	      #pragma omp ordered doacross (sink:omp_cur_iteration-1)
+	      if (!e)
+		abort ();
+	    }
+    #pragma omp for nowait
+    for (i = 0; i < N; i++)
+      if (a[i] != 3 * (i >= 1))
+	abort ();
+    #pragma omp for collapse(2) private(k) nowait
+    for (i = 0; i < N / 16; i++)
+      for (j = 0; j < 8; j++)
+	for (k = 0; k < 4; k++)
+	  if (b[i][j][k] != 3 * (i >= 3 && i < N / 16 - 1 && (j & 1) == 0 && k >= 1))
+	    abort ();
+    #pragma omp for collapse(3) nowait
+    for (i = 0; i < N / 32; i++)
+      for (j = 0; j < 8; j++)
+	for (k = 0; k < 8; k++)
+	  if (c[i][j][k] != 3 * (i >= 2 && j >= 2 && (k & 1) == 0))
+	    abort ();
+    #pragma omp for collapse(2) private(k) nowait
+    for (i = 0; i < N / 16; i++)
+      for (j = 0; j < 8; j++)
+	for (k = 0; k < 6; k++)
+	  if (g[i][j][k] != 3 * (i < N / 16 - 1 && (j & 1) == 0 && k >= 3))
+	    abort ();
+  }
+  return 0;
+}