public inbox for fortran@gcc.gnu.org
 help / color / mirror / Atom feed
* [Patch] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
@ 2023-01-19 14:40 Tobias Burnus
  2023-01-20 17:39 ` Jakub Jelinek
  0 siblings, 1 reply; 8+ messages in thread
From: Tobias Burnus @ 2023-01-19 14:40 UTC (permalink / raw)
  To: gcc-patches, fortran, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 2218 bytes --]

This is all about non-rectangular loop nests in OpenMP.

The attached patch depends on the obvious fix for https://gcc.gnu.org/PR108459,
which is together with a nice testcase in Jakub's WIP patch attached to the PR;
without, gfortran.dg/gomp/canonical-loop-1.f90 fails with an ICE (segfault).

My patch fixes part of the Fortran issues found. Namely, it ensures that a "regular"
non-rectangular loop nest actually works by passing the outer-loop-var, the multiplier
and offset in a TREE_VEC to the middle end. It additionally avoids pointlessly
creating a temporary variable for a VAR_DECL (main advantage: dump looks cleaner and
avoids some dependency analysis) - and likewise for 'step' given that 'step' was
evaluated before.

There is an additional issue - not quite addressed in this patch: There are cases
when a loop variable is replaced by another variable ('count') and then at the beginning
of the loop body, the original variable gets the value from the count variable. Obviously,
this no longer works with non-rectangular loop nests.
The 'count' appears in two cases: (a) when the iteration step is not 1 or -1 and (b) if
the iteration variable is a pointer (scalar with allocatable, pointer, optional argument
or just a dummy argument; oddly, even if it has the value attribute).

There is pending work to be done in this case, as mentioned in comment 6 and 8 of the PR.
This patch adds some 'sorry' messages for them. I hope and think that I have not missed
a case where 'count' is used which I did not catch, but I should have all or at least most.

OK for mainline, once the other patch has been committed?

Tobias

PS: I still need to verify that everything is fine, once the other patch has been committed.
A flaky mainboard on the laptop causes multiple random freezes per day, which makes testing
+ patch writing a bit harder. (At least the mainboard replacement is scheduled for tomorrow :-) )
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Attachment #2: fix-omp-non-rect.diff --]
[-- Type: text/x-patch, Size: 44909 bytes --]

OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]

This patch ensures that loop bounds depending on outer loop vars use the
proper TREE_VEC format. It additionally gives a sorry if such an outer
var has a non-one/non-minus-one increment as currently a count variable
is used in this case (see PR).

gcc/fortran/ChangeLog:

	PR fortran/107424
	* trans-openmp.cc (gfc_nonrect_loop_expr): New.
	(gfc_trans_omp_do): Call it for start/end loop bound
	for non-rectangular loop nests.

gcc/testsuite/

	PR fortran/107424
	* gfortran.dg/gomp/non-rectangular-loop-3.f90: New test.

libgomp/ChangeLog:

	PR fortran/107424
	* testsuite/libgomp.fortran/non-rectangular-loop-1.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-2.f90: New test.

 gcc/fortran/trans-openmp.cc                        | 167 +++++-
 .../gfortran.dg/gomp/non-rectangular-loop-3.f90    |  85 +++
 .../libgomp.fortran/non-rectangular-loop-1.f90     | 637 +++++++++++++++++++++
 .../libgomp.fortran/non-rectangular-loop-1a.f90    | 374 ++++++++++++
 .../libgomp.fortran/non-rectangular-loop-2.f90     | 243 ++++++++
 5 files changed, 1495 insertions(+), 11 deletions(-)

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 87213de0918..73376894316 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -5120,6 +5120,136 @@ typedef struct dovar_init_d {
   tree init;
 } dovar_init;
 
+static bool
+gfc_nonrect_loop_expr (stmtblock_t *pblock, gfc_se *sep, int loop_n,
+		       gfc_code *code, gfc_expr *expr, vec<dovar_init> *inits)
+{
+  int i;
+  for (i = 0; i < loop_n; i++)
+    {
+      gcc_assert (code->ext.iterator->var->expr_type == EXPR_VARIABLE);
+      if (gfc_find_sym_in_expr (code->ext.iterator->var->symtree->n.sym, expr))
+	break;
+      code = code->block->next;
+    }
+  if (i >= loop_n)
+    return false;
+
+  /* Canonic format: TREE_VEC with [var, multiplier, offset].  */
+  gfc_symbol *var = code->ext.iterator->var->symtree->n.sym;
+
+  gfc_se se;
+  tree tree_var, a1, a2;
+  a1 = integer_one_node;
+  a2 = integer_zero_node;
+
+  gfc_init_se (&se, NULL);
+  gfc_conv_expr_lhs (&se, code->ext.iterator->var);
+  gfc_add_block_to_block (pblock, &se.pre);
+  tree_var = se.expr;
+
+  {
+    /* FIXME: Handle non-unity iterations, cf. PR fortran/107424.
+       The issue is that for those a 'count' variable is used.  */
+    dovar_init *di;
+    unsigned ix;
+    tree t = tree_var;
+    while (TREE_CODE (t) == INDIRECT_REF)
+      t = TREE_OPERAND (t, 0);
+    FOR_EACH_VEC_ELT (*inits, ix, di)
+      {
+	tree t2 = di->var;
+	while (TREE_CODE (t2) == INDIRECT_REF)
+	  t2 = TREE_OPERAND (t2, 0);
+	if (t == t2)
+	  {
+	    HOST_WIDE_INT intval;
+	    if (gfc_extract_hwi (code->ext.iterator->step, &intval, 0) == 0
+		&& intval != 1 && intval != -1)
+	      sorry_at (gfc_get_location (&code->loc),
+			"non-rectangular loop nest with non-unit loop iteration"
+			" step for %qs", var->name);
+	    else
+	      sorry_at (gfc_get_location (&code->loc),
+			"non-rectangular loop nest with dummy-argument or "
+			"pointer, optional or allocatable do-variable %qs",
+			var->name);
+
+	    inform (gfc_get_location (&expr->where), "Used here");
+	    return false;
+	  }
+      }
+  }
+
+  if (expr->expr_type == EXPR_VARIABLE)
+    gcc_assert (expr->symtree->n.sym == var);
+  else if (expr->expr_type != EXPR_OP
+	   || (expr->value.op.op != INTRINSIC_TIMES
+	       && expr->value.op.op != INTRINSIC_PLUS
+	       && expr->value.op.op != INTRINSIC_MINUS))
+    gcc_unreachable ();
+  else
+    {
+      gfc_expr *et = NULL, *eo = NULL, *e = expr;
+      if (expr->value.op.op != INTRINSIC_TIMES)
+	{
+	  if (gfc_find_sym_in_expr (var, expr->value.op.op1))
+	    {
+	      e = expr->value.op.op1;
+	      eo = expr->value.op.op2;
+	    }
+	  else
+	    {
+	      eo = expr->value.op.op1;
+	      e = expr->value.op.op2;
+	    }
+	}
+      if (e->value.op.op == INTRINSIC_TIMES)
+	{
+	  if (e->value.op.op1->expr_type == EXPR_VARIABLE
+	      && e->value.op.op1->symtree->n.sym == var)
+	    et = e->value.op.op2;
+	  else
+	    {
+	      et = e->value.op.op1;
+	      gcc_assert (e->value.op.op2->expr_type == EXPR_VARIABLE
+			  && e->value.op.op2->symtree->n.sym == var);
+	    }
+	}
+      else
+	gcc_assert (e->expr_type == EXPR_VARIABLE && e->symtree->n.sym == var);
+      if (et != NULL)
+	{
+	  gfc_init_se (&se, NULL);
+	  gfc_conv_expr_val (&se, et);
+	  gfc_add_block_to_block (pblock, &se.pre);
+	  a1 = se.expr;
+	}
+      if (eo != NULL)
+	{
+	  gfc_init_se (&se, NULL);
+	  gfc_conv_expr_val (&se, eo);
+	  gfc_add_block_to_block (pblock, &se.pre);
+	  a2 = se.expr;
+	  if (expr->value.op.op == INTRINSIC_MINUS && expr->value.op.op2 == eo)
+	    /* outer-var - a2.  */
+	    a2 = fold_build1 (NEGATE_EXPR, TREE_TYPE (a2), a2);
+	  else if (expr->value.op.op == INTRINSIC_MINUS)
+	    /* a2 - outer-var.  */
+	    a1 = fold_build1 (NEGATE_EXPR, TREE_TYPE (a1), a1);
+	}
+      a1 = DECL_P (a1) ? a1 : gfc_evaluate_now (a1, pblock);
+      a2 = DECL_P (a2) ? a2 : gfc_evaluate_now (a2, pblock);
+    }
+
+  gfc_init_se (sep, NULL);
+  sep->expr = make_tree_vec (3);
+  TREE_VEC_ELT (sep->expr, 0) = tree_var;
+  TREE_VEC_ELT (sep->expr, 1) = fold_convert (TREE_TYPE (tree_var), a1);
+  TREE_VEC_ELT (sep->expr, 2) = fold_convert (TREE_TYPE (tree_var), a2);
+
+  return true;
+}
 
 static tree
 gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
@@ -5219,19 +5349,35 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
       gcc_assert (TREE_CODE (type) == INTEGER_TYPE);
 
       gfc_init_se (&se, NULL);
-      gfc_conv_expr_val (&se, code->ext.iterator->start);
-      gfc_add_block_to_block (pblock, &se.pre);
-      from = gfc_evaluate_now (se.expr, pblock);
+      if (!clauses->non_rectangular
+	  || !gfc_nonrect_loop_expr (pblock, &se, i, orig_code->block->next,
+				     code->ext.iterator->start, &inits))
+	{
+	  gfc_conv_expr_val (&se, code->ext.iterator->start);
+	  gfc_add_block_to_block (pblock, &se.pre);
+	  if (!DECL_P (se.expr))
+	    se.expr = gfc_evaluate_now (se.expr, pblock);
+	}
+      from = se.expr;
 
       gfc_init_se (&se, NULL);
-      gfc_conv_expr_val (&se, code->ext.iterator->end);
-      gfc_add_block_to_block (pblock, &se.pre);
-      to = gfc_evaluate_now (se.expr, pblock);
+      if (!clauses->non_rectangular
+	  || !gfc_nonrect_loop_expr (pblock, &se, i, orig_code->block->next,
+				     code->ext.iterator->end, &inits))
+	{
+	  gfc_conv_expr_val (&se, code->ext.iterator->end);
+	  gfc_add_block_to_block (pblock, &se.pre);
+	  if (!DECL_P (se.expr))
+	    se.expr = gfc_evaluate_now (se.expr, pblock);
+	}
+      to = se.expr;
 
       gfc_init_se (&se, NULL);
       gfc_conv_expr_val (&se, code->ext.iterator->step);
       gfc_add_block_to_block (pblock, &se.pre);
-      step = gfc_evaluate_now (se.expr, pblock);
+      if (!DECL_P (se.expr))
+	se.expr = gfc_evaluate_now (se.expr, pblock);
+      step = se.expr;
       dovar_decl = dovar;
 
       /* Special case simple loops.  */
@@ -5331,9 +5477,9 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 	      OMP_CLAUSE_LINEAR_NO_COPYIN (tmp) = 1;
 	      OMP_CLAUSE_DECL (tmp) = dovar_decl;
 	      omp_clauses = gfc_trans_add_clause (tmp, omp_clauses);
+	      if (!simple)
+		dovar_found = 3;
 	    }
-	  if (!simple)
-	    dovar_found = 3;
 	}
       else if (!dovar_found && !simple)
 	{
@@ -5367,9 +5513,8 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 		}
 	      else
 		{
-		  tmp = gfc_evaluate_now (step, pblock);
 		  tmp = fold_build2_loc (input_location, PLUS_EXPR, type,
-					 dovar, tmp);
+					 dovar, step);
 		}
 	      tmp = fold_build2_loc (input_location, MODIFY_EXPR, type,
 				     dovar, tmp);
diff --git a/gcc/testsuite/gfortran.dg/gomp/non-rectangular-loop-3.f90 b/gcc/testsuite/gfortran.dg/gomp/non-rectangular-loop-3.f90
new file mode 100644
index 00000000000..5c8e92589dc
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/non-rectangular-loop-3.f90
@@ -0,0 +1,85 @@
+! PR fortran/107424
+
+subroutine foo (av, avo, a0, a0o, a1, a2, a3, a4)
+implicit none
+
+integer, value :: av
+integer, value :: avo
+integer :: a0
+integer :: a0o
+integer, pointer :: a1
+integer, pointer, optional :: a2
+integer, allocatable :: a3
+integer, allocatable, optional :: a4
+integer :: a5
+integer, pointer :: a6
+integer, allocatable :: a7
+
+integer :: j
+
+!$omp simd collapse(2)
+do av = 1, 10  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with dummy-argument or pointer, optional or allocatable do-variable 'av'" }
+  do j = av, 20  ! { dg-note "Used here" }
+  end do
+end do
+
+!$omp simd collapse(2)
+do avo = 1, 10  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with dummy-argument or pointer, optional or allocatable do-variable 'avo'" }
+  do j = avo, 20  ! { dg-note "Used here" }
+  end do
+end do
+
+!$omp simd collapse(2)
+do a0 = 1, 10  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with dummy-argument or pointer, optional or allocatable do-variable 'a0'" }
+  do j = a0, 20  ! { dg-note "Used here" }
+  end do
+end do
+
+!$omp simd collapse(2)
+do a0o = 1, 10  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with dummy-argument or pointer, optional or allocatable do-variable 'a0o'" }
+  do j = a0o, 20  ! { dg-note "Used here" }
+  end do
+end do
+
+!$omp simd collapse(2)
+do a1 = 1, 10  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with dummy-argument or pointer, optional or allocatable do-variable 'a1'" }
+  do j = a1, 20  ! { dg-note "Used here" }
+  end do
+end do
+
+!$omp simd collapse(2)
+do a2 = 1, 10  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with dummy-argument or pointer, optional or allocatable do-variable 'a2'" }
+  do j = a2, 20  ! { dg-note "Used here" }
+  end do
+end do
+
+!$omp simd collapse(2)
+do a3 = 1, 10  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with dummy-argument or pointer, optional or allocatable do-variable 'a3'" }
+  do j = a3, 20  ! { dg-note "Used here" }
+  end do
+end do
+
+!$omp simd collapse(2)
+do a4 = 1, 10  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with dummy-argument or pointer, optional or allocatable do-variable 'a4'" }
+  do j = a4, 20  ! { dg-note "Used here" }
+  end do
+end do
+
+!$omp simd collapse(2)
+do a5 = 1, 10
+  do j = a5, 20
+  end do
+end do
+
+!$omp simd collapse(2)
+do a6 = 1, 10  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with dummy-argument or pointer, optional or allocatable do-variable 'a6'" }
+  do j = a6, 20  ! { dg-note "Used here" }
+  end do
+end do
+
+!$omp simd collapse(2)
+do a7 = 1, 10  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with dummy-argument or pointer, optional or allocatable do-variable 'a7'" }
+  do j = a7, 20  ! { dg-note "Used here" }
+  end do
+end do
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1.f90
new file mode 100644
index 00000000000..99bfa1a0c98
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1.f90
@@ -0,0 +1,637 @@
+! { dg-do run }
+! { dg-additional-options "-msse2" { target sse2_runtime } }
+! { dg-additional-options "-mavx" { target avx_runtime } }
+
+! PR fortran/107424
+
+! Nonrectangular loop nests checks
+
+! See PR or non-rectangular-loop-1a.f90 for the commented tests
+! Hint: Those use strides for loop vars part of nonrectangular loop nests
+
+module m
+  implicit none (type, external)
+contains
+
+! The 'k' loop uses i or j as start value
+! but a constant end value such that 'lastprivate'
+! should be well-defined
+subroutine lastprivate_check_simd_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unite stride on 'i' or on 'j' -> 4 loops
+  ! Then same, execpt use nonunit stride for 'k'
+
+!  !$omp simd collapse(3) lastprivate(k)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp simd collapse(3) lastprivate(k)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) then
+          print *, i, j, k,p, " -> i, j, k, p   (k < 1 - 41 .or. k > p)"
+          error stop
+        end if
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+!  !$omp simd collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!
+!  !$omp simd collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+!  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 1 .or. j /= m + 2) error stop
+
+!  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+end subroutine lastprivate_check_simd_1
+
+
+! Same but with do simd
+subroutine lastprivate_check_do_simd_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unite stride on 'i' or on 'j' -> 4 loops
+  ! Then same, execpt use nonunit stride for 'k'
+
+!  !$omp parallel do simd collapse(3) lastprivate(k)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp parallel do simd collapse(3) lastprivate(k)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+!  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+!  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 1 .or. j /= m + 2) error stop
+
+!  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+end subroutine lastprivate_check_do_simd_1
+
+
+
+! Same but with do
+subroutine lastprivate_check_do_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unite stride on 'i' or on 'j' -> 4 loops
+  ! Then same, execpt use nonunit stride for 'k'
+
+!  !$omp parallel do collapse(3) lastprivate(k)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp parallel do collapse(3) lastprivate(k)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+!  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+!  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 1 .or. j /= m + 2) error stop
+
+!  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+end subroutine lastprivate_check_do_1
+
+
+
+subroutine lastprivate_check_2
+  integer :: n,m,p, i,j,k,ll
+
+  n = 11
+  m = 23
+  p = 27
+
+!  !$omp parallel do simd collapse(3) lastprivate(p)
+!  do i = 1, n
+!    do j = 1, m,2
+!      do k = 1, j + 41
+!        do ll = 1, p, 2
+!          if (k > 23 + 41 .or. k < 1) error stop
+!        end do
+!      end do
+!    end do
+!  end do
+!  if (ll /= 29) error stop
+
+!  !$omp simd collapse(3) lastprivate(p)
+!  do i = 1, n
+!    do j = 1, m,2
+!      do k = 1, j + 41
+!        do ll = 1, p, 2
+!          if (k > 23 + 41 .or. k < 1) error stop
+!        end do
+!      end do
+!    end do
+!  end do
+!  if (ll /= 29) error stop
+
+!  !$omp simd collapse(3) lastprivate(k)
+!  do i = 1, n,2
+!    do j = 1, m
+!      do k = 1, i + 41
+!        if (k > 11 + 41 .or. k < 1) error stop
+!      end do
+!    end do
+!  end do
+!if (k /= 53) then
+!  print *, k, 53
+!  error stop
+!endif
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n,2
+  do j = 1, m
+    do k = 1, j + 41
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n
+  do j = 1, m,2
+    do k = 1, i + 41
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+! - Same but without 'private':
+!!$omp simd collapse(3) lastprivate(k)
+!do i = 1, n
+!  do j = 1, m,2
+!    do k = 1, j + 41
+!      if (k > 23 + 41 .or. k < 1) error stop
+!    end do
+!  end do
+!end do
+!if (k /= 65) then
+!  print *, k, 65
+!  error stop
+!endif
+
+
+!!$omp simd collapse(3) lastprivate(k)
+!do i = 1, n,2
+!  do j = 1, m
+!    do k = 1, i + 41
+!      if (k > 11 + 41 .or. k < 1) error stop
+!    end do
+!  end do
+!end do
+!if (k /= 53) then
+!  print *, k, 53
+!  error stop
+!endif
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n,2
+  do j = 1, m
+    do k = 1, j + 41
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n
+  do j = 1, m,2
+    do k = 1, i + 41
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+! - all with lastprivate
+!!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+!do i = 1, n
+!  do j = 1, m,2
+!    do k = 1, j + 41
+!      if (k > 23 + 41 .or. k < 1) error stop
+!    end do
+!  end do
+!end do
+!if (k /= 65) then
+!  print *, k, 65
+!  error stop
+!endif
+
+
+!!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+!do i = 1, n,2
+!  do j = 1, m
+!    do k = 1, i + 41
+!      if (k > 11 + 41 .or. k < 1) error stop
+!    end do
+!  end do
+!end do
+!if (k /= 53) then
+!  print *, k, 53
+!  error stop
+!endif
+
+!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+do i = 1, n,2
+  do j = 1, m
+    do k = 1, j + 41
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+do i = 1, n
+  do j = 1, m,2
+    do k = 1, i + 41
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+end
+end module m
+
+program main
+  use m
+  implicit none (type, external)
+  call lastprivate_check_simd_1
+  call lastprivate_check_do_simd_1
+  call lastprivate_check_do_1
+  call lastprivate_check_2
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1a.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1a.f90
new file mode 100644
index 00000000000..9607fcc1038
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1a.f90
@@ -0,0 +1,374 @@
+! { dg-do compile }
+! { dg-additional-options "-msse2" { target sse2_runtime } }
+! { dg-additional-options "-mavx" { target avx_runtime } }
+
+! PR fortran/107424
+
+! Nonrectangular loop nests checks
+
+! ========================================================
+! NOTE: The testcases are from non-rectangular-loop-1.f90,
+! but commented there. Feel free to remove this
+! file + uncomment them in non-rectangular-loop-1.f90
+! Otherwise, you need to change it to 'dg-do run'!
+! ========================================================
+
+module m
+  implicit none (type, external)
+contains
+
+! The 'k' loop uses i or j as start value
+! but a constant end value such that 'lastprivate'
+! should be well-defined
+subroutine lastprivate_check_simd_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unite stride on 'i' or on 'j' -> 4 loops
+  ! Then same, execpt use nonunit stride for 'k'
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+  !$omp simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+
+  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+end subroutine lastprivate_check_simd_1
+
+
+! Same but with do simd
+subroutine lastprivate_check_do_simd_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unite stride on 'i' or on 'j' -> 4 loops
+  ! Then same, execpt use nonunit stride for 'k'
+
+  !$omp parallel do simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+end subroutine lastprivate_check_do_simd_1
+
+
+
+! Same but with do
+subroutine lastprivate_check_do_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unite stride on 'i' or on 'j' -> 4 loops
+  ! Then same, execpt use nonunit stride for 'k'
+
+  !$omp parallel do collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+end subroutine lastprivate_check_do_1
+
+
+
+subroutine lastprivate_check_2
+  integer :: n,m,p, i,j,k,ll
+
+  n = 11
+  m = 23
+  p = 27
+
+  !$omp parallel do simd collapse(3) lastprivate(p)
+  do i = 1, n
+    do j = 1, m,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+      do k = 1, j + 41  ! { dg-note "Used here" }
+        do ll = 1, p, 2
+          if (k > 23 + 41 .or. k < 1) error stop
+        end do
+      end do
+    end do
+  end do
+  if (ll /= 29) error stop
+
+  !$omp simd collapse(3) lastprivate(p)
+  do i = 1, n
+    do j = 1, m,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+      do k = 1, j + 41  ! { dg-note "Used here" }
+        do ll = 1, p, 2
+          if (k > 23 + 41 .or. k < 1) error stop
+        end do
+      end do
+    end do
+  end do
+  if (ll /= 29) error stop
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'i'" }
+    do j = 1, m
+      do k = 1, i + 41  ! { dg-note "Used here" }
+        if (k > 11 + 41 .or. k < 1) error stop
+      end do
+    end do
+  end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+! - Same but without 'private':
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n
+  do j = 1, m,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+    do k = 1, j + 41  ! { dg-note "Used here" }
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'i'" }
+  do j = 1, m
+    do k = 1, i + 41  ! { dg-note "Used here" }
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+! - all with lastprivate
+!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+do i = 1, n
+  do j = 1, m,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'j'" }
+    do k = 1, j + 41  ! { dg-note "Used here" }
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+do i = 1, n,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with non-unit loop iteration step for 'i'" }
+  do j = 1, m
+    do k = 1, i + 41  ! { dg-note "Used here" }
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+end
+end module m
+
+program main
+  use m
+  implicit none (type, external)
+  call lastprivate_check_simd_1
+  call lastprivate_check_do_simd_1
+  call lastprivate_check_do_1
+  call lastprivate_check_2
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-2.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-2.f90
new file mode 100644
index 00000000000..0cea61e5f0d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-2.f90
@@ -0,0 +1,243 @@
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-original -fcheck=all" }
+
+! PR fortran/107424
+
+! Nonrectangular loop nests checks
+
+! Valid patterns are:
+!  (1)  a2 - var-outer
+!  (2)  a1 * var-outer
+!  (3)  a1 * var-outer + a2
+!  (4)  a2 + a1 * var-outer
+!  (5)  a1 * var-outer - a2
+!  (6)  a2 - a1 * var-outer
+!  (7)  var-outer * a1
+!  (8)  var-outer * a1 + a2
+!  (9)  a2 + var-outer * a1
+! (10)  var-outer * a1 - a2
+! (11)  a2 - var-outer * a1
+
+module m
+contains
+
+
+! { dg-final { scan-tree-dump-times "for \\(one_two_inner = one_two_outer \\* -1 \\+ one_a2; one_two_inner <= one_two_outer \\* two_a1 \\+ 0; one_two_inner = one_two_inner \\+ 1\\)" 1 original } }
+
+!  (1)  a2 - var-outer
+!  (2)  a1 * var-outer
+subroutine one_two()
+  implicit none
+  integer :: one_a2
+  integer :: two_a1
+  integer :: one_two_outer, one_two_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  one_a2 = 13
+  two_a1 = 5
+  allocate(var(1:10, one_a2 - 10:two_a1 * 10), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do one_two_outer = 1, 10
+    do one_two_inner = one_a2 - one_two_outer, two_a1 * one_two_outer
+      !$omp atomic update
+      var(one_two_outer,one_two_inner) = var(one_two_outer,one_two_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = one_a2 - i, two_a1 * i
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(three_four_inner = three_four_outer \\* three_a1 \\+ three_a2; three_four_inner <= three_four_outer \\* four_a1 \\+ four_a2; three_four_inner = three_four_inner \\+ 1\\)" 1 original } }
+
+!  (3)  a1 * var-outer + a2
+!  (4)  a2 + a1 * var-outer
+subroutine three_four()
+  implicit none
+  integer :: three_a1, three_a2
+  integer :: four_a1, four_a2
+  integer :: three_four_outer, three_four_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  three_a1 = 2
+  three_a2 = 3
+  four_a1 = 3
+  four_a2 = 5
+  allocate(var(1:10, three_a1 * 1 + three_a2:four_a2 + four_a1 * 10), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do three_four_outer = 1, 10
+    do three_four_inner = three_a1 * three_four_outer + three_a2, four_a2 + four_a1 * three_four_outer
+      !$omp atomic update
+      var(three_four_outer, three_four_inner) = var(three_four_outer, three_four_inner) + 2
+    end do
+  end do
+  do i = 1, 10
+    do j = three_a1 * i + three_a2, four_a2 + four_a1 * i
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(five_six_inner = five_six_outer \\* five_a1 \\+ D\\.\[0-9\]+; five_six_inner <= five_six_outer \\* D\\.\[0-9\]+ \\+ six_a2; five_six_inner = five_six_inner \\+ 1\\)" 1 original } }
+
+!  (5)  a1 * var-outer - a2
+!  (6)  a2 - a1 * var-outer
+subroutine five_six()
+  implicit none
+  integer :: five_a1, five_a2
+  integer :: six_a1, six_a2
+  integer :: five_six_outer, five_six_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  five_a1 = 2
+  five_a2 = -3
+  six_a1 = 3
+  six_a2 = 20
+  allocate(var(1:10, five_a1 * 1 - five_a2:six_a2 - six_a1 * 1), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do five_six_outer = 1, 10
+    do five_six_inner = five_a1 * five_six_outer - five_a2, six_a2 - six_a1 * five_six_outer
+      !$omp atomic update
+      var(five_six_outer, five_six_inner) = var(five_six_outer, five_six_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = five_a1 * i - five_a2, six_a2 - six_a1 * i
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(seven_eight_inner = seven_eight_outer \\* seven_a1 \\+ 0; seven_eight_inner <= seven_eight_outer \\* eight_a1 \\+ eight_a2; seven_eight_inner = seven_eight_inner \\+ 1\\)" 1 original } }
+
+!  (7)  var-outer * a1
+!  (8)  var-outer * a1 + a2
+subroutine seven_eight()
+  implicit none
+  integer :: seven_a1
+  integer :: eight_a1, eight_a2
+  integer :: seven_eight_outer, seven_eight_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  seven_a1 = 3
+  eight_a1 = 2
+  eight_a2 = -4
+  allocate(var(1:10, 1 * seven_a1 : 10 * eight_a1 + eight_a2), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do seven_eight_outer = 1, 10
+    do seven_eight_inner = seven_eight_outer * seven_a1, seven_eight_outer * eight_a1 + eight_a2
+      !$omp atomic update
+      var(seven_eight_outer, seven_eight_inner) = var(seven_eight_outer, seven_eight_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = i * seven_a1, i * eight_a1 + eight_a2
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(nine_ten_inner = nine_ten_outer \\* nine_a1 \\+ nine_a2; nine_ten_inner <= nine_ten_outer \\* ten_a1 \\+ D\\.\[0-9\]+; nine_ten_inner = nine_ten_inner \\+ 1\\)" 1 original } }
+
+!  (9)  a2 + var-outer * a1
+! (10)  var-outer * a1 - a2
+subroutine nine_ten()
+  implicit none
+  integer :: nine_a1, nine_a2
+  integer :: ten_a1, ten_a2
+  integer :: nine_ten_outer, nine_ten_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  nine_a1 = 3
+  nine_a2 = 5
+  ten_a1 = 2
+  ten_a2 = 3
+  allocate(var(1:10, nine_a2 + 1 * nine_a1:10 * ten_a1 - ten_a2), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do nine_ten_outer = 1, 10
+    do nine_ten_inner = nine_a2 + nine_ten_outer * nine_a1, nine_ten_outer * ten_a1 - ten_a2
+      !$omp atomic update
+      var(nine_ten_outer, nine_ten_inner) = var(nine_ten_outer, nine_ten_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = nine_a2 + i * nine_a1, i * ten_a1 - ten_a2
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(eleven_inner = eleven_outer \\* D\\.\[0-9\]+ \\+ eleven_a2; eleven_inner <= 10; eleven_inner = eleven_inner \\+ 1\\)" 1 original } }
+
+! (11)  a2 - var-outer * a1
+
+subroutine eleven()
+  implicit none
+  integer :: eleven_a1, eleven_a2
+  integer :: eleven_outer, eleven_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  eleven_a1 = 2
+  eleven_a2 = 3
+  allocate(var(1:10, eleven_a2 - 10 * eleven_a1 : 10), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do eleven_outer = 1, 10
+    do eleven_inner = eleven_a2 - eleven_outer * eleven_a1, 10
+      !$omp atomic update
+      var(eleven_outer, eleven_inner) = var(eleven_outer, eleven_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = eleven_a2 - i * eleven_a1, 10
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+end module m
+
+program main
+use m
+implicit none
+call one_two()
+call three_four()
+call five_six()
+call seven_eight()
+call nine_ten()
+call eleven()
+end

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Patch] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
  2023-01-19 14:40 [Patch] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424] Tobias Burnus
@ 2023-01-20 17:39 ` Jakub Jelinek
  2023-01-20 18:00   ` Jakub Jelinek
  2023-01-25 14:47   ` [Patch][v2] " Tobias Burnus
  0 siblings, 2 replies; 8+ messages in thread
From: Jakub Jelinek @ 2023-01-20 17:39 UTC (permalink / raw)
  To: Tobias Burnus; +Cc: gcc-patches, fortran

On Thu, Jan 19, 2023 at 03:40:19PM +0100, Tobias Burnus wrote:
> +  gfc_symbol *var = code->ext.iterator->var->symtree->n.sym;
> +
> +  gfc_se se;
> +  tree tree_var, a1, a2;
> +  a1 = integer_one_node;
> +  a2 = integer_zero_node;
> +
> +  gfc_init_se (&se, NULL);
> +  gfc_conv_expr_lhs (&se, code->ext.iterator->var);
> +  gfc_add_block_to_block (pblock, &se.pre);
> +  tree_var = se.expr;
> +
> +  {
> +    /* FIXME: Handle non-unity iterations, cf. PR fortran/107424.

I think instead of non-unity etc. it is better to talk about
constant step 1 or -1.

> +       The issue is that for those a 'count' variable is used.  */
> +    dovar_init *di;
> +    unsigned ix;
> +    tree t = tree_var;
> +    while (TREE_CODE (t) == INDIRECT_REF)
> +      t = TREE_OPERAND (t, 0);
> +    FOR_EACH_VEC_ELT (*inits, ix, di)
> +      {
> +	tree t2 = di->var;
> +	while (TREE_CODE (t2) == INDIRECT_REF)
> +	  t2 = TREE_OPERAND (t2, 0);

The actual problem with non-simple loops for non-rectangular loops is
both in case it is an inner loop which uses some outer loop's iterator,
or if it is outer loop whose iterator is used, both of those cases
will not be handled properly.  The former case because instead of
having lb and ub expressions in canonicalized form var-outer * m + a
lb will be 0 (that is fine) and ub will be
(var-outer * m2 + a2 + step - var-outer * m1 - a1) / step
or so (sure, we can simplify that to
(var-outer * (m1 - m2) + (a2 + step - a1)) / step
but the division remains.  And the latter case is bad because we
need var-outer but we actually compute some artificial count iterator
and var-outer is only initialized in the body of the loop.
These sorry_at seems to handle just one of those, when the outer
loop whose var-outer is referenced is not simple, no?

I wonder if it wouldn't be cleaner and easier to simply remember for
each loop in XALLOCAVEC array whether it was simple or not and why
(from the:
      if (VAR_P (dovar))
        {
          if (integer_onep (step))
            simple = 1;
          else if (tree_int_cst_equal (step, integer_minus_one_node))
            simple = -1;
        }
      else
        dovar_decl
          = gfc_trans_omp_variable (code->ext.iterator->var->symtree->n.sym,
                                    false);
remember if it was simple (1/-1) or VAR_P !simple (then we would
if needed for non-rect sorry_at about step not being constant 1 or -1)
or if it is the !VAR_P case.
And then the non-rect sorry can be emitted for both the cases easily
(especially if you precompute the:
      if (VAR_P (dovar))
        {
          if (integer_onep (step))
            simple_loop[i] = 1;
          else if (tree_int_cst_equal (step, integer_minus_one_node))
            simple_loop[i] = -1;
	  else
	    simple_loop[i] = 0;
        }
      else
	simple_loop[i] = 2;
early) and in this function check it for both loop_n and i.

> +	if (t == t2)
> +	  {
> +	    HOST_WIDE_INT intval;
> +	    if (gfc_extract_hwi (code->ext.iterator->step, &intval, 0) == 0
> +		&& intval != 1 && intval != -1)
> +	      sorry_at (gfc_get_location (&code->loc),
> +			"non-rectangular loop nest with non-unit loop iteration"
> +			" step for %qs", var->name);

I'd say step other than constant 1 or -1.

> +  ! Use 'i' or 'j', unite stride on 'i' or on 'j' -> 4 loops

unit ?

> +  ! Then same, execpt use nonunit stride for 'k'

except, non-unit ?

> +  ! Use 'i' or 'j', unite stride on 'i' or on 'j' -> 4 loops
> +  ! Then same, execpt use nonunit stride for 'k'

2x again
(and some more later).

	Jakub


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Patch] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
  2023-01-20 17:39 ` Jakub Jelinek
@ 2023-01-20 18:00   ` Jakub Jelinek
  2023-01-20 20:02     ` Jakub Jelinek
  2023-01-25 14:47   ` [Patch][v2] " Tobias Burnus
  1 sibling, 1 reply; 8+ messages in thread
From: Jakub Jelinek @ 2023-01-20 18:00 UTC (permalink / raw)
  To: Tobias Burnus, gcc-patches, fortran

On Fri, Jan 20, 2023 at 06:39:04PM +0100, Jakub Jelinek via Gcc-patches wrote:
> > +       The issue is that for those a 'count' variable is used.  */
> > +    dovar_init *di;
> > +    unsigned ix;
> > +    tree t = tree_var;
> > +    while (TREE_CODE (t) == INDIRECT_REF)
> > +      t = TREE_OPERAND (t, 0);
> > +    FOR_EACH_VEC_ELT (*inits, ix, di)
> > +      {
> > +	tree t2 = di->var;
> > +	while (TREE_CODE (t2) == INDIRECT_REF)
> > +	  t2 = TREE_OPERAND (t2, 0);
> 
> The actual problem with non-simple loops for non-rectangular loops is
> both in case it is an inner loop which uses some outer loop's iterator,
> or if it is outer loop whose iterator is used, both of those cases
> will not be handled properly.  The former case because instead of
> having lb and ub expressions in canonicalized form var-outer * m + a
> lb will be 0 (that is fine) and ub will be
> (var-outer * m2 + a2 + step - var-outer * m1 - a1) / step
> or so (sure, we can simplify that to
> (var-outer * (m1 - m2) + (a2 + step - a1)) / step
> but the division remains.  And the latter case is bad because we
> need var-outer but we actually compute some artificial count iterator
> and var-outer is only initialized in the body of the loop.
> These sorry_at seems to handle just one of those, when the outer
> loop whose var-outer is referenced is not simple, no?

Though, I wonder if we shouldn't for GCC 13 just sorry_at about
steps other than constant 1/-1 (in both outer loop with var-outer referenced
in inner loop and on inner loop that references it) and for the !VAR_P case
actually handle it if step 1/-1 by using simple like translation just with
an artificial iterator.
Say for:
subroutine foo (x, y, z)
  integer :: x, y, z
  !$omp do private (x)
  do x = y, z
  end do
end subroutine foo
we right now in *.original dump have:
    D.4265 = *y;
    D.4266 = *z;
    D.4267 = (1 - D.4265) + D.4266;
    #pragma omp for private(count.0) private(x)
    for (count.0 = 0; count.0 < D.4267; count.0 = count.0 + 1)
      {
        *x = D.4265 + NON_LVALUE_EXPR <count.0>;
        L.1:;
      }
What I'd suggest is:
    D.4265 = *y;
    D.4266 = *z;
    #pragma omp for private(x)
    for (x.0 = D.4265; x.0 <= D.4266; x.0 = x.0 + 1)
      {
        *x = x.0;
        L.1:;
      }
or so.  This could be done independently from the non-rect stuff,
as a first change.

	Jakub


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Patch] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
  2023-01-20 18:00   ` Jakub Jelinek
@ 2023-01-20 20:02     ` Jakub Jelinek
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Jelinek @ 2023-01-20 20:02 UTC (permalink / raw)
  To: Tobias Burnus, gcc-patches, fortran

On Fri, Jan 20, 2023 at 07:00:18PM +0100, Jakub Jelinek via Gcc-patches wrote:
> Though, I wonder if we shouldn't for GCC 13 just sorry_at about
> steps other than constant 1/-1 (in both outer loop with var-outer referenced
> in inner loop and on inner loop that references it) and for the !VAR_P case
> actually handle it if step 1/-1 by using simple like translation just with
> an artificial iterator.

As for the steps other than constant 1/-1, we have 5 cases:
  do i = x, y, 25
or
  do i = 12, 72, z
or
  do i = x, y, -42
or
  do i = 42, -10, z
or
  do i = x, y, z
The 1st and 3rd are with constant step, 2nd and 4th with constant lower and
upper bounds and the last one has step and at least one of the bounds
non-constant.

I wonder if in the light of e.g. PR108431 which says that
do i = -huge(i), huge(i) is invalid (well, that one would be very wrong
even from OpenMP POV because computing number of iterations definitely
overflows) and the fact that we handle step 1 and -1 the simple way
do do i = huge(i) - 10, huge(i) will not work either, I wonder if even
do i = huge(i) - 5, huge(i) - 1, 2 is undefined (similar reasoning, if
i after loop needs to be set to the huge(i) + 1 it is signed integer
overflow).  If yes, then perhaps at least the first 4 cases could be easily
handled (perhaps for GCC 13 just if clauses->non_rectangular only) as
for (i = x; i <= y; i += 25)
or
for (i = 12; i <= 72; i += z)
or
for (i = x; i >= y; i -= 42)
or
for (i = 42; i >= -10; i += z)

If those give equivalent behavior, then that would mean a sorry
only for the last case - the problem is that we then don't know at compile
time the direction.
Though perhaps even for that case we could play tricks, handle
  do i = x, y, z
as
if (z > 0)
  a = x, b = y, c = z;
else
  a = INT_MIN, b = too_lazy_to_compute_that_now, c = -z;
for (counter = a; counter <= b; counter += c)
{
  if (z > 0)
    i = counter;
  else
    i = counter - (unsigned) INT_MAX;
}
If that works, we'd need to figure also out how to handle that
in the non-rect cases.  But the m1 * var-outer + a1 and m2 * var-outer + a2
factors can be non-constant invariants, so again we could compute something
for them depending on if the outer or inner step was positive or negative.

	Jakub


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Patch][v2] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
  2023-01-20 17:39 ` Jakub Jelinek
  2023-01-20 18:00   ` Jakub Jelinek
@ 2023-01-25 14:47   ` Tobias Burnus
  2023-01-31 11:37     ` Jakub Jelinek
  1 sibling, 1 reply; 8+ messages in thread
From: Tobias Burnus @ 2023-01-25 14:47 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, fortran

[-- Attachment #1: Type: text/plain, Size: 1513 bytes --]

Hi Jakub, hi all,

updated patch included, i.e. avoiding 'count' for 'j' when a 'j.0' would
do (i.e. only local var without the different step calculation). I also
now reject if there is a non-unit step on the loop using an outer var.

Eventually still to be done: replace the 'sorry' by working code, i.e.
implement the suggestions to handle some/all non-unit iteration steps as
proposed in this thread.

On 20.01.23 18:39, Jakub Jelinek wrote:
> I think instead of non-unity etc. it is better to talk about constant
> step 1 or -1.

I concur.


> The actual problem with non-simple loops for non-rectangular loops is
> both in case it is an inner loop which uses some outer loop's iterator,
> or if it is outer loop whose iterator is used, both of those cases
> will not be handled properly.

I have now added a check for the other case as well.

Just to confirm, the following is fine, isn't it?

!$omp simd collapse(4)
do i = 1, 10, 2
   do outer_var = 1, 10  ! step = + 1
     do j = 1, 10, 2
       do inner_var = 1, outer_var  ! step = 1

i.e. both the inner_var and outer_var have 'step = 1',
even if other loops in the 'collapse' have step != 1.
I think it should be fine.

OK mainline?

Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Attachment #2: omp-non-rect.diff --]
[-- Type: text/x-patch, Size: 65587 bytes --]

OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]

This patch ensures that loop bounds depending on outer loop vars use the
proper TREE_VEC format. It additionally gives a sorry if such an outer
var has a non-one/non-minus-one increment as currently a count variable
is used in this case (see PR).

Finally, it avoids 'count' and just uses a local loop variable if the
step increment is +/-1.

	PR fortran/107424

gcc/fortran/ChangeLog:

	* trans-openmp.cc (struct dovar_init_d): Add 'sym' and
	'non_unit_incr' members.
	(gfc_nonrect_loop_expr): New.
	(gfc_trans_omp_do): Call it; use normal loop bounds
	for unit stride - and only create local loop var.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/non-rectangular-loop-1.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-2.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-3.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-4.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-5.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/goacc/privatization-1-compute-loop.f90: Update dg-note.
	* gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise.

 gcc/fortran/trans-openmp.cc                        | 238 ++++++--
 .../goacc/privatization-1-compute-loop.f90         |   6 +-
 .../goacc/privatization-1-routine_gang-loop.f90    |   3 +-
 .../libgomp.fortran/non-rectangular-loop-1.f90     | 637 +++++++++++++++++++++
 .../libgomp.fortran/non-rectangular-loop-1a.f90    | 374 ++++++++++++
 .../libgomp.fortran/non-rectangular-loop-2.f90     | 243 ++++++++
 .../libgomp.fortran/non-rectangular-loop-3.f90     | 186 ++++++
 .../libgomp.fortran/non-rectangular-loop-4.f90     | 188 ++++++
 .../libgomp.fortran/non-rectangular-loop-5.f90     |  28 +
 9 files changed, 1854 insertions(+), 49 deletions(-)

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 87213de0918..ccee9e16648 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -5116,10 +5116,135 @@ gfc_trans_omp_critical (gfc_code *code)
 }
 
 typedef struct dovar_init_d {
+  gfc_symbol *sym;
   tree var;
   tree init;
+  bool non_unit_iter;
 } dovar_init;
 
+static bool
+gfc_nonrect_loop_expr (stmtblock_t *pblock, gfc_se *sep, int loop_n,
+		       gfc_code *code, gfc_expr *expr, vec<dovar_init> *inits,
+		       int simple, gfc_expr *curr_loop_var)
+{
+  int i;
+  for (i = 0; i < loop_n; i++)
+    {
+      gcc_assert (code->ext.iterator->var->expr_type == EXPR_VARIABLE);
+      if (gfc_find_sym_in_expr (code->ext.iterator->var->symtree->n.sym, expr))
+	break;
+      code = code->block->next;
+    }
+  if (i >= loop_n)
+    return false;
+
+  /* Canonic format: TREE_VEC with [var, multiplier, offset].  */
+  gfc_symbol *var = code->ext.iterator->var->symtree->n.sym;
+
+  tree tree_var = NULL_TREE;
+  tree a1 = integer_one_node;
+  tree a2 = integer_zero_node;
+
+  if (!simple)
+    {
+      /* FIXME: Handle non-unit iter steps, cf. PR fortran/107424.  */
+      sorry_at (gfc_get_location (&curr_loop_var->where),
+		"non-rectangular loop nest with step other than constant 1 "
+		"or -1 for %qs", curr_loop_var->symtree->n.sym->name);
+      return false;
+    }
+
+  dovar_init *di;
+  unsigned ix;
+  FOR_EACH_VEC_ELT (*inits, ix, di)
+    if (di->sym == var && !di->non_unit_iter)
+      {
+	tree_var = di->init;
+	gcc_assert (DECL_P (tree_var));
+	break;
+      }
+    else if (di->sym == var)
+      {
+	/* FIXME: Handle non-unit iter steps, cf. PR fortran/107424.  */
+	sorry_at (gfc_get_location (&code->loc),
+		"non-rectangular loop nest with step other than constant 1 "
+		"or -1 for %qs", var->name);
+	inform (gfc_get_location (&expr->where), "Used here");
+	return false;
+      }
+  if (tree_var == NULL_TREE)
+    tree_var = var->backend_decl;
+
+  if (expr->expr_type == EXPR_VARIABLE)
+    gcc_assert (expr->symtree->n.sym == var);
+  else if (expr->expr_type != EXPR_OP
+	   || (expr->value.op.op != INTRINSIC_TIMES
+	       && expr->value.op.op != INTRINSIC_PLUS
+	       && expr->value.op.op != INTRINSIC_MINUS))
+    gcc_unreachable ();
+  else
+    {
+      gfc_se se;
+      gfc_expr *et = NULL, *eo = NULL, *e = expr;
+      if (expr->value.op.op != INTRINSIC_TIMES)
+	{
+	  if (gfc_find_sym_in_expr (var, expr->value.op.op1))
+	    {
+	      e = expr->value.op.op1;
+	      eo = expr->value.op.op2;
+	    }
+	  else
+	    {
+	      eo = expr->value.op.op1;
+	      e = expr->value.op.op2;
+	    }
+	}
+      if (e->value.op.op == INTRINSIC_TIMES)
+	{
+	  if (e->value.op.op1->expr_type == EXPR_VARIABLE
+	      && e->value.op.op1->symtree->n.sym == var)
+	    et = e->value.op.op2;
+	  else
+	    {
+	      et = e->value.op.op1;
+	      gcc_assert (e->value.op.op2->expr_type == EXPR_VARIABLE
+			  && e->value.op.op2->symtree->n.sym == var);
+	    }
+	}
+      else
+	gcc_assert (e->expr_type == EXPR_VARIABLE && e->symtree->n.sym == var);
+      if (et != NULL)
+	{
+	  gfc_init_se (&se, NULL);
+	  gfc_conv_expr_val (&se, et);
+	  gfc_add_block_to_block (pblock, &se.pre);
+	  a1 = se.expr;
+	}
+      if (eo != NULL)
+	{
+	  gfc_init_se (&se, NULL);
+	  gfc_conv_expr_val (&se, eo);
+	  gfc_add_block_to_block (pblock, &se.pre);
+	  a2 = se.expr;
+	  if (expr->value.op.op == INTRINSIC_MINUS && expr->value.op.op2 == eo)
+	    /* outer-var - a2.  */
+	    a2 = fold_build1 (NEGATE_EXPR, TREE_TYPE (a2), a2);
+	  else if (expr->value.op.op == INTRINSIC_MINUS)
+	    /* a2 - outer-var.  */
+	    a1 = fold_build1 (NEGATE_EXPR, TREE_TYPE (a1), a1);
+	}
+      a1 = DECL_P (a1) ? a1 : gfc_evaluate_now (a1, pblock);
+      a2 = DECL_P (a2) ? a2 : gfc_evaluate_now (a2, pblock);
+    }
+
+  gfc_init_se (sep, NULL);
+  sep->expr = make_tree_vec (3);
+  TREE_VEC_ELT (sep->expr, 0) = tree_var;
+  TREE_VEC_ELT (sep->expr, 1) = fold_convert (TREE_TYPE (tree_var), a1);
+  TREE_VEC_ELT (sep->expr, 2) = fold_convert (TREE_TYPE (tree_var), a2);
+
+  return true;
+}
 
 static tree
 gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
@@ -5127,7 +5252,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 {
   gfc_se se;
   tree dovar, stmt, from, to, step, type, init, cond, incr, orig_decls;
-  tree count = NULL_TREE, cycle_label, tmp, omp_clauses;
+  tree local_dovar = NULL_TREE, cycle_label, tmp, omp_clauses;
   stmtblock_t block;
   stmtblock_t body;
   gfc_omp_clauses *clauses = code->ext.omp_clauses;
@@ -5214,52 +5339,72 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
       gfc_init_se (&se, NULL);
       gfc_conv_expr_lhs (&se, code->ext.iterator->var);
       gfc_add_block_to_block (pblock, &se.pre);
-      dovar = se.expr;
+      local_dovar = dovar_decl = dovar = se.expr;
       type = TREE_TYPE (dovar);
       gcc_assert (TREE_CODE (type) == INTEGER_TYPE);
 
       gfc_init_se (&se, NULL);
-      gfc_conv_expr_val (&se, code->ext.iterator->start);
+      gfc_conv_expr_val (&se, code->ext.iterator->step);
       gfc_add_block_to_block (pblock, &se.pre);
-      from = gfc_evaluate_now (se.expr, pblock);
+      step = gfc_evaluate_now (se.expr, pblock);
 
-      gfc_init_se (&se, NULL);
-      gfc_conv_expr_val (&se, code->ext.iterator->end);
-      gfc_add_block_to_block (pblock, &se.pre);
-      to = gfc_evaluate_now (se.expr, pblock);
+      if (integer_onep (step))
+	simple = 1;
+      else if (tree_int_cst_equal (step, integer_minus_one_node))
+	simple = -1;
 
       gfc_init_se (&se, NULL);
-      gfc_conv_expr_val (&se, code->ext.iterator->step);
-      gfc_add_block_to_block (pblock, &se.pre);
-      step = gfc_evaluate_now (se.expr, pblock);
-      dovar_decl = dovar;
+      if (!clauses->non_rectangular
+	  || !gfc_nonrect_loop_expr (pblock, &se, i, orig_code->block->next,
+				     code->ext.iterator->start, &inits, simple,
+				     code->ext.iterator->var))
+	{
+	  gfc_conv_expr_val (&se, code->ext.iterator->start);
+	  gfc_add_block_to_block (pblock, &se.pre);
+	  if (!DECL_P (se.expr))
+	    se.expr = gfc_evaluate_now (se.expr, pblock);
+	}
+      from = se.expr;
 
-      /* Special case simple loops.  */
-      if (VAR_P (dovar))
+      gfc_init_se (&se, NULL);
+      if (!clauses->non_rectangular
+	  || !gfc_nonrect_loop_expr (pblock, &se, i, orig_code->block->next,
+				     code->ext.iterator->end, &inits, simple,
+				     code->ext.iterator->var))
 	{
-	  if (integer_onep (step))
-	    simple = 1;
-	  else if (tree_int_cst_equal (step, integer_minus_one_node))
-	    simple = -1;
+	  gfc_conv_expr_val (&se, code->ext.iterator->end);
+	  gfc_add_block_to_block (pblock, &se.pre);
+	  if (!DECL_P (se.expr))
+	    se.expr = gfc_evaluate_now (se.expr, pblock);
 	}
-      else
+      to = se.expr;
+
+      if (!DECL_P (dovar))
 	dovar_decl
 	  = gfc_trans_omp_variable (code->ext.iterator->var->symtree->n.sym,
 				    false);
-
+      if (simple && !DECL_P (dovar))
+	{
+	  const char *name = code->ext.iterator->var->symtree->n.sym->name;
+	  local_dovar = gfc_create_var (type, name);
+	  dovar_init e = {code->ext.iterator->var->symtree->n.sym,
+			  dovar, local_dovar, false};
+	  inits.safe_push (e);
+	}
       /* Loop body.  */
       if (simple)
 	{
-	  TREE_VEC_ELT (init, i) = build2_v (MODIFY_EXPR, dovar, from);
+	  TREE_VEC_ELT (init, i) = build2_v (MODIFY_EXPR, local_dovar, from);
 	  /* The condition should not be folded.  */
 	  TREE_VEC_ELT (cond, i) = build2_loc (input_location, simple > 0
 					       ? LE_EXPR : GE_EXPR,
-					       logical_type_node, dovar, to);
+					       logical_type_node, local_dovar,
+					       to);
 	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location, PLUS_EXPR,
-						    type, dovar, step);
+						    type, local_dovar, step);
 	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location,
 						    MODIFY_EXPR,
-						    type, dovar,
+						    type, local_dovar,
 						    TREE_VEC_ELT (incr, i));
 	  if (orig_decls && !clauses->orderedc)
 	    orig_decls = NULL;
@@ -5280,24 +5425,27 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 	  tmp = fold_build2_loc (input_location, TRUNC_DIV_EXPR, type, tmp,
 				 step);
 	  tmp = gfc_evaluate_now (tmp, pblock);
-	  count = gfc_create_var (type, "count");
-	  TREE_VEC_ELT (init, i) = build2_v (MODIFY_EXPR, count,
+	  local_dovar = gfc_create_var (type, "count");
+	  TREE_VEC_ELT (init, i) = build2_v (MODIFY_EXPR, local_dovar,
 					     build_int_cst (type, 0));
 	  /* The condition should not be folded.  */
 	  TREE_VEC_ELT (cond, i) = build2_loc (input_location, LT_EXPR,
 					       logical_type_node,
-					       count, tmp);
+					       local_dovar, tmp);
 	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location, PLUS_EXPR,
-						    type, count,
+						    type, local_dovar,
 						    build_int_cst (type, 1));
 	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location,
-						    MODIFY_EXPR, type, count,
+						    MODIFY_EXPR, type,
+						    local_dovar,
 						    TREE_VEC_ELT (incr, i));
 
 	  /* Initialize DOVAR.  */
-	  tmp = fold_build2_loc (input_location, MULT_EXPR, type, count, step);
+	  tmp = fold_build2_loc (input_location, MULT_EXPR, type, local_dovar,
+				 step);
 	  tmp = fold_build2_loc (input_location, PLUS_EXPR, type, from, tmp);
-	  dovar_init e = {dovar, tmp};
+	  dovar_init e = {code->ext.iterator->var->symtree->n.sym,
+			  dovar, tmp, true};
 	  inits.safe_push (e);
 	  if (clauses->orderedc)
 	    {
@@ -5312,7 +5460,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
       if (dovar_found == 3
 	  && op == EXEC_OMP_SIMD
 	  && collapse == 1
-	  && !simple)
+	  && local_dovar != dovar)
 	{
 	  for (tmp = omp_clauses; tmp; tmp = OMP_CLAUSE_CHAIN (tmp))
 	    if (OMP_CLAUSE_CODE (tmp) == OMP_CLAUSE_LINEAR
@@ -5331,11 +5479,11 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 	      OMP_CLAUSE_LINEAR_NO_COPYIN (tmp) = 1;
 	      OMP_CLAUSE_DECL (tmp) = dovar_decl;
 	      omp_clauses = gfc_trans_add_clause (tmp, omp_clauses);
+	      if (local_dovar != dovar)
+		dovar_found = 3;
 	    }
-	  if (!simple)
-	    dovar_found = 3;
 	}
-      else if (!dovar_found && !simple)
+      else if (!dovar_found && local_dovar != dovar)
 	{
 	  tmp = build_omp_clause (input_location, OMP_CLAUSE_PRIVATE);
 	  OMP_CLAUSE_DECL (tmp) = dovar_decl;
@@ -5346,7 +5494,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 	  tree c = NULL;
 
 	  tmp = NULL;
-	  if (!simple)
+	  if (local_dovar != dovar)
 	    {
 	      /* If dovar is lastprivate, but different counter is used,
 		 dovar += step needs to be added to
@@ -5356,21 +5504,19 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 	      if (clauses->orderedc)
 		{
 		  if (clauses->collapse <= 1 || i >= clauses->collapse)
-		    tmp = count;
+		    tmp = local_dovar;
 		  else
 		    tmp = fold_build2_loc (input_location, PLUS_EXPR,
-					   type, count, build_one_cst (type));
+					   type, local_dovar,
+					   build_one_cst (type));
 		  tmp = fold_build2_loc (input_location, MULT_EXPR, type,
 					 tmp, step);
 		  tmp = fold_build2_loc (input_location, PLUS_EXPR, type,
 					 from, tmp);
 		}
 	      else
-		{
-		  tmp = gfc_evaluate_now (step, pblock);
-		  tmp = fold_build2_loc (input_location, PLUS_EXPR, type,
-					 dovar, tmp);
-		}
+		tmp = fold_build2_loc (input_location, PLUS_EXPR, type,
+				       dovar, step);
 	      tmp = fold_build2_loc (input_location, MODIFY_EXPR, type,
 				     dovar, tmp);
 	      for (c = omp_clauses; c ; c = OMP_CLAUSE_CHAIN (c))
@@ -5405,9 +5551,9 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 		    break;
 		  }
 	    }
-	  gcc_assert (simple || c != NULL);
+	  gcc_assert (local_dovar == dovar || c != NULL);
 	}
-      if (!simple)
+      if (local_dovar != dovar)
 	{
 	  if (op != EXEC_OMP_SIMD || dovar_found == 1)
 	    tmp = build_omp_clause (input_location, OMP_CLAUSE_PRIVATE);
@@ -5420,7 +5566,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 	    }
 	  else
 	    tmp = build_omp_clause (input_location, OMP_CLAUSE_LASTPRIVATE);
-	  OMP_CLAUSE_DECL (tmp) = count;
+	  OMP_CLAUSE_DECL (tmp) = local_dovar;
 	  omp_clauses = gfc_trans_add_clause (tmp, omp_clauses);
 	}
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90 b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90
index 13772c185ce..ad5e11abf91 100644
--- a/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90
@@ -47,8 +47,10 @@ contains
        end do
     end do
     !$acc end parallel
-    ! { dg-note {variable 'count\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
-    ! { dg-note {variable 'count\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'i\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'j\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'j\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'a' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
diff --git a/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang-loop.f90 b/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang-loop.f90
index 6878d856919..c5c2f2b9845 100644
--- a/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang-loop.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang-loop.f90
@@ -46,7 +46,8 @@ contains
           y = a
        end do
     end do
-    ! { dg-note {variable 'count\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'j\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'a' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1.f90
new file mode 100644
index 00000000000..df74763d9d7
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1.f90
@@ -0,0 +1,637 @@
+! { dg-do run }
+! { dg-additional-options "-msse2" { target sse2_runtime } }
+! { dg-additional-options "-mavx" { target avx_runtime } }
+
+! PR fortran/107424
+
+! Nonrectangular loop nests checks
+
+! See PR or non-rectangular-loop-1a.f90 for the commented tests
+! Hint: Those use step for loop vars part of nonrectangular loop nests
+
+module m
+  implicit none (type, external)
+contains
+
+! The 'k' loop uses i or j as start value
+! but a constant end value such that 'lastprivate'
+! should be well-defined
+subroutine lastprivate_check_simd_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unit step on 'i' or on 'j' -> 4 loops
+  ! Then same, except use non-unit step for 'k'
+
+!  !$omp simd collapse(3) lastprivate(k)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp simd collapse(3) lastprivate(k)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) then
+          print *, i, j, k,p, " -> i, j, k, p   (k < 1 - 41 .or. k > p)"
+          error stop
+        end if
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+!  !$omp simd collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!
+!  !$omp simd collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+!  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 1 .or. j /= m + 2) error stop
+
+!  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+end subroutine lastprivate_check_simd_1
+
+
+! Same but with do simd
+subroutine lastprivate_check_do_simd_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unit step on 'i' or on 'j' -> 4 loops
+  ! Then same, except use non-unit step for 'k'
+
+!  !$omp parallel do simd collapse(3) lastprivate(k)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp parallel do simd collapse(3) lastprivate(k)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+!  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+!  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 1 .or. j /= m + 2) error stop
+
+!  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+end subroutine lastprivate_check_do_simd_1
+
+
+
+! Same but with do
+subroutine lastprivate_check_do_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unit step on 'i' or on 'j' -> 4 loops
+  ! Then same, except use non-unit step for 'k'
+
+!  !$omp parallel do collapse(3) lastprivate(k)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp parallel do collapse(3) lastprivate(k)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+!  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+!  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 1 .or. j /= m + 2) error stop
+
+!  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+end subroutine lastprivate_check_do_1
+
+
+
+subroutine lastprivate_check_2
+  integer :: n,m,p, i,j,k,ll
+
+  n = 11
+  m = 23
+  p = 27
+
+!  !$omp parallel do simd collapse(3) lastprivate(p)
+!  do i = 1, n
+!    do j = 1, m,2
+!      do k = 1, j + 41
+!        do ll = 1, p, 2
+!          if (k > 23 + 41 .or. k < 1) error stop
+!        end do
+!      end do
+!    end do
+!  end do
+!  if (ll /= 29) error stop
+
+!  !$omp simd collapse(3) lastprivate(p)
+!  do i = 1, n
+!    do j = 1, m,2
+!      do k = 1, j + 41
+!        do ll = 1, p, 2
+!          if (k > 23 + 41 .or. k < 1) error stop
+!        end do
+!      end do
+!    end do
+!  end do
+!  if (ll /= 29) error stop
+
+!  !$omp simd collapse(3) lastprivate(k)
+!  do i = 1, n,2
+!    do j = 1, m
+!      do k = 1, i + 41
+!        if (k > 11 + 41 .or. k < 1) error stop
+!      end do
+!    end do
+!  end do
+!if (k /= 53) then
+!  print *, k, 53
+!  error stop
+!endif
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n,2
+  do j = 1, m
+    do k = 1, j + 41
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n
+  do j = 1, m,2
+    do k = 1, i + 41
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+! - Same but without 'private':
+!!$omp simd collapse(3) lastprivate(k)
+!do i = 1, n
+!  do j = 1, m,2
+!    do k = 1, j + 41
+!      if (k > 23 + 41 .or. k < 1) error stop
+!    end do
+!  end do
+!end do
+!if (k /= 65) then
+!  print *, k, 65
+!  error stop
+!endif
+
+
+!!$omp simd collapse(3) lastprivate(k)
+!do i = 1, n,2
+!  do j = 1, m
+!    do k = 1, i + 41
+!      if (k > 11 + 41 .or. k < 1) error stop
+!    end do
+!  end do
+!end do
+!if (k /= 53) then
+!  print *, k, 53
+!  error stop
+!endif
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n,2
+  do j = 1, m
+    do k = 1, j + 41
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n
+  do j = 1, m,2
+    do k = 1, i + 41
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+! - all with lastprivate
+!!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+!do i = 1, n
+!  do j = 1, m,2
+!    do k = 1, j + 41
+!      if (k > 23 + 41 .or. k < 1) error stop
+!    end do
+!  end do
+!end do
+!if (k /= 65) then
+!  print *, k, 65
+!  error stop
+!endif
+
+
+!!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+!do i = 1, n,2
+!  do j = 1, m
+!    do k = 1, i + 41
+!      if (k > 11 + 41 .or. k < 1) error stop
+!    end do
+!  end do
+!end do
+!if (k /= 53) then
+!  print *, k, 53
+!  error stop
+!endif
+
+!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+do i = 1, n,2
+  do j = 1, m
+    do k = 1, j + 41
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+do i = 1, n
+  do j = 1, m,2
+    do k = 1, i + 41
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+end
+end module m
+
+program main
+  use m
+  implicit none (type, external)
+  call lastprivate_check_simd_1
+  call lastprivate_check_do_simd_1
+  call lastprivate_check_do_1
+  call lastprivate_check_2
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1a.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1a.f90
new file mode 100644
index 00000000000..77aa887942e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1a.f90
@@ -0,0 +1,374 @@
+! { dg-do compile }
+! { dg-additional-options "-msse2" { target sse2_runtime } }
+! { dg-additional-options "-mavx" { target avx_runtime } }
+
+! PR fortran/107424
+
+! Nonrectangular loop nests checks
+
+! ========================================================
+! NOTE: The testcases are from non-rectangular-loop-1.f90,
+! but commented there. Feel free to remove this
+! file + uncomment them in non-rectangular-loop-1.f90
+! Otherwise, you need to change it to 'dg-do run'!
+! ========================================================
+
+module m
+  implicit none (type, external)
+contains
+
+! The 'k' loop uses i or j as start value
+! but a constant end value such that 'lastprivate'
+! should be well-defined
+subroutine lastprivate_check_simd_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unit step on 'i' or on 'j' -> 4 loops
+  ! Then same, except use non-unit step for 'k'
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+  !$omp simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+
+  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+end subroutine lastprivate_check_simd_1
+
+
+! Same but with do simd
+subroutine lastprivate_check_do_simd_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unit step on 'i' or on 'j' -> 4 loops
+  ! Then same, except use non-unit step for 'k'
+
+  !$omp parallel do simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+end subroutine lastprivate_check_do_simd_1
+
+
+
+! Same but with do
+subroutine lastprivate_check_do_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unit step on 'i' or on 'j' -> 4 loops
+  ! Then same, except use non-unit step for 'k'
+
+  !$omp parallel do collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+end subroutine lastprivate_check_do_1
+
+
+
+subroutine lastprivate_check_2
+  integer :: n,m,p, i,j,k,ll
+
+  n = 11
+  m = 23
+  p = 27
+
+  !$omp parallel do simd collapse(3) lastprivate(p)
+  do i = 1, n
+    do j = 1, m,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = 1, j + 41  ! { dg-note "Used here" }
+        do ll = 1, p, 2
+          if (k > 23 + 41 .or. k < 1) error stop
+        end do
+      end do
+    end do
+  end do
+  if (ll /= 29) error stop
+
+  !$omp simd collapse(3) lastprivate(p)
+  do i = 1, n
+    do j = 1, m,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = 1, j + 41  ! { dg-note "Used here" }
+        do ll = 1, p, 2
+          if (k > 23 + 41 .or. k < 1) error stop
+        end do
+      end do
+    end do
+  end do
+  if (ll /= 29) error stop
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = 1, i + 41  ! { dg-note "Used here" }
+        if (k > 11 + 41 .or. k < 1) error stop
+      end do
+    end do
+  end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+! - Same but without 'private':
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n
+  do j = 1, m,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+    do k = 1, j + 41  ! { dg-note "Used here" }
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+  do j = 1, m
+    do k = 1, i + 41  ! { dg-note "Used here" }
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+! - all with lastprivate
+!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+do i = 1, n
+  do j = 1, m,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+    do k = 1, j + 41  ! { dg-note "Used here" }
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+do i = 1, n,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+  do j = 1, m
+    do k = 1, i + 41  ! { dg-note "Used here" }
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+end
+end module m
+
+program main
+  use m
+  implicit none (type, external)
+  call lastprivate_check_simd_1
+  call lastprivate_check_do_simd_1
+  call lastprivate_check_do_1
+  call lastprivate_check_2
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-2.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-2.f90
new file mode 100644
index 00000000000..0cea61e5f0d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-2.f90
@@ -0,0 +1,243 @@
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-original -fcheck=all" }
+
+! PR fortran/107424
+
+! Nonrectangular loop nests checks
+
+! Valid patterns are:
+!  (1)  a2 - var-outer
+!  (2)  a1 * var-outer
+!  (3)  a1 * var-outer + a2
+!  (4)  a2 + a1 * var-outer
+!  (5)  a1 * var-outer - a2
+!  (6)  a2 - a1 * var-outer
+!  (7)  var-outer * a1
+!  (8)  var-outer * a1 + a2
+!  (9)  a2 + var-outer * a1
+! (10)  var-outer * a1 - a2
+! (11)  a2 - var-outer * a1
+
+module m
+contains
+
+
+! { dg-final { scan-tree-dump-times "for \\(one_two_inner = one_two_outer \\* -1 \\+ one_a2; one_two_inner <= one_two_outer \\* two_a1 \\+ 0; one_two_inner = one_two_inner \\+ 1\\)" 1 original } }
+
+!  (1)  a2 - var-outer
+!  (2)  a1 * var-outer
+subroutine one_two()
+  implicit none
+  integer :: one_a2
+  integer :: two_a1
+  integer :: one_two_outer, one_two_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  one_a2 = 13
+  two_a1 = 5
+  allocate(var(1:10, one_a2 - 10:two_a1 * 10), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do one_two_outer = 1, 10
+    do one_two_inner = one_a2 - one_two_outer, two_a1 * one_two_outer
+      !$omp atomic update
+      var(one_two_outer,one_two_inner) = var(one_two_outer,one_two_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = one_a2 - i, two_a1 * i
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(three_four_inner = three_four_outer \\* three_a1 \\+ three_a2; three_four_inner <= three_four_outer \\* four_a1 \\+ four_a2; three_four_inner = three_four_inner \\+ 1\\)" 1 original } }
+
+!  (3)  a1 * var-outer + a2
+!  (4)  a2 + a1 * var-outer
+subroutine three_four()
+  implicit none
+  integer :: three_a1, three_a2
+  integer :: four_a1, four_a2
+  integer :: three_four_outer, three_four_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  three_a1 = 2
+  three_a2 = 3
+  four_a1 = 3
+  four_a2 = 5
+  allocate(var(1:10, three_a1 * 1 + three_a2:four_a2 + four_a1 * 10), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do three_four_outer = 1, 10
+    do three_four_inner = three_a1 * three_four_outer + three_a2, four_a2 + four_a1 * three_four_outer
+      !$omp atomic update
+      var(three_four_outer, three_four_inner) = var(three_four_outer, three_four_inner) + 2
+    end do
+  end do
+  do i = 1, 10
+    do j = three_a1 * i + three_a2, four_a2 + four_a1 * i
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(five_six_inner = five_six_outer \\* five_a1 \\+ D\\.\[0-9\]+; five_six_inner <= five_six_outer \\* D\\.\[0-9\]+ \\+ six_a2; five_six_inner = five_six_inner \\+ 1\\)" 1 original } }
+
+!  (5)  a1 * var-outer - a2
+!  (6)  a2 - a1 * var-outer
+subroutine five_six()
+  implicit none
+  integer :: five_a1, five_a2
+  integer :: six_a1, six_a2
+  integer :: five_six_outer, five_six_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  five_a1 = 2
+  five_a2 = -3
+  six_a1 = 3
+  six_a2 = 20
+  allocate(var(1:10, five_a1 * 1 - five_a2:six_a2 - six_a1 * 1), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do five_six_outer = 1, 10
+    do five_six_inner = five_a1 * five_six_outer - five_a2, six_a2 - six_a1 * five_six_outer
+      !$omp atomic update
+      var(five_six_outer, five_six_inner) = var(five_six_outer, five_six_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = five_a1 * i - five_a2, six_a2 - six_a1 * i
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(seven_eight_inner = seven_eight_outer \\* seven_a1 \\+ 0; seven_eight_inner <= seven_eight_outer \\* eight_a1 \\+ eight_a2; seven_eight_inner = seven_eight_inner \\+ 1\\)" 1 original } }
+
+!  (7)  var-outer * a1
+!  (8)  var-outer * a1 + a2
+subroutine seven_eight()
+  implicit none
+  integer :: seven_a1
+  integer :: eight_a1, eight_a2
+  integer :: seven_eight_outer, seven_eight_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  seven_a1 = 3
+  eight_a1 = 2
+  eight_a2 = -4
+  allocate(var(1:10, 1 * seven_a1 : 10 * eight_a1 + eight_a2), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do seven_eight_outer = 1, 10
+    do seven_eight_inner = seven_eight_outer * seven_a1, seven_eight_outer * eight_a1 + eight_a2
+      !$omp atomic update
+      var(seven_eight_outer, seven_eight_inner) = var(seven_eight_outer, seven_eight_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = i * seven_a1, i * eight_a1 + eight_a2
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(nine_ten_inner = nine_ten_outer \\* nine_a1 \\+ nine_a2; nine_ten_inner <= nine_ten_outer \\* ten_a1 \\+ D\\.\[0-9\]+; nine_ten_inner = nine_ten_inner \\+ 1\\)" 1 original } }
+
+!  (9)  a2 + var-outer * a1
+! (10)  var-outer * a1 - a2
+subroutine nine_ten()
+  implicit none
+  integer :: nine_a1, nine_a2
+  integer :: ten_a1, ten_a2
+  integer :: nine_ten_outer, nine_ten_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  nine_a1 = 3
+  nine_a2 = 5
+  ten_a1 = 2
+  ten_a2 = 3
+  allocate(var(1:10, nine_a2 + 1 * nine_a1:10 * ten_a1 - ten_a2), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do nine_ten_outer = 1, 10
+    do nine_ten_inner = nine_a2 + nine_ten_outer * nine_a1, nine_ten_outer * ten_a1 - ten_a2
+      !$omp atomic update
+      var(nine_ten_outer, nine_ten_inner) = var(nine_ten_outer, nine_ten_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = nine_a2 + i * nine_a1, i * ten_a1 - ten_a2
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(eleven_inner = eleven_outer \\* D\\.\[0-9\]+ \\+ eleven_a2; eleven_inner <= 10; eleven_inner = eleven_inner \\+ 1\\)" 1 original } }
+
+! (11)  a2 - var-outer * a1
+
+subroutine eleven()
+  implicit none
+  integer :: eleven_a1, eleven_a2
+  integer :: eleven_outer, eleven_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  eleven_a1 = 2
+  eleven_a2 = 3
+  allocate(var(1:10, eleven_a2 - 10 * eleven_a1 : 10), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do eleven_outer = 1, 10
+    do eleven_inner = eleven_a2 - eleven_outer * eleven_a1, 10
+      !$omp atomic update
+      var(eleven_outer, eleven_inner) = var(eleven_outer, eleven_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = eleven_a2 - i * eleven_a1, 10
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+end module m
+
+program main
+use m
+implicit none
+call one_two()
+call three_four()
+call five_six()
+call seven_eight()
+call nine_ten()
+call eleven()
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-3.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-3.f90
new file mode 100644
index 00000000000..133d48432d8
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-3.f90
@@ -0,0 +1,186 @@
+! { dg-additional-options "-fdump-tree-original" }
+! PR fortran/107424
+
+module m
+contains
+subroutine foo (av, avo, a0, a0o, a1, a2, a3, a4)
+implicit none
+
+integer, value :: av
+integer, value, optional :: avo
+integer :: a0
+integer, optional :: a0o
+integer, pointer :: a1
+integer, pointer, optional :: a2
+integer, allocatable :: a3
+integer, allocatable, optional :: a4
+integer :: a5
+integer, pointer :: a6
+integer, allocatable :: a7
+integer :: arr(20,10), ref(20,10)
+
+integer :: j, i
+
+allocate(a6, a7)
+
+ref = 44
+do i = 1, 10
+  do j = i, 20
+    ref(j, i) = j + 100 * i
+  end do
+end do
+
+! { dg-final { scan-tree-dump-times "for \\(av = 1; av <= 10; av = av \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = av \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! -> no temp var
+arr = 44
+!$omp simd collapse(2)
+do av = 1, 10
+  do j = av, 20
+    arr(j, av) = j + 100 * av
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(avo = 1; avo <= 10; avo = avo \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = avo \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! -> no temp var
+arr = 44
+!$omp simd collapse(2)
+do avo = 1, 10
+  do j = avo, 20
+    arr(j, avo) = j + 100 * avo
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a0\\.\[0-9\]+ = 1; a0\\.\[0-9\]+ <= 10; a0\\.\[0-9\]+ = a0\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a0\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a0 = a0\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a0 = 1, 10
+  do j = a0, 20
+    arr(j, a0) = j + 100 * a0
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a0o\\.\[0-9\]+ = 1; a0o\\.\[0-9\]+ <= 10; a0o\\.\[0-9\]+ = a0o\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a0o\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a0o = a0o\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a0o = 1, 10
+  do j = a0o, 20
+    arr(j, a0o) = j + 100 * a0o
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a1\\.\[0-9\]+ = 1; a1\\.\[0-9\]+ <= 10; a1\\.\[0-9\]+ = a1\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a1\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a1 = a1\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a1 = 1, 10
+  do j = a1, 20
+    arr(j, a1) = j + 100 * a1
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a2\\.\[0-9\]+ = 1; a2\\.\[0-9\]+ <= 10; a2\\.\[0-9\]+ = a2\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a2\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a2 = a2\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a2 = 1, 10
+  do j = a2, 20
+    arr(j, a2) = j + 100 * a2
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a3\\.\[0-9\]+ = 1; a3\\.\[0-9\]+ <= 10; a3\\.\[0-9\]+ = a3\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a3\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a3 = a3\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a3 = 1, 10
+  do j = a3, 20
+    arr(j, a3) = j + 100 * a3
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a4\\.\[0-9\]+ = 1; a4\\.\[0-9\]+ <= 10; a4\\.\[0-9\]+ = a4\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a4\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a4 = a4\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a4 = 1, 10
+  do j = a4, 20
+    arr(j, a4) = j + 100 * a4
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a5 = 1; a5 <= 10; a5 = a5 \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a5 \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! -> no temp var
+arr = 44
+!$omp simd collapse(2)
+do a5 = 1, 10
+  do j = a5, 20
+    arr(j, a5) = j + 100 * a5
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a6\\.\[0-9\]+ = 1; a6\\.\[0-9\]+ <= 10; a6\\.\[0-9\]+ = a6\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a6\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a6 = a6\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a6 = 1, 10
+  do j = a6, 20
+    arr(j, a6) = j + 100 * a6
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a7\\.\[0-9\]+ = 1; a7\\.\[0-9\]+ <= 10; a7\\.\[0-9\]+ = a7\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a7\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a7 = a7\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a7 = 1, 10
+  do j = a7, 20
+    arr(j, a7) = j + 100 * a7
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+deallocate(a6, a7)
+end
+
+end module m
+
+
+use m
+implicit none
+
+integer :: av
+integer :: avo
+integer :: a0
+integer :: a0o
+integer, pointer :: a1
+integer, pointer :: a2
+integer, allocatable :: a3
+integer, allocatable :: a4
+
+allocate(a1,a2,a3,a4)
+call foo (av, avo, a0, a0o, a1, a2, a3, a4)
+deallocate(a1,a2,a3,a4)
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-4.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-4.f90
new file mode 100644
index 00000000000..7e265725170
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-4.f90
@@ -0,0 +1,188 @@
+! { dg-additional-options "-fdump-tree-original" }
+! PR fortran/107424
+
+! Same as non-rectangular-loop-4.f90 but expr in upper bound 
+
+module m
+contains
+subroutine foo (av, avo, a0, a0o, a1, a2, a3, a4)
+implicit none
+
+integer, value :: av
+integer, value, optional :: avo
+integer :: a0
+integer, optional :: a0o
+integer, pointer :: a1
+integer, pointer, optional :: a2
+integer, allocatable :: a3
+integer, allocatable, optional :: a4
+integer :: a5
+integer, pointer :: a6
+integer, allocatable :: a7
+integer :: arr(20,10), ref(20,10)
+
+integer :: j, i
+
+allocate(a6, a7)
+
+ref = 44
+do i = 1, 10
+  do j = 1, i*2-1
+    ref(j, i) = j + 100 * i
+  end do
+end do
+
+! { dg-final { scan-tree-dump-times "for \\(av = 1; av <= 10; av = av \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= av \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! -> no temp var
+arr = 44
+!$omp simd collapse(2)
+do av = 1, 10
+  do j = 1, av*2-1
+    arr(j, av) = j + 100 * av
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(avo = 1; avo <= 10; avo = avo \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= avo \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! -> no temp var
+arr = 44
+!$omp simd collapse(2)
+do avo = 1, 10
+  do j = 1, avo*2-1
+    arr(j, avo) = j + 100 * avo
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a0\\.\[0-9\]+ = 1; a0\\.\[0-9\]+ <= 10; a0\\.\[0-9\]+ = a0\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a0\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a0 = a0\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a0 = 1, 10
+  do j = 1, a0*2-1
+    arr(j, a0) = j + 100 * a0
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a0o\\.\[0-9\]+ = 1; a0o\\.\[0-9\]+ <= 10; a0o\\.\[0-9\]+ = a0o\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a0o\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a0o = a0o\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a0o = 1, 10
+  do j = 1, a0o*2-1
+    arr(j, a0o) = j + 100 * a0o
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a1\\.\[0-9\]+ = 1; a1\\.\[0-9\]+ <= 10; a1\\.\[0-9\]+ = a1\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a1\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a1 = a1\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a1 = 1, 10
+  do j = 1, a1*2-1
+    arr(j, a1) = j + 100 * a1
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a2\\.\[0-9\]+ = 1; a2\\.\[0-9\]+ <= 10; a2\\.\[0-9\]+ = a2\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a2\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a2 = a2\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a2 = 1, 10
+  do j = 1, a2*2-1
+    arr(j, a2) = j + 100 * a2
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a3\\.\[0-9\]+ = 1; a3\\.\[0-9\]+ <= 10; a3\\.\[0-9\]+ = a3\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a3\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a3 = a3\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a3 = 1, 10
+  do j = 1, a3*2-1
+    arr(j, a3) = j + 100 * a3
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a4\\.\[0-9\]+ = 1; a4\\.\[0-9\]+ <= 10; a4\\.\[0-9\]+ = a4\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a4\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a4 = a4\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a4 = 1, 10
+  do j = 1, a4*2-1
+    arr(j, a4) = j + 100 * a4
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a5 = 1; a5 <= 10; a5 = a5 \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a5 \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! -> no temp var
+arr = 44
+!$omp simd collapse(2)
+do a5 = 1, 10
+  do j = 1, a5*2-1
+    arr(j, a5) = j + 100 * a5
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a6\\.\[0-9\]+ = 1; a6\\.\[0-9\]+ <= 10; a6\\.\[0-9\]+ = a6\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a6\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a6 = a6\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a6 = 1, 10
+  do j = 1, a6*2-1
+    arr(j, a6) = j + 100 * a6
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a7\\.\[0-9\]+ = 1; a7\\.\[0-9\]+ <= 10; a7\\.\[0-9\]+ = a7\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a7\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a7 = a7\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+!$omp simd collapse(2)
+do a7 = 1, 10
+  do j = 1, a7*2-1
+    arr(j, a7) = j + 100 * a7
+  end do
+end do
+if (any (ref /= arr)) error stop
+
+deallocate(a6, a7)
+end
+
+end module m
+
+
+use m
+implicit none
+
+integer :: av
+integer :: avo
+integer :: a0
+integer :: a0o
+integer, pointer :: a1
+integer, pointer :: a2
+integer, allocatable :: a3
+integer, allocatable :: a4
+
+allocate(a1,a2,a3,a4)
+call foo (av, avo, a0, a0o, a1, a2, a3, a4)
+deallocate(a1,a2,a3,a4)
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-5.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-5.f90
new file mode 100644
index 00000000000..643ab796a84
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-5.f90
@@ -0,0 +1,28 @@
+! { dg-do compile }
+! { dg-additional-options "-msse2" { target sse2_runtime } }
+! { dg-additional-options "-mavx" { target avx_runtime } }
+
+! PR fortran/107424
+
+! Nonrectangular loop nests checks
+
+!$omp simd collapse(2)
+do i = 1, 10
+  do j = i, 10, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+  end do
+end do
+
+!$omp do collapse(2) lastprivate(j)  ! { dg-error "lastprivate variable 'j' is private in outer context" }
+do i = 1, 10
+  do j = i, 10, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+  end do
+end do
+if (i /= 11) stop 1
+
+!$omp simd collapse(2) lastprivate(j)
+do i = 1, 10
+  do j = i, 10, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+  end do
+end do
+if (i /= 11) stop 1
+end

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Patch][v2] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
  2023-01-25 14:47   ` [Patch][v2] " Tobias Burnus
@ 2023-01-31 11:37     ` Jakub Jelinek
  2023-02-09 14:46       ` Tobias Burnus
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Jelinek @ 2023-01-31 11:37 UTC (permalink / raw)
  To: Tobias Burnus; +Cc: gcc-patches, fortran

On Wed, Jan 25, 2023 at 03:47:18PM +0100, Tobias Burnus wrote:
> updated patch included, i.e. avoiding 'count' for 'j' when a 'j.0' would
> do (i.e. only local var without the different step calculation). I also
> now reject if there is a non-unit step on the loop using an outer var.
> 
> Eventually still to be done: replace the 'sorry' by working code, i.e.
> implement the suggestions to handle some/all non-unit iteration steps as
> proposed in this thread.
> 
> On 20.01.23 18:39, Jakub Jelinek wrote:
> > I think instead of non-unity etc. it is better to talk about constant
> > step 1 or -1.
> 
> I concur.
> 
> 
> > The actual problem with non-simple loops for non-rectangular loops is
> > both in case it is an inner loop which uses some outer loop's iterator,
> > or if it is outer loop whose iterator is used, both of those cases
> > will not be handled properly.
> 
> I have now added a check for the other case as well.
> 
> Just to confirm, the following is fine, isn't it?
> 
> !$omp simd collapse(4)
> do i = 1, 10, 2
>   do outer_var = 1, 10  ! step = + 1
>     do j = 1, 10, 2
>       do inner_var = 1, outer_var  ! step = 1
> 
> i.e. both the inner_var and outer_var have 'step = 1',
> even if other loops in the 'collapse' have step != 1.
> I think it should be fine.

Yes, the loops which don't define outer_var for other loops nor
use outer_var from other loops can be in any form, we can compute
their number of iterations before the whole loop nest for them,
so for the non-rectangular iterations computations we can ignore
those except for multiplication by the pre-computed count.

> OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
> 
> This patch ensures that loop bounds depending on outer loop vars use the
> proper TREE_VEC format. It additionally gives a sorry if such an outer
> var has a non-one/non-minus-one increment as currently a count variable
> is used in this case (see PR).
> 
> Finally, it avoids 'count' and just uses a local loop variable if the
> step increment is +/-1.
> 
> 	PR fortran/107424
> 
> gcc/fortran/ChangeLog:
> 
> 	* trans-openmp.cc (struct dovar_init_d): Add 'sym' and
> 	'non_unit_incr' members.
> 	(gfc_nonrect_loop_expr): New.
> 	(gfc_trans_omp_do): Call it; use normal loop bounds
> 	for unit stride - and only create local loop var.
> 
> libgomp/ChangeLog:
> 
> 	* testsuite/libgomp.fortran/non-rectangular-loop-1.f90: New test.
> 	* testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: New test.
> 	* testsuite/libgomp.fortran/non-rectangular-loop-2.f90: New test.
> 	* testsuite/libgomp.fortran/non-rectangular-loop-3.f90: New test.
> 	* testsuite/libgomp.fortran/non-rectangular-loop-4.f90: New test.
> 	* testsuite/libgomp.fortran/non-rectangular-loop-5.f90: New test.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gfortran.dg/goacc/privatization-1-compute-loop.f90: Update dg-note.
> 	* gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise.
> 
> +static bool
> +gfc_nonrect_loop_expr (stmtblock_t *pblock, gfc_se *sep, int loop_n,
> +		       gfc_code *code, gfc_expr *expr, vec<dovar_init> *inits,
> +		       int simple, gfc_expr *curr_loop_var)
> +{
> +  int i;
> +  for (i = 0; i < loop_n; i++)
> +    {
> +      gcc_assert (code->ext.iterator->var->expr_type == EXPR_VARIABLE);
> +      if (gfc_find_sym_in_expr (code->ext.iterator->var->symtree->n.sym, expr))
> +	break;
> +      code = code->block->next;
> +    }
> +  if (i >= loop_n)
> +    return false;
> +
> +  /* Canonic format: TREE_VEC with [var, multiplier, offset].  */

I think we use everywhere Canonical rather than Canonic

> +  gfc_symbol *var = code->ext.iterator->var->symtree->n.sym;
> +
> +  tree tree_var = NULL_TREE;
> +  tree a1 = integer_one_node;
> +  tree a2 = integer_zero_node;
> +
> +  if (!simple)
> +    {
> +      /* FIXME: Handle non-unit iter steps, cf. PR fortran/107424.  */
> +      sorry_at (gfc_get_location (&curr_loop_var->where),
> +		"non-rectangular loop nest with step other than constant 1 "
> +		"or -1 for %qs", curr_loop_var->symtree->n.sym->name);
> +      return false;
> +    }
> +
> +  dovar_init *di;
> +  unsigned ix;
> +  FOR_EACH_VEC_ELT (*inits, ix, di)
> +    if (di->sym == var && !di->non_unit_iter)
> +      {
> +	tree_var = di->init;
> +	gcc_assert (DECL_P (tree_var));
> +	break;
> +      }
> +    else if (di->sym == var)
> +      {
> +	/* FIXME: Handle non-unit iter steps, cf. PR fortran/107424.  */
> +	sorry_at (gfc_get_location (&code->loc),
> +		"non-rectangular loop nest with step other than constant 1 "
> +		"or -1 for %qs", var->name);
> +	inform (gfc_get_location (&expr->where), "Used here");
> +	return false;
> +      }

I think it would be better formatted as
    if (di->sym == var)
      {
	if (!di->non_unit_iter)
	  {
	    ...
	  }
	else
	  {
	    ...
	  }
      }

> +      if (simple && !DECL_P (dovar))
> +	{
> +	  const char *name = code->ext.iterator->var->symtree->n.sym->name;
> +	  local_dovar = gfc_create_var (type, name);
> +	  dovar_init e = {code->ext.iterator->var->symtree->n.sym,
> +			  dovar, local_dovar, false};
> +	  inits.safe_push (e);
> +	}

For the separate local_dovar case, I'd be worried if we handle lastprivate
right.  From quick skimming I see some lastprivate clauses in some of
the tests, so if they verify the right value has been computed (say the
same as one would get with -fno-openmp), then fine.

	Jakub


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Patch][v2] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
  2023-01-31 11:37     ` Jakub Jelinek
@ 2023-02-09 14:46       ` Tobias Burnus
  2023-02-09 14:49         ` Jakub Jelinek
  0 siblings, 1 reply; 8+ messages in thread
From: Tobias Burnus @ 2023-02-09 14:46 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches, fortran

[-- Attachment #1: Type: text/plain, Size: 7192 bytes --]

Updated patch. Changes: the canonic(al) and the if/else(if) cosmetic
formatting changes.

Testcases: Additionally, I checked for the value of a zero-loop trip in
libgomp.fortran/non-rectangular-loop-1.f90 and added lastprivate to all
of .../non-rectangular-loop-{3,4}.f90 (unless the loop-iteration
variable is a pointer).

At least those three files pass with and without -fopenmp, implying that
there the lastprivate check is correct. ('1a' and '5' also use
lastprivate, but those fail with a 'sorry' before a value test can be
done.) In '1' there are also several tests with 'parallel do', but only
SIMD has been tested more extensively (in general and for lastprivate).

I think the test coverage should be sufficient. Any further test idea?
Otherwise, I would commit it now.

Tobias

PS: Next planned by me: revising the 'omp loop' patch and a longer
pending 'map(alloc:' patch. And at some point supporting at least some
non-±1 increments with non-rect loop nests.

On 31.01.23 12:37, Jakub Jelinek wrote:
> On Wed, Jan 25, 2023 at 03:47:18PM +0100, Tobias Burnus wrote:
>> updated patch included, i.e. avoiding 'count' for 'j' when a 'j.0' would
>> do (i.e. only local var without the different step calculation). I also
>> now reject if there is a non-unit step on the loop using an outer var.
>>
>> Eventually still to be done: replace the 'sorry' by working code, i.e.
>> implement the suggestions to handle some/all non-unit iteration steps as
>> proposed in this thread.
>>
>> On 20.01.23 18:39, Jakub Jelinek wrote:
>>> I think instead of non-unity etc. it is better to talk about constant
>>> step 1 or -1.
>> I concur.
>>
>>
>>> The actual problem with non-simple loops for non-rectangular loops is
>>> both in case it is an inner loop which uses some outer loop's iterator,
>>> or if it is outer loop whose iterator is used, both of those cases
>>> will not be handled properly.
>> I have now added a check for the other case as well.
>>
>> Just to confirm, the following is fine, isn't it?
>>
>> !$omp simd collapse(4)
>> do i = 1, 10, 2
>>    do outer_var = 1, 10  ! step = + 1
>>      do j = 1, 10, 2
>>        do inner_var = 1, outer_var  ! step = 1
>>
>> i.e. both the inner_var and outer_var have 'step = 1',
>> even if other loops in the 'collapse' have step != 1.
>> I think it should be fine.
> Yes, the loops which don't define outer_var for other loops nor
> use outer_var from other loops can be in any form, we can compute
> their number of iterations before the whole loop nest for them,
> so for the non-rectangular iterations computations we can ignore
> those except for multiplication by the pre-computed count.
>
>> OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
>>
>> This patch ensures that loop bounds depending on outer loop vars use the
>> proper TREE_VEC format. It additionally gives a sorry if such an outer
>> var has a non-one/non-minus-one increment as currently a count variable
>> is used in this case (see PR).
>>
>> Finally, it avoids 'count' and just uses a local loop variable if the
>> step increment is +/-1.
>>
>>      PR fortran/107424
>>
>> gcc/fortran/ChangeLog:
>>
>>      * trans-openmp.cc (struct dovar_init_d): Add 'sym' and
>>      'non_unit_incr' members.
>>      (gfc_nonrect_loop_expr): New.
>>      (gfc_trans_omp_do): Call it; use normal loop bounds
>>      for unit stride - and only create local loop var.
>>
>> libgomp/ChangeLog:
>>
>>      * testsuite/libgomp.fortran/non-rectangular-loop-1.f90: New test.
>>      * testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: New test.
>>      * testsuite/libgomp.fortran/non-rectangular-loop-2.f90: New test.
>>      * testsuite/libgomp.fortran/non-rectangular-loop-3.f90: New test.
>>      * testsuite/libgomp.fortran/non-rectangular-loop-4.f90: New test.
>>      * testsuite/libgomp.fortran/non-rectangular-loop-5.f90: New test.
>>
>> gcc/testsuite/ChangeLog:
>>
>>      * gfortran.dg/goacc/privatization-1-compute-loop.f90: Update dg-note.
>>      * gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise.
>>
>> +static bool
>> +gfc_nonrect_loop_expr (stmtblock_t *pblock, gfc_se *sep, int loop_n,
>> +                   gfc_code *code, gfc_expr *expr, vec<dovar_init> *inits,
>> +                   int simple, gfc_expr *curr_loop_var)
>> +{
>> +  int i;
>> +  for (i = 0; i < loop_n; i++)
>> +    {
>> +      gcc_assert (code->ext.iterator->var->expr_type == EXPR_VARIABLE);
>> +      if (gfc_find_sym_in_expr (code->ext.iterator->var->symtree->n.sym, expr))
>> +    break;
>> +      code = code->block->next;
>> +    }
>> +  if (i >= loop_n)
>> +    return false;
>> +
>> +  /* Canonic format: TREE_VEC with [var, multiplier, offset].  */
> I think we use everywhere Canonical rather than Canonic
>
>> +  gfc_symbol *var = code->ext.iterator->var->symtree->n.sym;
>> +
>> +  tree tree_var = NULL_TREE;
>> +  tree a1 = integer_one_node;
>> +  tree a2 = integer_zero_node;
>> +
>> +  if (!simple)
>> +    {
>> +      /* FIXME: Handle non-unit iter steps, cf. PR fortran/107424.  */
>> +      sorry_at (gfc_get_location (&curr_loop_var->where),
>> +            "non-rectangular loop nest with step other than constant 1 "
>> +            "or -1 for %qs", curr_loop_var->symtree->n.sym->name);
>> +      return false;
>> +    }
>> +
>> +  dovar_init *di;
>> +  unsigned ix;
>> +  FOR_EACH_VEC_ELT (*inits, ix, di)
>> +    if (di->sym == var && !di->non_unit_iter)
>> +      {
>> +    tree_var = di->init;
>> +    gcc_assert (DECL_P (tree_var));
>> +    break;
>> +      }
>> +    else if (di->sym == var)
>> +      {
>> +    /* FIXME: Handle non-unit iter steps, cf. PR fortran/107424.  */
>> +    sorry_at (gfc_get_location (&code->loc),
>> +            "non-rectangular loop nest with step other than constant 1 "
>> +            "or -1 for %qs", var->name);
>> +    inform (gfc_get_location (&expr->where), "Used here");
>> +    return false;
>> +      }
> I think it would be better formatted as
>      if (di->sym == var)
>        {
>       if (!di->non_unit_iter)
>         {
>           ...
>         }
>       else
>         {
>           ...
>         }
>        }
>
>> +      if (simple && !DECL_P (dovar))
>> +    {
>> +      const char *name = code->ext.iterator->var->symtree->n.sym->name;
>> +      local_dovar = gfc_create_var (type, name);
>> +      dovar_init e = {code->ext.iterator->var->symtree->n.sym,
>> +                      dovar, local_dovar, false};
>> +      inits.safe_push (e);
>> +    }
> For the separate local_dovar case, I'd be worried if we handle lastprivate
> right.  From quick skimming I see some lastprivate clauses in some of
> the tests, so if they verify the right value has been computed (say the
> same as one would get with -fno-openmp), then fine.
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Attachment #2: omp-non-rect-v3.diff --]
[-- Type: text/x-patch, Size: 68390 bytes --]

OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]

This patch ensures that loop bounds depending on outer loop vars use the
proper TREE_VEC format. It additionally gives a sorry if such an outer
var has a non-one/non-minus-one increment as currently a count variable
is used in this case (see PR).

Finally, it avoids 'count' and just uses a local loop variable if the
step increment is +/-1.

	PR fortran/107424

gcc/fortran/ChangeLog:

	* trans-openmp.cc (struct dovar_init_d): Add 'sym' and
	'non_unit_incr' members.
	(gfc_nonrect_loop_expr): New.
	(gfc_trans_omp_do): Call it; use normal loop bounds
	for unit stride - and only create local loop var.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/non-rectangular-loop-1.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-2.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-3.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-4.f90: New test.
	* testsuite/libgomp.fortran/non-rectangular-loop-5.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/goacc/privatization-1-compute-loop.f90: Update dg-note.
	* gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise.

 gcc/fortran/trans-openmp.cc                        | 241 ++++++--
 .../goacc/privatization-1-compute-loop.f90         |   6 +-
 .../goacc/privatization-1-routine_gang-loop.f90    |   3 +-
 .../libgomp.fortran/non-rectangular-loop-1.f90     | 668 +++++++++++++++++++++
 .../libgomp.fortran/non-rectangular-loop-1a.f90    | 374 ++++++++++++
 .../libgomp.fortran/non-rectangular-loop-2.f90     | 243 ++++++++
 .../libgomp.fortran/non-rectangular-loop-3.f90     | 212 +++++++
 .../libgomp.fortran/non-rectangular-loop-4.f90     | 215 +++++++
 .../libgomp.fortran/non-rectangular-loop-5.f90     |  28 +
 9 files changed, 1941 insertions(+), 49 deletions(-)

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 5283d0ce5f3..2d16f3be8ea 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -5116,10 +5116,138 @@ gfc_trans_omp_critical (gfc_code *code)
 }
 
 typedef struct dovar_init_d {
+  gfc_symbol *sym;
   tree var;
   tree init;
+  bool non_unit_iter;
 } dovar_init;
 
+static bool
+gfc_nonrect_loop_expr (stmtblock_t *pblock, gfc_se *sep, int loop_n,
+		       gfc_code *code, gfc_expr *expr, vec<dovar_init> *inits,
+		       int simple, gfc_expr *curr_loop_var)
+{
+  int i;
+  for (i = 0; i < loop_n; i++)
+    {
+      gcc_assert (code->ext.iterator->var->expr_type == EXPR_VARIABLE);
+      if (gfc_find_sym_in_expr (code->ext.iterator->var->symtree->n.sym, expr))
+	break;
+      code = code->block->next;
+    }
+  if (i >= loop_n)
+    return false;
+
+  /* Canonical format: TREE_VEC with [var, multiplier, offset].  */
+  gfc_symbol *var = code->ext.iterator->var->symtree->n.sym;
+
+  tree tree_var = NULL_TREE;
+  tree a1 = integer_one_node;
+  tree a2 = integer_zero_node;
+
+  if (!simple)
+    {
+      /* FIXME: Handle non-unit iter steps, cf. PR fortran/107424.  */
+      sorry_at (gfc_get_location (&curr_loop_var->where),
+		"non-rectangular loop nest with step other than constant 1 "
+		"or -1 for %qs", curr_loop_var->symtree->n.sym->name);
+      return false;
+    }
+
+  dovar_init *di;
+  unsigned ix;
+  FOR_EACH_VEC_ELT (*inits, ix, di)
+    if (di->sym == var)
+      {
+	if (!di->non_unit_iter)
+	  {
+	    tree_var = di->init;
+	    gcc_assert (DECL_P (tree_var));
+	    break;
+	  }
+	else
+	  {
+	    /* FIXME: Handle non-unit iter steps, cf. PR fortran/107424.  */
+	    sorry_at (gfc_get_location (&code->loc),
+		      "non-rectangular loop nest with step other than constant "
+		      "1 or -1 for %qs", var->name);
+	    inform (gfc_get_location (&expr->where), "Used here");
+	    return false;
+	  }
+      }
+  if (tree_var == NULL_TREE)
+    tree_var = var->backend_decl;
+
+  if (expr->expr_type == EXPR_VARIABLE)
+    gcc_assert (expr->symtree->n.sym == var);
+  else if (expr->expr_type != EXPR_OP
+	   || (expr->value.op.op != INTRINSIC_TIMES
+	       && expr->value.op.op != INTRINSIC_PLUS
+	       && expr->value.op.op != INTRINSIC_MINUS))
+    gcc_unreachable ();
+  else
+    {
+      gfc_se se;
+      gfc_expr *et = NULL, *eo = NULL, *e = expr;
+      if (expr->value.op.op != INTRINSIC_TIMES)
+	{
+	  if (gfc_find_sym_in_expr (var, expr->value.op.op1))
+	    {
+	      e = expr->value.op.op1;
+	      eo = expr->value.op.op2;
+	    }
+	  else
+	    {
+	      eo = expr->value.op.op1;
+	      e = expr->value.op.op2;
+	    }
+	}
+      if (e->value.op.op == INTRINSIC_TIMES)
+	{
+	  if (e->value.op.op1->expr_type == EXPR_VARIABLE
+	      && e->value.op.op1->symtree->n.sym == var)
+	    et = e->value.op.op2;
+	  else
+	    {
+	      et = e->value.op.op1;
+	      gcc_assert (e->value.op.op2->expr_type == EXPR_VARIABLE
+			  && e->value.op.op2->symtree->n.sym == var);
+	    }
+	}
+      else
+	gcc_assert (e->expr_type == EXPR_VARIABLE && e->symtree->n.sym == var);
+      if (et != NULL)
+	{
+	  gfc_init_se (&se, NULL);
+	  gfc_conv_expr_val (&se, et);
+	  gfc_add_block_to_block (pblock, &se.pre);
+	  a1 = se.expr;
+	}
+      if (eo != NULL)
+	{
+	  gfc_init_se (&se, NULL);
+	  gfc_conv_expr_val (&se, eo);
+	  gfc_add_block_to_block (pblock, &se.pre);
+	  a2 = se.expr;
+	  if (expr->value.op.op == INTRINSIC_MINUS && expr->value.op.op2 == eo)
+	    /* outer-var - a2.  */
+	    a2 = fold_build1 (NEGATE_EXPR, TREE_TYPE (a2), a2);
+	  else if (expr->value.op.op == INTRINSIC_MINUS)
+	    /* a2 - outer-var.  */
+	    a1 = fold_build1 (NEGATE_EXPR, TREE_TYPE (a1), a1);
+	}
+      a1 = DECL_P (a1) ? a1 : gfc_evaluate_now (a1, pblock);
+      a2 = DECL_P (a2) ? a2 : gfc_evaluate_now (a2, pblock);
+    }
+
+  gfc_init_se (sep, NULL);
+  sep->expr = make_tree_vec (3);
+  TREE_VEC_ELT (sep->expr, 0) = tree_var;
+  TREE_VEC_ELT (sep->expr, 1) = fold_convert (TREE_TYPE (tree_var), a1);
+  TREE_VEC_ELT (sep->expr, 2) = fold_convert (TREE_TYPE (tree_var), a2);
+
+  return true;
+}
 
 static tree
 gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
@@ -5127,7 +5255,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 {
   gfc_se se;
   tree dovar, stmt, from, to, step, type, init, cond, incr, orig_decls;
-  tree count = NULL_TREE, cycle_label, tmp, omp_clauses;
+  tree local_dovar = NULL_TREE, cycle_label, tmp, omp_clauses;
   stmtblock_t block;
   stmtblock_t body;
   gfc_omp_clauses *clauses = code->ext.omp_clauses;
@@ -5214,52 +5342,72 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
       gfc_init_se (&se, NULL);
       gfc_conv_expr_lhs (&se, code->ext.iterator->var);
       gfc_add_block_to_block (pblock, &se.pre);
-      dovar = se.expr;
+      local_dovar = dovar_decl = dovar = se.expr;
       type = TREE_TYPE (dovar);
       gcc_assert (TREE_CODE (type) == INTEGER_TYPE);
 
       gfc_init_se (&se, NULL);
-      gfc_conv_expr_val (&se, code->ext.iterator->start);
+      gfc_conv_expr_val (&se, code->ext.iterator->step);
       gfc_add_block_to_block (pblock, &se.pre);
-      from = gfc_evaluate_now (se.expr, pblock);
+      step = gfc_evaluate_now (se.expr, pblock);
 
-      gfc_init_se (&se, NULL);
-      gfc_conv_expr_val (&se, code->ext.iterator->end);
-      gfc_add_block_to_block (pblock, &se.pre);
-      to = gfc_evaluate_now (se.expr, pblock);
+      if (integer_onep (step))
+	simple = 1;
+      else if (tree_int_cst_equal (step, integer_minus_one_node))
+	simple = -1;
 
       gfc_init_se (&se, NULL);
-      gfc_conv_expr_val (&se, code->ext.iterator->step);
-      gfc_add_block_to_block (pblock, &se.pre);
-      step = gfc_evaluate_now (se.expr, pblock);
-      dovar_decl = dovar;
+      if (!clauses->non_rectangular
+	  || !gfc_nonrect_loop_expr (pblock, &se, i, orig_code->block->next,
+				     code->ext.iterator->start, &inits, simple,
+				     code->ext.iterator->var))
+	{
+	  gfc_conv_expr_val (&se, code->ext.iterator->start);
+	  gfc_add_block_to_block (pblock, &se.pre);
+	  if (!DECL_P (se.expr))
+	    se.expr = gfc_evaluate_now (se.expr, pblock);
+	}
+      from = se.expr;
 
-      /* Special case simple loops.  */
-      if (VAR_P (dovar))
+      gfc_init_se (&se, NULL);
+      if (!clauses->non_rectangular
+	  || !gfc_nonrect_loop_expr (pblock, &se, i, orig_code->block->next,
+				     code->ext.iterator->end, &inits, simple,
+				     code->ext.iterator->var))
 	{
-	  if (integer_onep (step))
-	    simple = 1;
-	  else if (tree_int_cst_equal (step, integer_minus_one_node))
-	    simple = -1;
+	  gfc_conv_expr_val (&se, code->ext.iterator->end);
+	  gfc_add_block_to_block (pblock, &se.pre);
+	  if (!DECL_P (se.expr))
+	    se.expr = gfc_evaluate_now (se.expr, pblock);
 	}
-      else
+      to = se.expr;
+
+      if (!DECL_P (dovar))
 	dovar_decl
 	  = gfc_trans_omp_variable (code->ext.iterator->var->symtree->n.sym,
 				    false);
-
+      if (simple && !DECL_P (dovar))
+	{
+	  const char *name = code->ext.iterator->var->symtree->n.sym->name;
+	  local_dovar = gfc_create_var (type, name);
+	  dovar_init e = {code->ext.iterator->var->symtree->n.sym,
+			  dovar, local_dovar, false};
+	  inits.safe_push (e);
+	}
       /* Loop body.  */
       if (simple)
 	{
-	  TREE_VEC_ELT (init, i) = build2_v (MODIFY_EXPR, dovar, from);
+	  TREE_VEC_ELT (init, i) = build2_v (MODIFY_EXPR, local_dovar, from);
 	  /* The condition should not be folded.  */
 	  TREE_VEC_ELT (cond, i) = build2_loc (input_location, simple > 0
 					       ? LE_EXPR : GE_EXPR,
-					       logical_type_node, dovar, to);
+					       logical_type_node, local_dovar,
+					       to);
 	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location, PLUS_EXPR,
-						    type, dovar, step);
+						    type, local_dovar, step);
 	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location,
 						    MODIFY_EXPR,
-						    type, dovar,
+						    type, local_dovar,
 						    TREE_VEC_ELT (incr, i));
 	  if (orig_decls && !clauses->orderedc)
 	    orig_decls = NULL;
@@ -5280,24 +5428,27 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 	  tmp = fold_build2_loc (input_location, TRUNC_DIV_EXPR, type, tmp,
 				 step);
 	  tmp = gfc_evaluate_now (tmp, pblock);
-	  count = gfc_create_var (type, "count");
-	  TREE_VEC_ELT (init, i) = build2_v (MODIFY_EXPR, count,
+	  local_dovar = gfc_create_var (type, "count");
+	  TREE_VEC_ELT (init, i) = build2_v (MODIFY_EXPR, local_dovar,
 					     build_int_cst (type, 0));
 	  /* The condition should not be folded.  */
 	  TREE_VEC_ELT (cond, i) = build2_loc (input_location, LT_EXPR,
 					       logical_type_node,
-					       count, tmp);
+					       local_dovar, tmp);
 	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location, PLUS_EXPR,
-						    type, count,
+						    type, local_dovar,
 						    build_int_cst (type, 1));
 	  TREE_VEC_ELT (incr, i) = fold_build2_loc (input_location,
-						    MODIFY_EXPR, type, count,
+						    MODIFY_EXPR, type,
+						    local_dovar,
 						    TREE_VEC_ELT (incr, i));
 
 	  /* Initialize DOVAR.  */
-	  tmp = fold_build2_loc (input_location, MULT_EXPR, type, count, step);
+	  tmp = fold_build2_loc (input_location, MULT_EXPR, type, local_dovar,
+				 step);
 	  tmp = fold_build2_loc (input_location, PLUS_EXPR, type, from, tmp);
-	  dovar_init e = {dovar, tmp};
+	  dovar_init e = {code->ext.iterator->var->symtree->n.sym,
+			  dovar, tmp, true};
 	  inits.safe_push (e);
 	  if (clauses->orderedc)
 	    {
@@ -5312,7 +5463,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
       if (dovar_found == 3
 	  && op == EXEC_OMP_SIMD
 	  && collapse == 1
-	  && !simple)
+	  && local_dovar != dovar)
 	{
 	  for (tmp = omp_clauses; tmp; tmp = OMP_CLAUSE_CHAIN (tmp))
 	    if (OMP_CLAUSE_CODE (tmp) == OMP_CLAUSE_LINEAR
@@ -5331,11 +5482,11 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 	      OMP_CLAUSE_LINEAR_NO_COPYIN (tmp) = 1;
 	      OMP_CLAUSE_DECL (tmp) = dovar_decl;
 	      omp_clauses = gfc_trans_add_clause (tmp, omp_clauses);
+	      if (local_dovar != dovar)
+		dovar_found = 3;
 	    }
-	  if (!simple)
-	    dovar_found = 3;
 	}
-      else if (!dovar_found && !simple)
+      else if (!dovar_found && local_dovar != dovar)
 	{
 	  tmp = build_omp_clause (input_location, OMP_CLAUSE_PRIVATE);
 	  OMP_CLAUSE_DECL (tmp) = dovar_decl;
@@ -5346,7 +5497,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 	  tree c = NULL;
 
 	  tmp = NULL;
-	  if (!simple)
+	  if (local_dovar != dovar)
 	    {
 	      /* If dovar is lastprivate, but different counter is used,
 		 dovar += step needs to be added to
@@ -5356,21 +5507,19 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 	      if (clauses->orderedc)
 		{
 		  if (clauses->collapse <= 1 || i >= clauses->collapse)
-		    tmp = count;
+		    tmp = local_dovar;
 		  else
 		    tmp = fold_build2_loc (input_location, PLUS_EXPR,
-					   type, count, build_one_cst (type));
+					   type, local_dovar,
+					   build_one_cst (type));
 		  tmp = fold_build2_loc (input_location, MULT_EXPR, type,
 					 tmp, step);
 		  tmp = fold_build2_loc (input_location, PLUS_EXPR, type,
 					 from, tmp);
 		}
 	      else
-		{
-		  tmp = gfc_evaluate_now (step, pblock);
-		  tmp = fold_build2_loc (input_location, PLUS_EXPR, type,
-					 dovar, tmp);
-		}
+		tmp = fold_build2_loc (input_location, PLUS_EXPR, type,
+				       dovar, step);
 	      tmp = fold_build2_loc (input_location, MODIFY_EXPR, type,
 				     dovar, tmp);
 	      for (c = omp_clauses; c ; c = OMP_CLAUSE_CHAIN (c))
@@ -5405,9 +5554,9 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 		    break;
 		  }
 	    }
-	  gcc_assert (simple || c != NULL);
+	  gcc_assert (local_dovar == dovar || c != NULL);
 	}
-      if (!simple)
+      if (local_dovar != dovar)
 	{
 	  if (op != EXEC_OMP_SIMD || dovar_found == 1)
 	    tmp = build_omp_clause (input_location, OMP_CLAUSE_PRIVATE);
@@ -5420,7 +5569,7 @@ gfc_trans_omp_do (gfc_code *code, gfc_exec_op op, stmtblock_t *pblock,
 	    }
 	  else
 	    tmp = build_omp_clause (input_location, OMP_CLAUSE_LASTPRIVATE);
-	  OMP_CLAUSE_DECL (tmp) = count;
+	  OMP_CLAUSE_DECL (tmp) = local_dovar;
 	  omp_clauses = gfc_trans_add_clause (tmp, omp_clauses);
 	}
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90 b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90
index 13772c185ce..ad5e11abf91 100644
--- a/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute-loop.f90
@@ -47,8 +47,10 @@ contains
        end do
     end do
     !$acc end parallel
-    ! { dg-note {variable 'count\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
-    ! { dg-note {variable 'count\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'i\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'j\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_compute$c_compute }
+    ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'j\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'a' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
diff --git a/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang-loop.f90 b/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang-loop.f90
index 6878d856919..c5c2f2b9845 100644
--- a/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang-loop.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-routine_gang-loop.f90
@@ -46,7 +46,8 @@ contains
           y = a
        end do
     end do
-    ! { dg-note {variable 'count\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'i\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
+    ! { dg-note {variable 'j\.[0-9]+' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'i' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'j' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
     ! { dg-note {variable 'a' in 'private' clause isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } l_loop$c_loop }
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1.f90
new file mode 100644
index 00000000000..dbbd18a1444
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1.f90
@@ -0,0 +1,668 @@
+! { dg-do run }
+! { dg-additional-options "-msse2" { target sse2_runtime } }
+! { dg-additional-options "-mavx" { target avx_runtime } }
+
+! PR fortran/107424
+
+! Nonrectangular loop nests checks
+
+! See PR or non-rectangular-loop-1a.f90 for the commented tests
+! Hint: Those use step for loop vars part of nonrectangular loop nests
+
+module m
+  implicit none (type, external)
+contains
+
+! The 'k' loop uses i or j as start value
+! but a constant end value such that 'lastprivate'
+! should be well-defined
+subroutine lastprivate_check_simd_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unit step on 'i' or on 'j' -> 4 loops
+  ! Then same, except use non-unit step for 'k'
+
+!  !$omp simd collapse(3) lastprivate(k)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp simd collapse(3) lastprivate(k)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) then
+          print *, i, j, k,p, " -> i, j, k, p   (k < 1 - 41 .or. k > p)"
+          error stop
+        end if
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  k = -43
+  m = 0
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) then
+          print *, i, j, k,p, " -> i, j, k, p   (k < 1 - 41 .or. k > p)"
+          error stop
+        end if
+      end do
+    end do
+  end do
+  if (k /= -43) error stop
+
+  m = 23
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  n = -5
+  k = - 70
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= -70) error stop
+
+  n = 11
+
+  ! Same but 'private' for all (i,j) vars
+
+!  !$omp simd collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!
+!  !$omp simd collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+!  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 1 .or. j /= m + 2) error stop
+
+!  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+end subroutine lastprivate_check_simd_1
+
+
+! Same but with do simd
+subroutine lastprivate_check_do_simd_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unit step on 'i' or on 'j' -> 4 loops
+  ! Then same, except use non-unit step for 'k'
+
+!  !$omp parallel do simd collapse(3) lastprivate(k)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp parallel do simd collapse(3) lastprivate(k)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+!  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+!  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 1 .or. j /= m + 2) error stop
+
+!  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+end subroutine lastprivate_check_do_simd_1
+
+
+
+! Same but with do
+subroutine lastprivate_check_do_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unit step on 'i' or on 'j' -> 4 loops
+  ! Then same, except use non-unit step for 'k'
+
+!  !$omp parallel do collapse(3) lastprivate(k)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp parallel do collapse(3) lastprivate(k)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+!  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+!  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+!  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n
+!    do j = 1, m, 2
+!      do k = j - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 1 .or. j /= m + 2) error stop
+
+!  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+!  do i = 1, n, 2
+!    do j = 1, m
+!      do k = i - 41, p
+!        if (k < 1 - 41 .or. k > p) error stop
+!      end do
+!    end do
+!  end do
+!  if (k /= p + 1) error stop
+!  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2
+    do j = 1, m
+      do k = j - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2
+      do k = i - 41, p
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+end subroutine lastprivate_check_do_1
+
+
+
+subroutine lastprivate_check_2
+  integer :: n,m,p, i,j,k,ll
+
+  n = 11
+  m = 23
+  p = 27
+
+!  !$omp parallel do simd collapse(3) lastprivate(p)
+!  do i = 1, n
+!    do j = 1, m,2
+!      do k = 1, j + 41
+!        do ll = 1, p, 2
+!          if (k > 23 + 41 .or. k < 1) error stop
+!        end do
+!      end do
+!    end do
+!  end do
+!  if (ll /= 29) error stop
+
+!  !$omp simd collapse(3) lastprivate(p)
+!  do i = 1, n
+!    do j = 1, m,2
+!      do k = 1, j + 41
+!        do ll = 1, p, 2
+!          if (k > 23 + 41 .or. k < 1) error stop
+!        end do
+!      end do
+!    end do
+!  end do
+!  if (ll /= 29) error stop
+
+!  !$omp simd collapse(3) lastprivate(k)
+!  do i = 1, n,2
+!    do j = 1, m
+!      do k = 1, i + 41
+!        if (k > 11 + 41 .or. k < 1) error stop
+!      end do
+!    end do
+!  end do
+!if (k /= 53) then
+!  print *, k, 53
+!  error stop
+!endif
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n,2
+  do j = 1, m
+    do k = 1, j + 41
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n
+  do j = 1, m,2
+    do k = 1, i + 41
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+! - Same but without 'private':
+!!$omp simd collapse(3) lastprivate(k)
+!do i = 1, n
+!  do j = 1, m,2
+!    do k = 1, j + 41
+!      if (k > 23 + 41 .or. k < 1) error stop
+!    end do
+!  end do
+!end do
+!if (k /= 65) then
+!  print *, k, 65
+!  error stop
+!endif
+
+
+!!$omp simd collapse(3) lastprivate(k)
+!do i = 1, n,2
+!  do j = 1, m
+!    do k = 1, i + 41
+!      if (k > 11 + 41 .or. k < 1) error stop
+!    end do
+!  end do
+!end do
+!if (k /= 53) then
+!  print *, k, 53
+!  error stop
+!endif
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n,2
+  do j = 1, m
+    do k = 1, j + 41
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n
+  do j = 1, m,2
+    do k = 1, i + 41
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+! - all with lastprivate
+!!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+!do i = 1, n
+!  do j = 1, m,2
+!    do k = 1, j + 41
+!      if (k > 23 + 41 .or. k < 1) error stop
+!    end do
+!  end do
+!end do
+!if (k /= 65) then
+!  print *, k, 65
+!  error stop
+!endif
+
+
+!!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+!do i = 1, n,2
+!  do j = 1, m
+!    do k = 1, i + 41
+!      if (k > 11 + 41 .or. k < 1) error stop
+!    end do
+!  end do
+!end do
+!if (k /= 53) then
+!  print *, k, 53
+!  error stop
+!endif
+
+!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+do i = 1, n,2
+  do j = 1, m
+    do k = 1, j + 41
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+do i = 1, n
+  do j = 1, m,2
+    do k = 1, i + 41
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+end
+end module m
+
+program main
+  use m
+  implicit none (type, external)
+  call lastprivate_check_simd_1
+  call lastprivate_check_do_simd_1
+  call lastprivate_check_do_1
+  call lastprivate_check_2
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1a.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1a.f90
new file mode 100644
index 00000000000..77aa887942e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-1a.f90
@@ -0,0 +1,374 @@
+! { dg-do compile }
+! { dg-additional-options "-msse2" { target sse2_runtime } }
+! { dg-additional-options "-mavx" { target avx_runtime } }
+
+! PR fortran/107424
+
+! Nonrectangular loop nests checks
+
+! ========================================================
+! NOTE: The testcases are from non-rectangular-loop-1.f90,
+! but commented there. Feel free to remove this
+! file + uncomment them in non-rectangular-loop-1.f90
+! Otherwise, you need to change it to 'dg-do run'!
+! ========================================================
+
+module m
+  implicit none (type, external)
+contains
+
+! The 'k' loop uses i or j as start value
+! but a constant end value such that 'lastprivate'
+! should be well-defined
+subroutine lastprivate_check_simd_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unit step on 'i' or on 'j' -> 4 loops
+  ! Then same, except use non-unit step for 'k'
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+  !$omp simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+
+  !$omp simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+end subroutine lastprivate_check_simd_1
+
+
+! Same but with do simd
+subroutine lastprivate_check_do_simd_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unit step on 'i' or on 'j' -> 4 loops
+  ! Then same, except use non-unit step for 'k'
+
+  !$omp parallel do simd collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+
+  !$omp parallel do simd collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+end subroutine lastprivate_check_do_simd_1
+
+
+
+! Same but with do
+subroutine lastprivate_check_do_1
+  integer :: n,m,p, i,j,k
+
+  n = 11
+  m = 23
+  p = 27
+
+  ! Use 'i' or 'j', unit step on 'i' or on 'j' -> 4 loops
+  ! Then same, except use non-unit step for 'k'
+
+  !$omp parallel do collapse(3) lastprivate(k)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same but 'private' for all (i,j) vars
+
+  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) private(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+
+  ! Same - but with lastprivate(i,j)
+
+  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n
+    do j = 1, m, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = j - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 1 .or. j /= m + 2) error stop
+
+  !$omp parallel do collapse(3) lastprivate(k) lastprivate(i,j)
+  do i = 1, n, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = i - 41, p  ! { dg-note "Used here" }
+        if (k < 1 - 41 .or. k > p) error stop
+      end do
+    end do
+  end do
+  if (k /= p + 1) error stop
+  if (i /= n + 2 .or. j /= m + 1) error stop
+
+end subroutine lastprivate_check_do_1
+
+
+
+subroutine lastprivate_check_2
+  integer :: n,m,p, i,j,k,ll
+
+  n = 11
+  m = 23
+  p = 27
+
+  !$omp parallel do simd collapse(3) lastprivate(p)
+  do i = 1, n
+    do j = 1, m,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = 1, j + 41  ! { dg-note "Used here" }
+        do ll = 1, p, 2
+          if (k > 23 + 41 .or. k < 1) error stop
+        end do
+      end do
+    end do
+  end do
+  if (ll /= 29) error stop
+
+  !$omp simd collapse(3) lastprivate(p)
+  do i = 1, n
+    do j = 1, m,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+      do k = 1, j + 41  ! { dg-note "Used here" }
+        do ll = 1, p, 2
+          if (k > 23 + 41 .or. k < 1) error stop
+        end do
+      end do
+    end do
+  end do
+  if (ll /= 29) error stop
+
+  !$omp simd collapse(3) lastprivate(k)
+  do i = 1, n,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+    do j = 1, m
+      do k = 1, i + 41  ! { dg-note "Used here" }
+        if (k > 11 + 41 .or. k < 1) error stop
+      end do
+    end do
+  end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+! - Same but without 'private':
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n
+  do j = 1, m,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+    do k = 1, j + 41  ! { dg-note "Used here" }
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k)
+do i = 1, n,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+  do j = 1, m
+    do k = 1, i + 41  ! { dg-note "Used here" }
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+! - all with lastprivate
+!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+do i = 1, n
+  do j = 1, m,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+    do k = 1, j + 41  ! { dg-note "Used here" }
+      if (k > 23 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 65) then
+  print *, k, 65
+  error stop
+endif
+
+
+!$omp simd collapse(3) lastprivate(k) lastprivate(i, j)
+do i = 1, n,2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'i'" }
+  do j = 1, m
+    do k = 1, i + 41  ! { dg-note "Used here" }
+      if (k > 11 + 41 .or. k < 1) error stop
+    end do
+  end do
+end do
+if (k /= 53) then
+  print *, k, 53
+  error stop
+endif
+
+end
+end module m
+
+program main
+  use m
+  implicit none (type, external)
+  call lastprivate_check_simd_1
+  call lastprivate_check_do_simd_1
+  call lastprivate_check_do_1
+  call lastprivate_check_2
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-2.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-2.f90
new file mode 100644
index 00000000000..0cea61e5f0d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-2.f90
@@ -0,0 +1,243 @@
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-original -fcheck=all" }
+
+! PR fortran/107424
+
+! Nonrectangular loop nests checks
+
+! Valid patterns are:
+!  (1)  a2 - var-outer
+!  (2)  a1 * var-outer
+!  (3)  a1 * var-outer + a2
+!  (4)  a2 + a1 * var-outer
+!  (5)  a1 * var-outer - a2
+!  (6)  a2 - a1 * var-outer
+!  (7)  var-outer * a1
+!  (8)  var-outer * a1 + a2
+!  (9)  a2 + var-outer * a1
+! (10)  var-outer * a1 - a2
+! (11)  a2 - var-outer * a1
+
+module m
+contains
+
+
+! { dg-final { scan-tree-dump-times "for \\(one_two_inner = one_two_outer \\* -1 \\+ one_a2; one_two_inner <= one_two_outer \\* two_a1 \\+ 0; one_two_inner = one_two_inner \\+ 1\\)" 1 original } }
+
+!  (1)  a2 - var-outer
+!  (2)  a1 * var-outer
+subroutine one_two()
+  implicit none
+  integer :: one_a2
+  integer :: two_a1
+  integer :: one_two_outer, one_two_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  one_a2 = 13
+  two_a1 = 5
+  allocate(var(1:10, one_a2 - 10:two_a1 * 10), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do one_two_outer = 1, 10
+    do one_two_inner = one_a2 - one_two_outer, two_a1 * one_two_outer
+      !$omp atomic update
+      var(one_two_outer,one_two_inner) = var(one_two_outer,one_two_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = one_a2 - i, two_a1 * i
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(three_four_inner = three_four_outer \\* three_a1 \\+ three_a2; three_four_inner <= three_four_outer \\* four_a1 \\+ four_a2; three_four_inner = three_four_inner \\+ 1\\)" 1 original } }
+
+!  (3)  a1 * var-outer + a2
+!  (4)  a2 + a1 * var-outer
+subroutine three_four()
+  implicit none
+  integer :: three_a1, three_a2
+  integer :: four_a1, four_a2
+  integer :: three_four_outer, three_four_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  three_a1 = 2
+  three_a2 = 3
+  four_a1 = 3
+  four_a2 = 5
+  allocate(var(1:10, three_a1 * 1 + three_a2:four_a2 + four_a1 * 10), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do three_four_outer = 1, 10
+    do three_four_inner = three_a1 * three_four_outer + three_a2, four_a2 + four_a1 * three_four_outer
+      !$omp atomic update
+      var(three_four_outer, three_four_inner) = var(three_four_outer, three_four_inner) + 2
+    end do
+  end do
+  do i = 1, 10
+    do j = three_a1 * i + three_a2, four_a2 + four_a1 * i
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(five_six_inner = five_six_outer \\* five_a1 \\+ D\\.\[0-9\]+; five_six_inner <= five_six_outer \\* D\\.\[0-9\]+ \\+ six_a2; five_six_inner = five_six_inner \\+ 1\\)" 1 original } }
+
+!  (5)  a1 * var-outer - a2
+!  (6)  a2 - a1 * var-outer
+subroutine five_six()
+  implicit none
+  integer :: five_a1, five_a2
+  integer :: six_a1, six_a2
+  integer :: five_six_outer, five_six_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  five_a1 = 2
+  five_a2 = -3
+  six_a1 = 3
+  six_a2 = 20
+  allocate(var(1:10, five_a1 * 1 - five_a2:six_a2 - six_a1 * 1), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do five_six_outer = 1, 10
+    do five_six_inner = five_a1 * five_six_outer - five_a2, six_a2 - six_a1 * five_six_outer
+      !$omp atomic update
+      var(five_six_outer, five_six_inner) = var(five_six_outer, five_six_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = five_a1 * i - five_a2, six_a2 - six_a1 * i
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(seven_eight_inner = seven_eight_outer \\* seven_a1 \\+ 0; seven_eight_inner <= seven_eight_outer \\* eight_a1 \\+ eight_a2; seven_eight_inner = seven_eight_inner \\+ 1\\)" 1 original } }
+
+!  (7)  var-outer * a1
+!  (8)  var-outer * a1 + a2
+subroutine seven_eight()
+  implicit none
+  integer :: seven_a1
+  integer :: eight_a1, eight_a2
+  integer :: seven_eight_outer, seven_eight_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  seven_a1 = 3
+  eight_a1 = 2
+  eight_a2 = -4
+  allocate(var(1:10, 1 * seven_a1 : 10 * eight_a1 + eight_a2), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do seven_eight_outer = 1, 10
+    do seven_eight_inner = seven_eight_outer * seven_a1, seven_eight_outer * eight_a1 + eight_a2
+      !$omp atomic update
+      var(seven_eight_outer, seven_eight_inner) = var(seven_eight_outer, seven_eight_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = i * seven_a1, i * eight_a1 + eight_a2
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(nine_ten_inner = nine_ten_outer \\* nine_a1 \\+ nine_a2; nine_ten_inner <= nine_ten_outer \\* ten_a1 \\+ D\\.\[0-9\]+; nine_ten_inner = nine_ten_inner \\+ 1\\)" 1 original } }
+
+!  (9)  a2 + var-outer * a1
+! (10)  var-outer * a1 - a2
+subroutine nine_ten()
+  implicit none
+  integer :: nine_a1, nine_a2
+  integer :: ten_a1, ten_a2
+  integer :: nine_ten_outer, nine_ten_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  nine_a1 = 3
+  nine_a2 = 5
+  ten_a1 = 2
+  ten_a2 = 3
+  allocate(var(1:10, nine_a2 + 1 * nine_a1:10 * ten_a1 - ten_a2), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do nine_ten_outer = 1, 10
+    do nine_ten_inner = nine_a2 + nine_ten_outer * nine_a1, nine_ten_outer * ten_a1 - ten_a2
+      !$omp atomic update
+      var(nine_ten_outer, nine_ten_inner) = var(nine_ten_outer, nine_ten_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = nine_a2 + i * nine_a1, i * ten_a1 - ten_a2
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+
+
+! { dg-final { scan-tree-dump-times "for \\(eleven_inner = eleven_outer \\* D\\.\[0-9\]+ \\+ eleven_a2; eleven_inner <= 10; eleven_inner = eleven_inner \\+ 1\\)" 1 original } }
+
+! (11)  a2 - var-outer * a1
+
+subroutine eleven()
+  implicit none
+  integer :: eleven_a1, eleven_a2
+  integer :: eleven_outer, eleven_inner
+  integer :: i, j
+  integer, allocatable :: var(:,:)
+
+  eleven_a1 = 2
+  eleven_a2 = 3
+  allocate(var(1:10, eleven_a2 - 10 * eleven_a1 : 10), &
+           source=0)
+  if (size(var) <= 4) error stop
+
+  !$omp simd collapse(2)
+  do eleven_outer = 1, 10
+    do eleven_inner = eleven_a2 - eleven_outer * eleven_a1, 10
+      !$omp atomic update
+      var(eleven_outer, eleven_inner) = var(eleven_outer, eleven_inner) + 2
+    end do
+  end do
+
+  do i = 1, 10
+    do j = eleven_a2 - i * eleven_a1, 10
+      if (var(i,j) /= 2) error stop
+    end do
+  end do
+end
+end module m
+
+program main
+use m
+implicit none
+call one_two()
+call three_four()
+call five_six()
+call seven_eight()
+call nine_ten()
+call eleven()
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-3.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-3.f90
new file mode 100644
index 00000000000..c97cd99f71c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-3.f90
@@ -0,0 +1,212 @@
+! { dg-additional-options "-fdump-tree-original" }
+! PR fortran/107424
+
+module m
+contains
+subroutine foo (av, avo, a0, a0o, a1, a2, a3, a4)
+implicit none
+
+integer, value :: av
+integer, value, optional :: avo
+integer :: a0
+integer, optional :: a0o
+integer, pointer :: a1
+integer, pointer, optional :: a2
+integer, allocatable :: a3
+integer, allocatable, optional :: a4
+integer :: a5
+integer, pointer :: a6
+integer, allocatable :: a7
+integer :: arr(20,10), ref(20,10)
+
+integer :: j, i
+
+allocate(a6, a7)
+
+ref = 44
+do i = 1, 10
+  do j = i, 20
+    ref(j, i) = j + 100 * i
+  end do
+end do
+
+! { dg-final { scan-tree-dump-times "for \\(av = 1; av <= 10; av = av \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = av \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! -> no temp var
+arr = 44
+av = 99; j = 99
+!$omp simd collapse(2) lastprivate(av,j)
+do av = 1, 10
+  do j = av, 20
+    arr(j, av) = j + 100 * av
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (av /= 11 .or. j /= 21) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(avo = 1; avo <= 10; avo = avo \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = avo \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! -> no temp var
+arr = 44
+avo = 99; j = 99
+!$omp simd collapse(2) lastprivate(avo, j)
+do avo = 1, 10
+  do j = avo, 20
+    arr(j, avo) = j + 100 * avo
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (avo /= 11 .or. j /= 21) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a0\\.\[0-9\]+ = 1; a0\\.\[0-9\]+ <= 10; a0\\.\[0-9\]+ = a0\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a0\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a0 = a0\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a0 = 99; j = 99
+!$omp simd collapse(2) lastprivate(a0,j)
+do a0 = 1, 10
+  do j = a0, 20
+    arr(j, a0) = j + 100 * a0
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (a0 /= 11 .or. j /= 21) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a0o\\.\[0-9\]+ = 1; a0o\\.\[0-9\]+ <= 10; a0o\\.\[0-9\]+ = a0o\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a0o\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a0o = a0o\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a0o = 99; j = 99
+!$omp simd collapse(2) lastprivate(a0o,j)
+do a0o = 1, 10
+  do j = a0o, 20
+    arr(j, a0o) = j + 100 * a0o
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (a0o /= 11 .or. j /= 21) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a1\\.\[0-9\]+ = 1; a1\\.\[0-9\]+ <= 10; a1\\.\[0-9\]+ = a1\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a1\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a1 = a1\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a1 = 99; j = 99
+! no last private for 'a1' as "The initial status of a private pointer is undefined."
+!$omp simd collapse(2) lastprivate(j)
+do a1 = 1, 10
+  do j = a1, 20
+    arr(j, a1) = j + 100 * a1
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (j /= 21) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a2\\.\[0-9\]+ = 1; a2\\.\[0-9\]+ <= 10; a2\\.\[0-9\]+ = a2\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a2\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a2 = a2\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a2 = 99; j = 99
+! no last private for 'a2' as "The initial status of a private pointer is undefined."
+!$omp simd collapse(2) lastprivate(j)
+do a2 = 1, 10
+  do j = a2, 20
+    arr(j, a2) = j + 100 * a2
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (j /= 21) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a3\\.\[0-9\]+ = 1; a3\\.\[0-9\]+ <= 10; a3\\.\[0-9\]+ = a3\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a3\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a3 = a3\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a3 = 99; j = 99
+!$omp simd collapse(2) lastprivate(a3,j)
+do a3 = 1, 10
+  do j = a3, 20
+    arr(j, a3) = j + 100 * a3
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (a3 /= 11 .or. j /= 21) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a4\\.\[0-9\]+ = 1; a4\\.\[0-9\]+ <= 10; a4\\.\[0-9\]+ = a4\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a4\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a4 = a4\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a4 = 99; j = 99
+!$omp simd collapse(2) lastprivate(a4,j)
+do a4 = 1, 10
+  do j = a4, 20
+    arr(j, a4) = j + 100 * a4
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (a4 /= 11 .or. j /= 21) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a5 = 1; a5 <= 10; a5 = a5 \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a5 \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! -> no temp var
+arr = 44
+a5 = 99; j = 99
+!$omp simd collapse(2) lastprivate(a5,j)
+do a5 = 1, 10
+  do j = a5, 20
+    arr(j, a5) = j + 100 * a5
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (a5 /= 11 .or. j /= 21) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a6\\.\[0-9\]+ = 1; a6\\.\[0-9\]+ <= 10; a6\\.\[0-9\]+ = a6\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a6\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a6 = a6\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a6 = 99; j = 99
+! no last private for 'a6' as "The initial status of a private pointer is undefined."
+!$omp simd collapse(2) lastprivate(j)
+do a6 = 1, 10
+  do j = a6, 20
+    arr(j, a6) = j + 100 * a6
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (j /= 21) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a7\\.\[0-9\]+ = 1; a7\\.\[0-9\]+ <= 10; a7\\.\[0-9\]+ = a7\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = a7\\.\[0-9\]+ \\* 1 \\+ 0; j <= 20; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a7 = a7\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a7 = 99; j = 99
+!$omp simd collapse(2) lastprivate(a7,j)
+do a7 = 1, 10
+  do j = a7, 20
+    arr(j, a7) = j + 100 * a7
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (a7 /= 11 .or. j /= 21) error stop
+
+deallocate(a6, a7)
+end
+
+end module m
+
+
+use m
+implicit none
+
+integer :: av
+integer :: avo
+integer :: a0
+integer :: a0o
+integer, pointer :: a1
+integer, pointer :: a2
+integer, allocatable :: a3
+integer, allocatable :: a4
+
+av = -99; avo = -99
+allocate(a1,a2,a3,a4)
+call foo (av, avo, a0, a0o, a1, a2, a3, a4)
+deallocate(a1,a2,a3,a4)
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-4.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-4.f90
new file mode 100644
index 00000000000..ef2bd61f180
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-4.f90
@@ -0,0 +1,215 @@
+! { dg-additional-options "-fdump-tree-original" }
+! PR fortran/107424
+
+! Same as non-rectangular-loop-4.f90 but expr in upper bound 
+
+module m
+contains
+subroutine foo (av, avo, a0, a0o, a1, a2, a3, a4)
+implicit none
+
+integer, value :: av
+integer, value, optional :: avo
+integer :: a0
+integer, optional :: a0o
+integer, pointer :: a1
+integer, pointer, optional :: a2
+integer, allocatable :: a3
+integer, allocatable, optional :: a4
+integer :: a5
+integer, pointer :: a6
+integer, allocatable :: a7
+integer :: arr(20,10), ref(20,10)
+
+integer :: j, i, lp_i, lp_j
+
+allocate(a6, a7)
+
+ref = 44
+do i = 1, 10
+  do j = 1, i*2-1
+    ref(j, i) = j + 100 * i
+  end do
+end do
+lp_i = i; lp_j = j
+
+! { dg-final { scan-tree-dump-times "for \\(av = 1; av <= 10; av = av \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= av \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! -> no temp var
+arr = 44
+av = 99; j = 99
+!$omp simd collapse(2) lastprivate(av,j)
+do av = 1, 10
+  do j = 1, av*2-1
+    arr(j, av) = j + 100 * av
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (av /= lp_i .or. j /= lp_j) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(avo = 1; avo <= 10; avo = avo \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= avo \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! -> no temp var
+arr = 44
+avo = 99; j = 99
+!$omp simd collapse(2) lastprivate(avo, j)
+do avo = 1, 10
+  do j = 1, avo*2-1
+    arr(j, avo) = j + 100 * avo
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (avo /= lp_i .or. j /= lp_j) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a0\\.\[0-9\]+ = 1; a0\\.\[0-9\]+ <= 10; a0\\.\[0-9\]+ = a0\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a0\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a0 = a0\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a0 = 99; j = 99
+!$omp simd collapse(2) lastprivate(a0,j)
+do a0 = 1, 10
+  do j = 1, a0*2-1
+    arr(j, a0) = j + 100 * a0
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (a0 /= lp_i .or. j /= lp_j) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a0o\\.\[0-9\]+ = 1; a0o\\.\[0-9\]+ <= 10; a0o\\.\[0-9\]+ = a0o\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a0o\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a0o = a0o\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a0o = 99; j = 99
+!$omp simd collapse(2) lastprivate(a0o,j)
+do a0o = 1, 10
+  do j = 1, a0o*2-1
+    arr(j, a0o) = j + 100 * a0o
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (a0o /= lp_i .or. j /= lp_j) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a1\\.\[0-9\]+ = 1; a1\\.\[0-9\]+ <= 10; a1\\.\[0-9\]+ = a1\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a1\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a1 = a1\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a1 = 99; j = 99
+! no last private for 'a1' as "The initial status of a private pointer is undefined."
+!$omp simd collapse(2) lastprivate(j)
+do a1 = 1, 10
+  do j = 1, a1*2-1
+    arr(j, a1) = j + 100 * a1
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (j /= lp_j) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a2\\.\[0-9\]+ = 1; a2\\.\[0-9\]+ <= 10; a2\\.\[0-9\]+ = a2\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a2\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a2 = a2\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a2 = 99; j = 99
+! no last private for 'a2' as "The initial status of a private pointer is undefined."
+!$omp simd collapse(2) lastprivate(j)
+do a2 = 1, 10
+  do j = 1, a2*2-1
+    arr(j, a2) = j + 100 * a2
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (j /= lp_j) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a3\\.\[0-9\]+ = 1; a3\\.\[0-9\]+ <= 10; a3\\.\[0-9\]+ = a3\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a3\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a3 = a3\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a3 = 99; j = 99
+!$omp simd collapse(2) lastprivate(a3,j)
+do a3 = 1, 10
+  do j = 1, a3*2-1
+    arr(j, a3) = j + 100 * a3
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (a3 /= lp_i .or. j /= lp_j) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a4\\.\[0-9\]+ = 1; a4\\.\[0-9\]+ <= 10; a4\\.\[0-9\]+ = a4\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a4\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*\\*a4 = a4\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a4 = 99; j = 99
+!$omp simd collapse(2) lastprivate(a4,j)
+do a4 = 1, 10
+  do j = 1, a4*2-1
+    arr(j, a4) = j + 100 * a4
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (a4 /= lp_i .or. j /= lp_j) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a5 = 1; a5 <= 10; a5 = a5 \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a5 \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! -> no temp var
+arr = 44
+a5 = 99; j = 99
+!$omp simd collapse(2) lastprivate(a5,j)
+do a5 = 1, 10
+  do j = 1, a5*2-1
+    arr(j, a5) = j + 100 * a5
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (a5 /= lp_i .or. j /= lp_j) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a6\\.\[0-9\]+ = 1; a6\\.\[0-9\]+ <= 10; a6\\.\[0-9\]+ = a6\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a6\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a6 = a6\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a6 = 99; j = 99
+! no last private for 'a6' as "The initial status of a private pointer is undefined."
+!$omp simd collapse(2) lastprivate(j)
+do a6 = 1, 10
+  do j = 1, a6*2-1
+    arr(j, a6) = j + 100 * a6
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (j /= lp_j) error stop
+
+! { dg-final { scan-tree-dump-times "for \\(a7\\.\[0-9\]+ = 1; a7\\.\[0-9\]+ <= 10; a7\\.\[0-9\]+ = a7\\.\[0-9\]+ \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "for \\(j = 1; j <= a7\\.\[0-9\]+ \\* 2 \\+ -1; j = j \\+ 1\\)" 1 "original" } }
+! { dg-final { scan-tree-dump-times "\\*a7 = a7\\.\[0-9\]+;" 1 "original" } }
+arr = 44
+a7 = 99; j = 99
+!$omp simd collapse(2) lastprivate(a7,j)
+do a7 = 1, 10
+  do j = 1, a7*2-1
+    arr(j, a7) = j + 100 * a7
+  end do
+end do
+if (any (ref /= arr)) error stop
+if (a7 /= lp_i .or. j /= lp_j) error stop
+
+deallocate(a6, a7)
+end
+
+end module m
+
+
+use m
+implicit none
+
+integer :: av
+integer :: avo
+integer :: a0
+integer :: a0o
+integer, pointer :: a1
+integer, pointer :: a2
+integer, allocatable :: a3
+integer, allocatable :: a4
+
+av = -99; avo = -99
+allocate(a1,a2,a3,a4)
+call foo (av, avo, a0, a0o, a1, a2, a3, a4)
+deallocate(a1,a2,a3,a4)
+end
diff --git a/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-5.f90 b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-5.f90
new file mode 100644
index 00000000000..643ab796a84
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/non-rectangular-loop-5.f90
@@ -0,0 +1,28 @@
+! { dg-do compile }
+! { dg-additional-options "-msse2" { target sse2_runtime } }
+! { dg-additional-options "-mavx" { target avx_runtime } }
+
+! PR fortran/107424
+
+! Nonrectangular loop nests checks
+
+!$omp simd collapse(2)
+do i = 1, 10
+  do j = i, 10, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+  end do
+end do
+
+!$omp do collapse(2) lastprivate(j)  ! { dg-error "lastprivate variable 'j' is private in outer context" }
+do i = 1, 10
+  do j = i, 10, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+  end do
+end do
+if (i /= 11) stop 1
+
+!$omp simd collapse(2) lastprivate(j)
+do i = 1, 10
+  do j = i, 10, 2  ! { dg-message "sorry, unimplemented: non-rectangular loop nest with step other than constant 1 or -1 for 'j'" }
+  end do
+end do
+if (i /= 11) stop 1
+end

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Patch][v2] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
  2023-02-09 14:46       ` Tobias Burnus
@ 2023-02-09 14:49         ` Jakub Jelinek
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Jelinek @ 2023-02-09 14:49 UTC (permalink / raw)
  To: Tobias Burnus; +Cc: gcc-patches, fortran

On Thu, Feb 09, 2023 at 03:46:35PM +0100, Tobias Burnus wrote:
> I think the test coverage should be sufficient. Any further test idea?
> Otherwise, I would commit it now.

LGTM, thanks.

	Jakub


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-02-09 14:49 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-19 14:40 [Patch] OpenMP/Fortran: Partially fix non-rect loop nests [PR107424] Tobias Burnus
2023-01-20 17:39 ` Jakub Jelinek
2023-01-20 18:00   ` Jakub Jelinek
2023-01-20 20:02     ` Jakub Jelinek
2023-01-25 14:47   ` [Patch][v2] " Tobias Burnus
2023-01-31 11:37     ` Jakub Jelinek
2023-02-09 14:46       ` Tobias Burnus
2023-02-09 14:49         ` Jakub Jelinek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).