* Re: [committed, gomp4] Handle sequential code in kernels region patch series
2015-10-12 17:13 [committed, gomp4] Handle sequential code in kernels region patch series Tom de Vries
@ 2015-10-12 17:19 ` Tom de Vries
2015-10-12 17:27 ` [committed, gomp4, 2/3] Handle sequential code in kernels region Tom de Vries
2015-10-12 17:29 ` [committed, gomp4, 3/3] Handle sequential code in kernels region - Testcases Tom de Vries
2 siblings, 0 replies; 5+ messages in thread
From: Tom de Vries @ 2015-10-12 17:19 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 729 bytes --]
On 12/10/15 19:12, Tom de Vries wrote:
> Hi,
>
> I've committed the following patch series.
>
> 1 Add get_bbs_in_oacc_kernels_region
> 2 Handle sequential code in kernels region
> 3 Handle sequential code in kernels region - Testcases
>
> The patch series adds detection of whether sequential code (that is,
> code in the oacc kernels region before and after the loop that is to be
> parallelized), is safe to execute in parallel.
>
> Bootstrapped and reg-tested on x86_64.
>
> I'll post the patches individually, in reply to this email.
This patch adds an oacc kernels infrastructure function:
extern vec<basic_block> get_bbs_in_oacc_kernels_region (basic_block,
basic_block);
Thanks,
- Tom
[-- Attachment #2: 0001-Add-get_bbs_in_oacc_kernels_region.patch --]
[-- Type: text/x-patch, Size: 2452 bytes --]
Add get_bbs_in_oacc_kernels_region
2015-10-12 Tom de Vries <tom@codesourcery.com>
* omp-low.c (get_bbs_in_oacc_kernels_region): New function.
* omp-low.h (get_bbs_in_oacc_kernels_region): Declare.
---
gcc/omp-low.c | 40 ++++++++++++++++++++++++++++++++++++++++
gcc/omp-low.h | 2 ++
2 files changed, 42 insertions(+)
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 2289486..f6e0247 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -9959,6 +9959,46 @@ mark_loops_in_oacc_kernels_region (basic_block region_entry,
loop->in_oacc_kernels_region = true;
}
+/* Return blocks in oacc kernels region delimited by REGION_ENTRY and
+ REGION_EXIT. */
+
+vec<basic_block>
+get_bbs_in_oacc_kernels_region (basic_block region_entry,
+ basic_block region_exit)
+{
+ bitmap excludes_bitmap = BITMAP_GGC_ALLOC ();
+ unsigned di;
+ basic_block bb;
+
+ bitmap_clear (excludes_bitmap);
+
+ /* Get all the blocks dominated by the region entry. That will include the
+ entire region. */
+ vec<basic_block> dominated
+ = get_all_dominated_blocks (CDI_DOMINATORS, region_entry);
+
+ bitmap_set_bit (excludes_bitmap, region_entry->index);
+
+ /* Exclude all the blocks which are not in the region: the blocks dominated by
+ the region exit. */
+ if (region_exit != NULL)
+ {
+ vec<basic_block> excludes
+ = get_all_dominated_blocks (CDI_DOMINATORS, region_exit);
+ FOR_EACH_VEC_ELT (excludes, di, bb)
+ bitmap_set_bit (excludes_bitmap, bb->index);
+ bitmap_clear_bit (excludes_bitmap, region_exit->index);
+ }
+
+ vec<basic_block> bbs = vNULL;
+
+ FOR_EACH_VEC_ELT (dominated, di, bb)
+ if (!bitmap_bit_p (excludes_bitmap, bb->index))
+ bbs.safe_push (bb);
+
+ return bbs;
+}
+
/* Return the entry basic block of the oacc kernels region containing LOOP. */
basic_block
diff --git a/gcc/omp-low.h b/gcc/omp-low.h
index 62a7d4a..9f09bbc 100644
--- a/gcc/omp-low.h
+++ b/gcc/omp-low.h
@@ -34,6 +34,8 @@ extern tree get_omp_data_i (basic_block);
extern bool oacc_kernels_region_entry_p (basic_block, gomp_target **);
extern basic_block get_oacc_kernels_region_exit (basic_block);
extern basic_block loop_get_oacc_kernels_region_entry (struct loop *);
+extern vec<basic_block> get_bbs_in_oacc_kernels_region (basic_block,
+ basic_block);
extern void replace_oacc_fn_attrib (tree, tree);
extern tree build_oacc_routine_dims (tree);
extern tree get_oacc_fn_attrib (tree);
--
1.9.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* [committed, gomp4, 2/3] Handle sequential code in kernels region
2015-10-12 17:13 [committed, gomp4] Handle sequential code in kernels region patch series Tom de Vries
2015-10-12 17:19 ` Tom de Vries
@ 2015-10-12 17:27 ` Tom de Vries
2015-11-03 9:02 ` [gomp4, committed] Remove shadowing declaration in oacc_entry_exit_ok_1 Tom de Vries
2015-10-12 17:29 ` [committed, gomp4, 3/3] Handle sequential code in kernels region - Testcases Tom de Vries
2 siblings, 1 reply; 5+ messages in thread
From: Tom de Vries @ 2015-10-12 17:27 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 854 bytes --]
On 12/10/15 19:12, Tom de Vries wrote:
> Hi,
>
> I've committed the following patch series.
>
> 1 Add get_bbs_in_oacc_kernels_region
> 2 Handle sequential code in kernels region
> 3 Handle sequential code in kernels region - Testcases
>
> The patch series adds detection of whether sequential code (that is,
> code in the oacc kernels region before and after the loop that is to be
> parallelized), is safe to execute in parallel.
>
> Bootstrapped and reg-tested on x86_64.
>
> I'll post the patches individually, in reply to this email.
This patch checks in parloops, for each non-loop stmt in the oacc
kernels region, that it's not a load aliasing with a store anywhere in
the region, and vice versa.
An exception are loads and stores for reductions, which are later-on
transformed into an atomic update.
Thanks,
- Tom
[-- Attachment #2: 0002-Handle-sequential-code-in-kernels-region.patch --]
[-- Type: text/x-patch, Size: 8243 bytes --]
Handle sequential code in kernels region
2015-10-12 Tom de Vries <tom@codesourcery.com>
* omp-low.c (lower_omp_for): Don't call lower_oacc_head_tail for oacc
kernels regions.
* tree-parloops.c (try_create_reduction_list): Initialize keep_res
field.
(dead_load_p, ref_conflicts_with_region, oacc_entry_exit_ok_1)
(oacc_entry_exit_ok): New function.
(parallelize_loops): Call oacc_entry_exit_ok.
---
gcc/omp-low.c | 3 +-
gcc/tree-parloops.c | 245 ++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 247 insertions(+), 1 deletion(-)
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index f6e0247..e700dd1 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -11949,7 +11949,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
/* Once lowered, extract the bounds and clauses. */
extract_omp_for_data (stmt, &fd, NULL);
- if (is_gimple_omp_oacc (ctx->stmt))
+ if (is_gimple_omp_oacc (ctx->stmt)
+ && !ctx_in_oacc_kernels_region (ctx))
lower_oacc_head_tail (gimple_location (stmt),
gimple_omp_for_clauses (stmt),
&oacc_head, &oacc_tail, ctx);
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 4b67793..d4eb32a 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -58,6 +58,8 @@ along with GCC; see the file COPYING3. If not see
#include "cgraph.h"
#include "tree-ssa.h"
#include "params.h"
+#include "tree-ssa-alias.h"
+#include "tree-eh.h"
/* This pass tries to distribute iterations of loops into several threads.
The implementation is straightforward -- for each loop we test whether its
@@ -2672,6 +2674,7 @@ try_create_reduction_list (loop_p loop,
" FAILED: it is not a part of reduction.\n");
return false;
}
+ red->keep_res = phi;
if (dump_file && (dump_flags & TDF_DETAILS))
{
fprintf (dump_file, "reduction phi is ");
@@ -2764,6 +2767,240 @@ try_create_reduction_list (loop_p loop,
return true;
}
+/* Return true if STMT is a load of which the result is unused, and can be
+ safely deleted. */
+
+static bool
+dead_load_p (gimple *stmt)
+{
+ if (!gimple_assign_load_p (stmt))
+ return false;
+
+ tree lhs = gimple_assign_lhs (stmt);
+ return (TREE_CODE (lhs) == SSA_NAME
+ && has_zero_uses (lhs)
+ && !gimple_has_side_effects (stmt)
+ && !stmt_could_throw_p (stmt));
+}
+
+static bool
+ref_conflicts_with_region (gimple_stmt_iterator gsi, ao_ref *ref,
+ bool ref_is_store, vec<basic_block> region_bbs,
+ unsigned int i, gimple *skip_stmt)
+{
+ basic_block bb = region_bbs[i];
+ gsi_next (&gsi);
+
+ while (true)
+ {
+ for (; !gsi_end_p (gsi);
+ gsi_next (&gsi))
+ {
+ gimple *stmt = gsi_stmt (gsi);
+ if (stmt == skip_stmt)
+ {
+ if (dump_file)
+ {
+ fprintf (dump_file, "skipping reduction store: ");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+ }
+ continue;
+ }
+
+ if (!gimple_vdef (stmt)
+ && !gimple_vuse (stmt))
+ continue;
+
+ if (ref_is_store)
+ {
+ if (dead_load_p (stmt))
+ {
+ if (dump_file)
+ {
+ fprintf (dump_file, "skipping dead load: ");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+ }
+ continue;
+ }
+
+ if (ref_maybe_used_by_stmt_p (stmt, ref))
+ {
+ if (dump_file)
+ {
+ fprintf (dump_file, "Stmt ");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+ }
+ return true;
+ }
+ }
+ else
+ {
+ if (stmt_may_clobber_ref_p_1 (stmt, ref))
+ {
+ if (dump_file)
+ {
+ fprintf (dump_file, "Stmt ");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+ }
+ return true;
+ }
+ }
+ }
+ i++;
+ if (i == region_bbs.length ())
+ break;
+ bb = region_bbs[i];
+ gsi = gsi_start_bb (bb);
+ }
+
+ return false;
+}
+
+static bool
+oacc_entry_exit_ok_1 (bitmap in_loop_bbs, vec<basic_block> region_bbs,
+ tree omp_data_i,
+ reduction_info_table_type *reduction_list)
+{
+ unsigned i;
+ basic_block bb;
+ FOR_EACH_VEC_ELT (region_bbs, i, bb)
+ {
+ if (bitmap_bit_p (in_loop_bbs, bb->index))
+ continue;
+
+ gimple_stmt_iterator gsi;
+ for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);
+ gsi_next (&gsi))
+ {
+ gimple *stmt = gsi_stmt (gsi);
+ gimple *skip_stmt = NULL;
+
+ if (is_gimple_debug (stmt)
+ || gimple_code (stmt) == GIMPLE_COND)
+ continue;
+
+ ao_ref ref;
+ bool ref_is_store = false;
+ if (gimple_assign_load_p (stmt))
+ {
+ tree rhs = gimple_assign_rhs1 (stmt);
+ tree base = get_base_address (rhs);
+ if (TREE_CODE (base) == MEM_REF
+ && operand_equal_p (TREE_OPERAND (base, 0), omp_data_i, 0))
+ continue;
+
+ /* By testing for dead loads (here and in
+ ref_conflicts_with_region), we avoid having to run pass_dce
+ before pass_parallelize_loops_oacc_kernels. */
+ if (dead_load_p (stmt))
+ {
+ if (dump_file)
+ {
+ fprintf (dump_file, "skipping dead load: ");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+ }
+ continue;
+ }
+
+ tree lhs = gimple_assign_lhs (stmt);
+ if (TREE_CODE (lhs) == SSA_NAME
+ && has_single_use (lhs))
+ {
+ use_operand_p use_p;
+ gimple *use_stmt;
+ single_imm_use (lhs, &use_p, &use_stmt);
+ if (gimple_code (use_stmt) == GIMPLE_PHI)
+ {
+ struct reduction_info *red;
+ red = reduction_phi (reduction_list, use_stmt);
+ tree val = PHI_RESULT (red->keep_res);
+ if (has_single_use (val))
+ {
+ single_imm_use (val, &use_p, &use_stmt);
+ if (gimple_store_p (use_stmt))
+ {
+ skip_stmt = use_stmt;
+ if (dump_file)
+ {
+ fprintf (dump_file, "found reduction load: ");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+ }
+ }
+ }
+ }
+ }
+
+ ao_ref_init (&ref, rhs);
+ }
+ else if (gimple_store_p (stmt))
+ {
+ ao_ref ref;
+ ao_ref_init (&ref, gimple_assign_lhs (stmt));
+ ref_is_store = true;
+ }
+ else if (gimple_code (stmt) == GIMPLE_OMP_RETURN)
+ continue;
+ else if (gimple_stmt_omp_data_i_init_p (stmt))
+ continue;
+ else if (!gimple_has_side_effects (stmt)
+ && !gimple_could_trap_p (stmt)
+ && !stmt_could_throw_p (stmt)
+ && !gimple_vdef (stmt)
+ && !gimple_vuse (stmt))
+ continue;
+ else
+ {
+ if (dump_file)
+ {
+ fprintf (dump_file, "Unhandled stmt in entry/exit: ");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+ }
+ return false;
+ }
+
+ if (ref_conflicts_with_region (gsi, &ref, ref_is_store, region_bbs,
+ i, skip_stmt))
+ {
+ if (dump_file)
+ {
+ fprintf (dump_file, "conflicts with entry/exit stmt: ");
+ print_gimple_stmt (dump_file, stmt, 0, 0);
+ }
+ return false;
+ }
+ }
+ }
+
+ return true;
+}
+
+static bool
+oacc_entry_exit_ok (struct loop *loop, basic_block region_entry,
+ reduction_info_table_type *reduction_list)
+{
+ basic_block *loop_bbs = get_loop_body_in_dom_order (loop);
+ basic_block region_exit
+ = get_oacc_kernels_region_exit (single_succ (region_entry));
+ vec<basic_block> region_bbs
+ = get_bbs_in_oacc_kernels_region (region_entry, region_exit);
+ tree omp_data_i = get_omp_data_i (region_entry);
+ gcc_assert (omp_data_i != NULL_TREE);
+
+ bitmap in_loop_bbs = BITMAP_ALLOC (NULL);
+ bitmap_clear (in_loop_bbs);
+ for (unsigned int i = 0; i < loop->num_nodes; i++)
+ bitmap_set_bit (in_loop_bbs, loop_bbs[i]->index);
+
+ bool res = oacc_entry_exit_ok_1 (in_loop_bbs, region_bbs, omp_data_i,
+ reduction_list);
+
+ free (loop_bbs);
+
+ BITMAP_FREE (in_loop_bbs);
+
+ return res;
+}
+
/* Detect parallel loops and generate parallel code using libgomp
primitives. Returns true if some loop was parallelized, false
otherwise. */
@@ -2901,6 +3138,14 @@ parallelize_loops (bool oacc_kernels_p)
continue;
}
+ if (oacc_kernels_p
+ && !oacc_entry_exit_ok (loop, region_entry, &reduction_list))
+ {
+ if (dump_file)
+ fprintf (dump_file, "entry/exit not ok: FAILED\n");
+ continue;
+ }
+
changed = true;
/* Skip inner loop(s) of parallelized loop. */
skip_loop = loop->inner;
--
1.9.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* [committed, gomp4, 3/3] Handle sequential code in kernels region - Testcases
2015-10-12 17:13 [committed, gomp4] Handle sequential code in kernels region patch series Tom de Vries
2015-10-12 17:19 ` Tom de Vries
2015-10-12 17:27 ` [committed, gomp4, 2/3] Handle sequential code in kernels region Tom de Vries
@ 2015-10-12 17:29 ` Tom de Vries
2 siblings, 0 replies; 5+ messages in thread
From: Tom de Vries @ 2015-10-12 17:29 UTC (permalink / raw)
To: gcc-patches; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 618 bytes --]
On 12/10/15 19:12, Tom de Vries wrote:
> Hi,
>
> I've committed the following patch series.
>
> 1 Add get_bbs_in_oacc_kernels_region
> 2 Handle sequential code in kernels region
> 3 Handle sequential code in kernels region - Testcases
>
> The patch series adds detection of whether sequential code (that is,
> code in the oacc kernels region before and after the loop that is to be
> parallelized), is safe to execute in parallel.
>
> Bootstrapped and reg-tested on x86_64.
>
> I'll post the patches individually, in reply to this email.
This patch adds relevant test-cases.
Thanks,
- Tom
[-- Attachment #2: 0003-Handle-sequential-code-in-kernels-region-Testcases.patch --]
[-- Type: text/x-patch, Size: 6415 bytes --]
Handle sequential code in kernels region - Testcases
2015-10-12 Tom de Vries <tom@codesourcery.com>
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c: New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c: New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c: New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c: New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c: New test.
---
.../kernels-loop-and-seq-2.c | 36 +++++++++++++++++++++
.../kernels-loop-and-seq-3.c | 37 ++++++++++++++++++++++
.../kernels-loop-and-seq-4.c | 36 +++++++++++++++++++++
.../kernels-loop-and-seq-5.c | 37 ++++++++++++++++++++++
.../kernels-loop-and-seq-6.c | 36 +++++++++++++++++++++
.../kernels-loop-and-seq.c | 37 ++++++++++++++++++++++
6 files changed, 219 insertions(+)
create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c
create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c
create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c
create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c
new file mode 100644
index 0000000..2e4100f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=32" } */
+
+#include <stdlib.h>
+
+#define N 32
+
+unsigned int
+foo (int n, unsigned int *a)
+{
+#pragma acc kernels copy (a[0:N])
+ {
+ a[0] = a[0] + 1;
+
+ for (int i = 0; i < n; i++)
+ a[i] = 1;
+ }
+
+ return a[0];
+}
+
+int
+main (void)
+{
+ unsigned int a[N];
+ unsigned res, i;
+
+ for (i = 0; i < N; ++i)
+ a[i] = i % 4;
+
+ res = foo (N, a);
+ if (res != 1)
+ abort ();
+
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
new file mode 100644
index 0000000..b3e736b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
@@ -0,0 +1,37 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=32" } */
+
+#include <stdlib.h>
+
+#define N 32
+
+unsigned int
+foo (int n, unsigned int *a)
+{
+
+#pragma acc kernels copy (a[0:N])
+ {
+ for (int i = 0; i < n; i++)
+ a[i] = 1;
+
+ a[0] = 2;
+ }
+
+ return a[0];
+}
+
+int
+main (void)
+{
+ unsigned int a[N];
+ unsigned res, i;
+
+ for (i = 0; i < N; ++i)
+ a[i] = i % 4;
+
+ res = foo (N, a);
+ if (res != 2)
+ abort ();
+
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
new file mode 100644
index 0000000..8b9affa
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=32" } */
+
+#include <stdlib.h>
+
+#define N 32
+
+unsigned int
+foo (int n, unsigned int *a)
+{
+#pragma acc kernels copy (a[0:N])
+ {
+ a[0] = 2;
+
+ for (int i = 0; i < n; i++)
+ a[i] = 1;
+ }
+
+ return a[0];
+}
+
+int
+main (void)
+{
+ unsigned int a[N];
+ unsigned res, i;
+
+ for (i = 0; i < N; ++i)
+ a[i] = i % 4;
+
+ res = foo (N, a);
+ if (res != 1)
+ abort ();
+
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c
new file mode 100644
index 0000000..83d4e7f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c
@@ -0,0 +1,37 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=32" } */
+
+#include <stdlib.h>
+
+#define N 32
+
+unsigned int
+foo (int n, unsigned int *a)
+{
+ int r;
+#pragma acc kernels copyout(r) copy (a[0:N])
+ {
+ r = a[0];
+
+ for (int i = 0; i < n; i++)
+ a[i] = 1;
+ }
+
+ return r;
+}
+
+int
+main (void)
+{
+ unsigned int a[N];
+ unsigned res, i;
+
+ for (i = 0; i < N; ++i)
+ a[i] = i % 4;
+
+ res = foo (N, a);
+ if (res != 0)
+ abort ();
+
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c
new file mode 100644
index 0000000..01d5e5e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c
@@ -0,0 +1,36 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=32" } */
+
+#include <stdlib.h>
+
+#define N 32
+
+unsigned int
+foo (int n, unsigned int *a)
+{
+#pragma acc kernels copy (a[0:N])
+ {
+ int r = a[0];
+
+ for (int i = 0; i < n; i++)
+ a[i] = 1 + r;
+ }
+
+ return a[0];
+}
+
+int
+main (void)
+{
+ unsigned int a[N];
+ unsigned res, i;
+
+ for (i = 0; i < N; ++i)
+ a[i] = i % 4;
+
+ res = foo (N, a);
+ if (res != 1)
+ abort ();
+
+ return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c
new file mode 100644
index 0000000..61d1283
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c
@@ -0,0 +1,37 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=32" } */
+
+#include <stdlib.h>
+
+#define N 32
+
+unsigned int
+foo (int n, unsigned int *a)
+{
+
+#pragma acc kernels copy (a[0:N])
+ {
+ for (int i = 0; i < n; i++)
+ a[i] = 1;
+
+ a[0] = a[0] + 1;
+ }
+
+ return a[0];
+}
+
+int
+main (void)
+{
+ unsigned int a[N];
+ unsigned res, i;
+
+ for (i = 0; i < N; ++i)
+ a[i] = i % 4;
+
+ res = foo (N, a);
+ if (res != 2)
+ abort ();
+
+ return 0;
+}
--
1.9.1
^ permalink raw reply [flat|nested] 5+ messages in thread