Hi!

On Mon, 18 Jan 2016 14:07:11 +0100, Tom de Vries wrote:
> Add oacc_kernels_p argument to pass_parallelize_loops

> --- a/gcc/tree-parloops.c
> +++ b/gcc/tree-parloops.c

> @@ -2315,6 +2367,9 @@ gen_parallel_loop (struct loop *loop,
|    /* Ensure that the exit condition is the first statement in the loop.
|       The common case is that latch of the loop is empty (apart from the
|       increment) and immediately follows the loop exit test.  Attempt to move the
|       entry of the loop directly before the exit check and increase the number of
|       iterations of the loop by one.  */
|    if (try_transform_to_exit_first_loop_alt (loop, reduction_list, nit))
|      {
|        if (dump_file
|            && (dump_flags & TDF_DETAILS))
|          fprintf (dump_file,
|                   "alternative exit-first loop transform succeeded"
|                   " for loop %d\n", loop->num);
|      }
|    else
|      {
> +      if (oacc_kernels_p)
> +        n_threads = 1;
> +
|        /* Fall back on the method that handles more cases, but duplicates the
|           loop body: move the exit condition of LOOP to the beginning of its
|           header, and duplicate the part of the last iteration that gets disabled
|           to the exit of the loop.  */
|        transform_to_exit_first_loop (loop, reduction_list, nit);
|      }

Just for my own education: is this "n_threads = 1" pessimization for
OpenACC kernels there because the duplicated loop bodies generated by
transform_to_exit_first_loop are not appropriate for parallel OpenACC
offloading execution?  (Maybe add a source code comment here?)

Testing on gomp-4_0-branch, I see no changes in the testsuite results
if I remove this hunk.


Regards
 Thomas
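
For illustration only, the fallback described in the quoted comment ("move the exit condition of LOOP to the beginning of its header, and duplicate the part of the last iteration that gets disabled to the exit of the loop") can be sketched in plain C roughly as follows. This is not GCC's implementation, which works on GIMPLE: the function names sum_exit_last and sum_exit_first and the summation body are hypothetical, and the sketch assumes n >= 1.

```c
/* Hypothetical illustration of an exit-first loop transform.
   Both functions assume n >= 1.  */

/* Original shape: the body runs before the exit test, so the exit
   condition is not the first statement of the loop.  */
static long
sum_exit_last (const int *a, int n)
{
  long s = 0;
  int i = 0;
  do
    {
      s += a[i];
      i++;
    }
  while (i < n);
  return s;
}

/* Exit-first shape: the test comes first and the loop runs one
   iteration fewer; the body of the last iteration, which the moved
   test disables, is duplicated after the loop exit.  */
static long
sum_exit_first (const int *a, int n)
{
  long s = 0;
  int i = 0;
  while (i < n - 1)
    {
      s += a[i];
      i++;
    }
  s += a[i];  /* duplicated body of the last iteration */
  return s;
}
```

The duplicated statement after the loop runs outside the loop that gets parallelized, which is the kind of code the question above is about; whether that is the actual reason for the hunk is not stated in the patch.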