public inbox for gcc-patches@gcc.gnu.org
From: Tom de Vries <Tom_deVries@mentor.com>
To: Thomas Schwinge <thomas@codesourcery.com>
Cc: "gcc-patches@gnu.org" <gcc-patches@gnu.org>,
	Jakub Jelinek <jakub@redhat.com>,
	Richard Biener <rguenther@suse.de>,
	Richard Biener <richard.guenther@gmail.com>
Subject: Re: [committed] Add oacc_kernels_p argument to pass_parallelize_loops
Date: Wed, 20 Jan 2016 10:31:00 -0000
Message-ID: <569F6200.8040204@mentor.com>
In-Reply-To: <87r3hczl5a.fsf@kepler.schwinge.homeip.net>

On 20/01/16 09:54, Thomas Schwinge wrote:
> Hi!
>
> On Mon, 18 Jan 2016 14:07:11 +0100, Tom de Vries <Tom_deVries@mentor.com> wrote:
>> Add oacc_kernels_p argument to pass_parallelize_loops
>
>> --- a/gcc/tree-parloops.c
>> +++ b/gcc/tree-parloops.c
>
>> @@ -2315,6 +2367,9 @@ gen_parallel_loop (struct loop *loop,
>
> |   /* Ensure that the exit condition is the first statement in the loop.
> |      The common case is that latch of the loop is empty (apart from the
> |      increment) and immediately follows the loop exit test.  Attempt to move the
> |      entry of the loop directly before the exit check and increase the number of
> |      iterations of the loop by one.  */
> |   if (try_transform_to_exit_first_loop_alt (loop, reduction_list, nit))
> |     {
> |       if (dump_file
> | 	  && (dump_flags & TDF_DETAILS))
> | 	fprintf (dump_file,
> | 		 "alternative exit-first loop transform succeeded"
> | 		 " for loop %d\n", loop->num);
> |     }
> |   else
> |     {
>> +      if (oacc_kernels_p)
>> +	n_threads = 1;
>> +
> |       /* Fall back on the method that handles more cases, but duplicates the
> | 	 loop body: move the exit condition of LOOP to the beginning of its
> | 	 header, and duplicate the part of the last iteration that gets disabled
> | 	 to the exit of the loop.  */
> |       transform_to_exit_first_loop (loop, reduction_list, nit);
> |     }
>
> Just for my own education: this pessimization "n_threads = 1" for OpenACC
> kernels is because the duplicated loop bodies generated by
> transform_to_exit_first_loop are not appropriate for parallel OpenACC
> offloading execution?

In the case of standard parloops, only the loop is executed in parallel, 
so the duplicated loop body is outside the parallel region.

In the case of oacc parloops, the duplicated body is included in the 
kernels region, and executed in parallel.

The duplicated body for the last iteration can be executed in parallel 
with the loop body in the loop for all the other iterations. We've done 
the dependency analysis for that.

But the duplicated loop body for the last iteration is now executed in
parallel with itself as well. We've got code that deals with that by
guarding the side-effects such that they're only executed for a single
gang. But that code is at the moment only effective in oacc_entry_exit_ok,
before transform_to_exit_first_loop introduces the duplicated loop body.

> (Might add a source code comment here?)  Testing
> on gomp-4_0-branch, there are no changes in the testsuite if I remove
> this hunk.

If you want to see the effect of removing the 'n_threads = 1' hunk, make 
try_transform_to_exit_first_loop_alt always return false.

I expect a loop
   for (i = 0; i < N; ++i)
     a[i] = a[i] + 1;
would give incorrect results in a[N - 1].

Thanks,
- Tom
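
The failure mode described above can be modelled with a minimal sketch in
plain C. This is not GCC code and not what the pass generates: run_region,
n_gangs and guard_last are hypothetical names, the whole last iteration is
treated as the duplicated part, and the gangs are simulated one after
another so that the outcome is deterministic. On a real device the
duplicated bodies would race and the result would depend on timing.

```c
/* Hypothetical model of a kernels region after the fallback
   exit-first transform, executed by N_GANGS gangs.  Iterations
   0 .. n-2 stay in the loop and are partitioned over the gangs; the
   duplicated body for iteration n-1 sits after the loop, inside the
   region, so every gang reaches it.  Assumes n >= 1.  */
static void
run_region (int *a, int n, int n_gangs, int guard_last)
{
  /* Gangs are simulated sequentially; this is an assumption made
     only to keep the sketch deterministic.  */
  for (int gang = 0; gang < n_gangs; gang++)
    {
      /* Partitioned loop: each gang takes every n_gangs-th iteration.  */
      for (int i = gang; i < n - 1; i += n_gangs)
	a[i] = a[i] + 1;

      /* Duplicated body for the last iteration.  Unguarded, it runs
	 once per gang; guarded, only gang 0 performs the store.  */
      if (!guard_last || gang == 0)
	a[n - 1] = a[n - 1] + 1;
    }
}
```

Starting from a zeroed array of 8 elements, run_region (a, 8, 1, 0) leaves
a[7] at 1, which corresponds to the n_threads = 1 fallback. With four gangs
and no guard, run_region (a, 8, 4, 0) leaves a[7] at 4 in this model, while
run_region (a, 8, 4, 1) restores the expected 1. All other elements are 1
in each case, since the loop iterations themselves are partitioned.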
