public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Ilya Verbin <iverbin@gmail.com>
To: Jakub Jelinek <jakub@redhat.com>
Cc: Aldy Hernandez <aldyh@redhat.com>,
	gcc-patches@gcc.gnu.org,	Kirill Yukhin <kirill.yukhin@gmail.com>,
	Thomas Schwinge <thomas@codesourcery.com>,
	Alexander Monakov <amonakov@ispras.ru>,
	Martin Jambor <mjambor@suse.cz>
Subject: Re: [gomp4.5] depend nowait support for target
Date: Thu, 12 Nov 2015 17:44:00 -0000	[thread overview]
Message-ID: <20151112174353.GA4917@msticlxl57.ims.intel.com> (raw)
In-Reply-To: <20151111165222.GL5675@tucnak.redhat.com>

On Wed, Nov 11, 2015 at 17:52:22 +0100, Jakub Jelinek wrote:
> On Mon, Oct 19, 2015 at 10:47:54PM +0300, Ilya Verbin wrote:
> > So, here is what I have for now.  Attached target-29.c testcase works fine with
> > MIC emul, however I don't know how to (and where) properly check for completion
> > of async execution on target.  And, similarly, where to do unmapping after that?
> > Do we need a callback from plugin to libgomp (as far as I understood, PTX
> > runtime supports this, but HSA doesn't), or libgomp will just check for
> > ttask->is_completed in task.c?
> 
> Here is the patch updated to have a task.c defined function that the plugin
> can call upon completion of async offloading exection.

Thanks.

> The testsuite coverage will need to improve, the testcase is wrong
> (contains data races - if you want to test parallel running of two target
> regions that both touch the same var, I'd say best would be to use
> #pragma omp atomic and or in 4 in one case and 1 in another case, then
> test if result is 5 (and similarly for the other var).
> Also, with the usleeps Alex Monakov will be unhappy because PTX newlib does
> not have it, but we'll need to find some solution for that.
> 
> Another thing to work on beyond testsuite coverage (it is desirable to test
> nowait target tasks (both depend and without depend) being awaited in all
> the various waiting spots, i.e. end of parallel, barrier, taskwait, end of
> taskgroup, or if (0) task with depend clause waiting on that.
> 
> Also, I wonder what to do if #pragma omp target nowait is used outside of
> (host) parallel - when team is NULL.  All the tasking code in that case just
> executes tasks undeferred, which is fine for all but target nowait - there
> it is I'd say useful to be able to run a single host thread concurrently
> with some async offloading tasks.  So, I wonder if in that case,
> if we encounter target nowait with team == NULL, should not just create a
> dummy non-active (nthreads == 1) team, as if there was #pragma omp parallel
> if (0) starting above it and ending at program's end.  In OpenMP, the
> program's initial thread is implicitly surrounded by inactive parallel, so
> this isn't anything against the OpenMP execution model.  But we'd need to
> free the team somewhere in a destructor.
>
> Can you please try to cleanup the liboffloadmic side of this, so that
> a callback instead of hardcoded __gomp_offload_intelmic_async_completed call
> is used?

Do you mean something like the patch bellow?  I'll discuss it with liboffloadmic
maintainers.

> Can you make sure it works on XeonPhi non-emulated too?

I'm trying to do it, but it will take some time...

Unfortunately, target-32.c fails for me using emulation mode:

Program received signal SIGSEGV, Segmentation fault.
#0  0x00007ff4ab1265ed in priority_list_remove (list=0x0, node=0x7ff49001afa0, model=MEMMODEL_RELAXED) at libgomp/priority_queue.h:422
#1  0x00007ff4ab1266d9 in priority_tree_remove (type=PQ_CHILDREN, head=0x1883138, node=0x7ff49001afa0) at libgomp/priority_queue.c:195
#2  0x00007ff4ab10fa06 in priority_queue_remove (type=PQ_CHILDREN, head=0x1883138, task=0x7ff49001af30, model=MEMMODEL_RELAXED) at libgomp/priority_queue.h:468
#3  0x00007ff4ab11570d in gomp_task_maybe_wait_for_dependencies (depend=0x7ff49b0d9de0) at libgomp/task.c:1539
#4  0x00007ff4ab11fd46 in GOMP_target_enter_exit_data (device=-1, mapnum=3, hostaddrs=0x7ff49b0d9dc0, sizes=0x6020b0 <.omp_data_sizes.38>, kinds=0x6020a0 <.omp_data_kinds.39>, flags=2, depend=0x7ff49b0d9de0) at libgomp/target.c:1662
#5  0x00000000004011f9 in main._omp_fn ()
#6  0x00007ff4ab1160f3 in gomp_thread_start (xdata=0x7fffe93766a0) at libgomp/team.c:119
#7  0x0000003b07e07ee5 in start_thread () from /lib64/libpthread.so.0
#8  0x0000003b076f4b8d in clone () from /lib64/libc.so.6

However when I manually run commands from testsuite/libgomp.log under the same
environment, it passes.  Don't know where is the difference.

Also I tried to replace 'b = 4;' and 'b = 5;' with infinite loops, but got only
100% CPU usage in offload_target_main instead of 200%, so it seems that only one
target task is running concurrently.


diff --git a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
index 6da09b1..772e198 100644
--- a/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
+++ b/liboffloadmic/plugin/libgomp-plugin-intelmic.cpp
@@ -220,6 +220,10 @@ static void
 register_main_image ()
 {
   __offload_register_image (&main_target_image);
+
+  /* liboffloadmic will call GOMP_PLUGIN_target_task_completion when
+     asynchronous task on target is completed.  */
+  __offload_register_task_callback (GOMP_PLUGIN_target_task_completion);
 }
 
 /* liboffloadmic loads and runs offload_target_main on all available devices
@@ -537,13 +541,3 @@ GOMP_OFFLOAD_run (int device, void *tgt_fn, void *tgt_vars)
 
   GOMP_OFFLOAD_async_run (device, tgt_fn, tgt_vars, NULL);
 }
-
-/* Called by liboffloadmic when asynchronous function is completed.  */
-
-extern "C" void
-__gomp_offload_intelmic_async_completed (const void *async_data)
-{
-  TRACE ("(async_data = %p)", async_data);
-
-  GOMP_PLUGIN_target_task_completion ((void *) async_data);
-}
diff --git a/liboffloadmic/runtime/offload_host.cpp b/liboffloadmic/runtime/offload_host.cpp
index a150410..e52019d 100644
--- a/liboffloadmic/runtime/offload_host.cpp
+++ b/liboffloadmic/runtime/offload_host.cpp
@@ -64,8 +64,7 @@ static void __offload_fini_library(void);
 #define GET_OFFLOAD_NUMBER(timer_data) \
     timer_data? timer_data->offload_number : 0
 
-extern "C" void
-__gomp_offload_intelmic_async_completed (const void *);
+static void (*task_completion_callback)(void *);
 
 extern "C" {
 #ifdef TARGET_WINNT
@@ -2510,7 +2509,7 @@ extern "C" {
         const void *info
     )
     {
-	__gomp_offload_intelmic_async_completed (info);
+	task_completion_callback ((void *) info);
     }
 }
 
@@ -5672,6 +5671,11 @@ extern "C" void __offload_unregister_image(const void *target_image)
     }
 }
 
+extern "C" void __offload_register_task_callback(void (*cb)(void *))
+{
+    task_completion_callback = cb;
+}
+
 // Runtime trace interface for user programs
 
 void __offload_console_trace(int level)
diff --git a/liboffloadmic/runtime/offload_host.h b/liboffloadmic/runtime/offload_host.h
index afd5c99..2a43fd6 100644
--- a/liboffloadmic/runtime/offload_host.h
+++ b/liboffloadmic/runtime/offload_host.h
@@ -376,6 +376,9 @@ extern "C" bool __offload_target_image_is_executable(const void *target_image);
 extern "C" bool __offload_register_image(const void* image);
 extern "C" void __offload_unregister_image(const void* image);
 
+// Registers asynchronous task completion callback
+extern "C" void __offload_register_task_callback(void (*cb)(void *));
+
 // Initializes offload runtime library.
 DLL_LOCAL extern int __offload_init_library(void);
 

  -- Ilya

  reply	other threads:[~2015-11-12 17:44 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-08  9:26 [gomp4.1] depend nowait support for target {update,{enter,exit} data} Jakub Jelinek
2015-10-02 19:28 ` Ilya Verbin
2015-10-15 14:02   ` Jakub Jelinek
2015-10-15 16:18     ` Alexander Monakov
2015-10-15 17:18       ` Jakub Jelinek
2015-10-15 18:11         ` Alexander Monakov
2015-10-15 16:42     ` Ilya Verbin
2015-10-16 11:50     ` Martin Jambor
2015-10-19 19:55     ` Ilya Verbin
2015-11-11 16:52       ` [gomp4.5] depend nowait support for target Jakub Jelinek
2015-11-12 17:44         ` Ilya Verbin [this message]
2015-11-12 17:58           ` Jakub Jelinek
2015-11-12 18:07             ` Ilya Verbin
2015-11-12 17:45         ` Jakub Jelinek
2015-11-12 20:52           ` Ilya Verbin
2015-11-13 10:18             ` Jakub Jelinek
2015-11-13 15:12               ` Jakub Jelinek
2015-11-13 16:37                 ` Ilya Verbin
2015-11-13 16:42                   ` Jakub Jelinek
2015-11-13 18:37                     ` Ilya Verbin
2015-11-23 14:16                 ` [hsa] " Martin Jambor
2015-11-23 14:25                   ` Jakub Jelinek
2015-11-25 15:41                     ` Martin Jambor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151112174353.GA4917@msticlxl57.ims.intel.com \
    --to=iverbin@gmail.com \
    --cc=aldyh@redhat.com \
    --cc=amonakov@ispras.ru \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=kirill.yukhin@gmail.com \
    --cc=mjambor@suse.cz \
    --cc=thomas@codesourcery.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).