public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Chung-Lin Tang <chunglin_tang@mentor.com>
To: Thomas Schwinge <Thomas_Schwinge@mentor.com>,
	Chung-Lin Tang	<cltang@codesourcery.com>
Cc: <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH 2/6, OpenACC, libgomp] Async re-work, oacc-* parts
Date: Tue, 18 Dec 2018 10:03:00 -0000	[thread overview]
Message-ID: <c0bde2d9-0a3d-ad48-ee6a-b61b15c17679@mentor.com> (raw)
In-Reply-To: <yxfp7eg8te9w.fsf@hertz.schwinge.homeip.net>

On 2018/12/17 10:32 PM, Thomas Schwinge wrote:
>> The reason there are deadlocks from inside the plugin on GOMP_PLUGIN_fatal() is when we hold the
>> struct gomp_device_descr's*device*  lock, which is also acquired when we execute atexit device shutdown handlers, hence the deadlock.
>>
>> I don't think this is the case for the OpenACC entry points that grab at the openacc.async.* hooks,
> Ah, OK, I see.  (But I thought that method of deadlock had been fixed by
> some structural changes, to have plugin functions call the
> non-terminating "GOMP_PLUGIN_error" and return some error, instead of
> calling "GOMP_PLUGIN_fatal"?  I may be misremembering.  Or, that's
> another TODO item for later, separately...  Or, if that's actually the
> case, that this has been fixed in the way I described, then should these
> functions also be changed accordingly: instead of "GOMP_PLUGIN_fatal"
> call "GOMP_PLUGIN_error", and then return an error code?)

You remembered correctly, although...

>> though I can audit them again if deemed so.
> My understanding had been that deadlock may happen if we're inside some
> of these async/wait/serialize/synchronize functions, with "async" locked,
> then run into an error, then libgomp prepares to abort, and at that time
> shuts down the device, which will shut down the asyncqueues
> ("goacc_fini_asyncqueues"), which will again try to lock "async" -- which
> it actually doesn't.  My misunderstanding, I guess?

...at least now, you can see that goacc_fini_asyncqueues() does not attempt to
acquire devicep->openacc.async.lock when doing cleanup.

Come to think of it, that might be a bug there. :P

>> "If there are two or more host threads executing and sharing the same accelerator device,
>> two asynchronous operations with the same async-value will be enqueued on the same activity queue"
> Right, but then, in the very next sentence, it goes on to state: "If the
> threads are not synchronized with respect to each other, the operations
> may be enqueued in either order and therefore may execute on the device
> in either order".  So this, and given that:

I actually didn't care much about that next sentence, since it's just stating the obvious :)

It also seem to imply that the multiple host threads are enqueuing operations to the same async queue, hence further
corroborating that queues are device-wide, not thread.

>> That said, I recall most (if not all) of the synchronization operations and behavior are all
>> defined to be with respect to operations of the local host thread only, so the spec mentioning interaction with
>> other host threads here may be moot, as there's no way meaningful way to synchronize with
>> the OpenACC activity of other host threads (though correct me if I forgot some API)
> ..., I concluded something must be wrong in the OpenACC 2.6,
> 2.16.1. "async clause" text, and no such (host-side) inter-thread
> synchronization can be expected from OpenACC "async"/"wait".  I've also
> already got that on my list of things to clarify with the OpenACC
> technical committee, later on.

I just remembered, there does seem to be one area where device vs. thread-wide interpretation will be visible:
when using acc_get/set_cuda_stream(). Supposedly, given the specification's device-wide queue/stream model,
different host-threads should access the same CUDA stream when using acc_get/set_cuda_stream().
This will break if we made async queues to be thread-local.

>> Also, CUDA streams do not seem to support local-thread-operation-only synchronization.
>> I remember this was an issue in the old implementation inside the nvptx plugin as well, and we
>> had hacks to somewhat alleviate it (i.e. calling streams "single" or "multi" threaded)
> Right.
> 
>> Well, another issue that we might want to bring up to the OpenACC committee:)
>> I agree that if async queues spaces were simply thread-local then things would be much simpler.
> OK, so you agree with that, good.
> 
> And, no problem foreseeable about simply moving the asyncqueues into
> "goacc_thread" -- and removing the "async" lock?

I think we should still try to solve the potential deadlock problems, and stay close to the current
implementation just for now. We can ask the committee for further guidance later.

Chung-Lin

  reply	other threads:[~2018-12-18 10:03 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-25 13:11 Chung-Lin Tang
2018-12-07 11:33 ` Thomas Schwinge
2018-12-07 14:19   ` Chung-Lin Tang
2018-12-14 14:11     ` Thomas Schwinge
2018-12-14 14:17 ` Thomas Schwinge
2018-12-14 14:52   ` Chung-Lin Tang
2018-12-17 13:52     ` Thomas Schwinge
2018-12-18  9:35       ` Chung-Lin Tang
2018-12-14 14:32 ` Thomas Schwinge
2018-12-14 14:42   ` Chung-Lin Tang
2018-12-17 13:56     ` Thomas Schwinge
2018-12-14 14:54 ` Thomas Schwinge
2018-12-14 15:01   ` Chung-Lin Tang
2018-12-17 14:11     ` Thomas Schwinge
2018-12-14 14:56 ` Thomas Schwinge
2018-12-17 11:03   ` Chung-Lin Tang
2018-12-17 14:32     ` Thomas Schwinge
2018-12-18 10:03       ` Chung-Lin Tang [this message]
2018-12-18 11:44         ` Thomas Schwinge
2018-12-18 15:06 ` [PATCH 2/6, OpenACC, libgomp] Async re-work, oacc-* parts (revised, v2) Chung-Lin Tang
2018-12-18 21:04   ` Thomas Schwinge
2018-12-21 16:25     ` [PATCH 2/6, OpenACC, libgomp] Async re-work, oacc-* parts (revised, v3) Chung-Lin Tang
2018-12-28 14:52       ` Thomas Schwinge
2019-01-02 12:46     ` [PATCH 2/6, OpenACC, libgomp] Async re-work, oacc-* parts (revised, v2) Chung-Lin Tang
2019-01-05  9:47       ` [PATCH 2/6, OpenACC, libgomp] Async re-work, oacc-* parts (revised, v4) Chung-Lin Tang
2019-01-07 14:16         ` Thomas Schwinge
2019-01-08 14:04           ` Chung-Lin Tang
2019-01-07 14:15       ` [PATCH 2/6, OpenACC, libgomp] Async re-work, oacc-* parts (revised, v2) Thomas Schwinge

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c0bde2d9-0a3d-ad48-ee6a-b61b15c17679@mentor.com \
    --to=chunglin_tang@mentor.com \
    --cc=Thomas_Schwinge@mentor.com \
    --cc=cltang@codesourcery.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).