public inbox for gcc-patches@gcc.gnu.org
From: Chung-Lin Tang <chunglin.tang@siemens.com>
To: Thomas Schwinge <thomas@codesourcery.com>,
	gcc-patches@gcc.gnu.org, Chung-Lin Tang <cltang@codesourcery.com>,
	Tom de Vries <tdevries@suse.de>
Subject: Re: nvptx: Avoid deadlock in 'cuStreamAddCallback' callback, error case (was: [PATCH 6/6, OpenACC, libgomp] Async re-work, nvptx changes)
Date: Fri, 13 Jan 2023 21:17:43 +0800
Message-ID: <e4cb68a2-d7f2-a0bd-1133-f4a8d4b62728@siemens.com>
In-Reply-To: <87zgan6eug.fsf@euler.schwinge.homeip.net>

Hi Thomas,

On 2023/1/12 9:51 PM, Thomas Schwinge wrote:
> In my case, 'cuda_callback_wrapper' (expectedly) gets invoked with
> 'res != CUDA_SUCCESS' ("an illegal memory access was encountered").
> When we invoke 'GOMP_PLUGIN_fatal', this attempts to shut down the device
> (..., which deadlocks); that's generally problematic: per
> https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__STREAM.html#group__CUDA__STREAM_1g613d97a277d7640f4cb1c03bd51c2483
> "'cuStreamAddCallback' [...] Callbacks must not make any CUDA API calls".

I remember running into this myself when first creating this async support
(IIRC in my case it was cuFree()-ing something), yet you've found another mistake here! :)

> Given that eventually we must reach a host/device synchronization point
> (latest when the device is shut down at program termination), and the
> non-'CUDA_SUCCESS' will be upheld until then, it does seem safe to
> replace this 'GOMP_PLUGIN_fatal' with 'GOMP_PLUGIN_error' as per the
> "nvptx: Avoid deadlock in 'cuStreamAddCallback' callback, error case"
> attached.  OK to push?

I think this patch is fine. Actual approval powers are yours or Tom's :)
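
For illustration, the shape of the change can be sketched roughly as below.
This is a self-contained mock, not the actual plugin code: every 'sketch_*'
name is hypothetical, 'sketch_result' merely stands in for the 'CUresult'
passed to the callback, and 'sketch_report_error' stands in for a reporter
that prints and returns (as 'GOMP_PLUGIN_error' is understood to do) instead
of one that tears down the device. It also assumes the user callback still
runs after the error has been reported.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of this is the CUDA driver API or libgomp.  */
typedef enum
{
  SKETCH_SUCCESS = 0,
  SKETCH_ERROR_ILLEGAL_ADDRESS = 1
} sketch_result;

struct sketch_callback
{
  void (*fn) (void *);
  void *ptr;
};

/* Counts reported errors so the behavior can be observed.  */
static int sketch_errors_reported;

/* Non-fatal reporter: prints the error and returns to the caller.
   It makes no call back into the (mocked) driver.  */
static void
sketch_report_error (const char *where, sketch_result res)
{
  fprintf (stderr, "%s error: result code %d\n", where, (int) res);
  sketch_errors_reported++;
}

/* Mock of a stream callback wrapper.  On failure it only reports;
   it does not attempt any shutdown from inside the callback.  */
static void
sketch_callback_wrapper (sketch_result res, void *ptr)
{
  struct sketch_callback *cb = ptr;
  assert (cb != NULL && cb->fn != NULL);
  if (res != SKETCH_SUCCESS)
    sketch_report_error (__func__, res);
  cb->fn (cb->ptr);
  free (cb);
}

/* Example user callback: increments the int it is given.  */
static void
sketch_bump (void *p)
{
  ++*(int *) p;
}

/* Allocates the heap record that the wrapper later frees.  */
static struct sketch_callback *
sketch_make_callback (void (*fn) (void *), void *ptr)
{
  struct sketch_callback *cb = malloc (sizeof *cb);
  if (cb == NULL)
    abort ();
  cb->fn = fn;
  cb->ptr = ptr;
  return cb;
}
```

Calling 'sketch_callback_wrapper' with a non-success code and a record from
'sketch_make_callback' reports once and still runs the user function; whether
the real driver keeps the error pending until the next synchronization point
is not something this mock can show.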

> 
> (Might we even skip 'GOMP_PLUGIN_error' here, understanding that the
> error will be caught and reported at the next host/device synchronization
> point?  But I've not verified that.)

Actually, the CUDA driver API docs are a bit vague on what exactly this
CUresult arg to the callback means. The 'res != CUDA_SUCCESS' handling
here was basically just generic handling. I am not really sure what the
right thing to do here is (is the error still retained by CUDA after the callback
completes?)

Chung-Lin



Thread overview: 9+ messages
2018-09-25 13:13 [PATCH 6/6, OpenACC, libgomp] Async re-work, nvptx changes Chung-Lin Tang
2018-10-05 14:07 ` Tom de Vries
2018-12-06 20:57 ` Thomas Schwinge
2018-12-10 10:02   ` Chung-Lin Tang
2018-12-11 13:50     ` [PATCH 6/6, OpenACC, libgomp] Async re-work, nvptx changes (revised, v2) Chung-Lin Tang
2018-12-18 15:07       ` [PATCH 6/6, OpenACC, libgomp] Async re-work, nvptx changes (revised, v3) Chung-Lin Tang
2023-01-12 13:51 ` nvptx: Avoid deadlock in 'cuStreamAddCallback' callback, error case (was: [PATCH 6/6, OpenACC, libgomp] Async re-work, nvptx changes) Thomas Schwinge
2023-01-13 13:17   ` Chung-Lin Tang [this message]
2023-01-13 13:59     ` Thomas Schwinge
