[libgomp, nvptx. committed] Handle per-function max-threads-per-block in default dims

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Tom de Vries <tdevries@suse.de>
To: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: Thomas Schwinge <thomas@codesourcery.com>,
	Jakub Jelinek <jakub@redhat.com>,
	Cesar Philippidis <cesar@codesourcery.com>
Subject: [libgomp, nvptx. committed] Handle per-function max-threads-per-block in default dims
Date: Mon, 30 Jul 2018 10:20:00 -0000	[thread overview]
Message-ID: <d5349c01-97cc-7591-bba3-40542e71a5a4@suse.de> (raw)

[-- Attachment #1: Type: text/plain, Size: 96 bytes --]

Hi,

Build and reg-tested on x86_64 with nvptx accelerator.

Committed to trunk.

Thanks,
- Tom

[-- Attachment #2: 0003-libgomp-nvptx-Handle-per-function-max-threads-per-block-in-default-dims.patch --]
[-- Type: text/x-patch, Size: 2845 bytes --]

[libgomp, nvptx] Handle per-function max-threads-per-block in default dims

Currently parallel-loop-1.c fails at -O0 on a Quadro M1200, because one of the
kernel launch configurations exceeds the resources available in the device, due
to the default dimensions chosen by the runtime.

This patch fixes that by taking the per-function max_threads_per_block into
account when using the default dimensions.

2018-07-27  Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (MIN, MAX): Redefine.
	(nvptx_exec): Ensure worker and vector default dims don't exceed
	targ_fn->max_threads_per_block.

---
 libgomp/plugin/plugin-nvptx.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 5c522aaf281..b6ec5f88d59 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -141,6 +141,11 @@ init_cuda_lib (void)
 
 #include "secure_getenv.h"
 
+#undef MIN
+#undef MAX
+#define MIN(X,Y) ((X) < (Y) ? (X) : (Y))
+#define MAX(X,Y) ((X) > (Y) ? (X) : (Y))
+
 /* Convenience macros for the frequently used CUDA library call and
    error handling sequence as well as CUDA library calls that
    do the error checking themselves or don't do it at all.  */
@@ -1135,6 +1140,7 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
   void *kargs[1];
   void *hp, *dp;
   struct nvptx_thread *nvthd = nvptx_thread ();
+  int warp_size = nvthd->ptx_dev->warp_size;
   const char *maybe_abort_msg = "(perhaps abort was called)";
 
   function = targ_fn->fn;
@@ -1175,7 +1181,6 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 
 	  int gang, worker, vector;
 	  {
-	    int warp_size = nvthd->ptx_dev->warp_size;
 	    int block_size = nvthd->ptx_dev->max_threads_per_block;
 	    int cpu_size = nvthd->ptx_dev->max_threads_per_multiprocessor;
 	    int dev_size = nvthd->ptx_dev->num_sms;
@@ -1213,9 +1218,25 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 	}
       pthread_mutex_unlock (&ptx_dev_lock);
 
-      for (i = 0; i != GOMP_DIM_MAX; i++)
-	if (!dims[i])
-	  dims[i] = nvthd->ptx_dev->default_dims[i];
+      {
+	bool default_dim_p[GOMP_DIM_MAX];
+	for (i = 0; i != GOMP_DIM_MAX; i++)
+	  {
+	    default_dim_p[i] = !dims[i];
+	    if (default_dim_p[i])
+	      dims[i] = nvthd->ptx_dev->default_dims[i];
+	  }
+
+	if (default_dim_p[GOMP_DIM_VECTOR])
+	  dims[GOMP_DIM_VECTOR]
+	    = MIN (dims[GOMP_DIM_VECTOR],
+		   (targ_fn->max_threads_per_block / warp_size * warp_size));
+
+	if (default_dim_p[GOMP_DIM_WORKER])
+	  dims[GOMP_DIM_WORKER]
+	    = MIN (dims[GOMP_DIM_WORKER],
+		   targ_fn->max_threads_per_block / dims[GOMP_DIM_VECTOR]);
+      }
     }
 
   /* Check if the accelerator has sufficient hardware resources to

                 reply	other threads:[~2018-07-30 10:20 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d5349c01-97cc-7591-bba3-40542e71a5a4@suse.de \
    --to=tdevries@suse.de \
    --cc=cesar@codesourcery.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jakub@redhat.com \
    --cc=thomas@codesourcery.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).