* [PATCH] [og9] Add omp_pause_resource{,_all} for AMD GCN @ 2019-09-06 16:02 Julian Brown 2019-09-06 16:02 ` [PATCH] [og9] OpenACC profiling support " Julian Brown 2019-09-06 16:02 ` [PATCH] [og9] Use more appropriate var in localize_reductions call Julian Brown 0 siblings, 2 replies; 4+ messages in thread From: Julian Brown @ 2019-09-06 16:02 UTC (permalink / raw) To: gcc-patches; +Cc: Thomas Schwinge, Andrew Stubbs This patch adds some missing functions to the AMD GCN-specific target.c file. This fixes several link errors in the testsuite. Tested with offloading to AMD GCN. I will apply to the openacc-gcc-9-branch shortly. Julian ChangeLog libgomp/ * config/gcn/target.c (omp_pause_resource, omp_pause_resource_all): New functions, plus ialiases. --- libgomp/ChangeLog.openacc | 5 +++++ libgomp/config/gcn/target.c | 18 ++++++++++++++++++ 2 files changed, 23 insertions(+) diff --git a/libgomp/ChangeLog.openacc b/libgomp/ChangeLog.openacc index 438bd59b47b..d7a4c7a5f8a 100644 --- a/libgomp/ChangeLog.openacc +++ b/libgomp/ChangeLog.openacc @@ -1,3 +1,8 @@ +2019-09-06 Julian Brown <julian@codesourcery.com> + + * config/gcn/target.c (omp_pause_resource, omp_pause_resource_all): New + functions, plus ialiases. + 2019-09-05 Julian Brown <julian@codesourcery.com> * plugin/plugin-gcn.c (gcn_exec): Change default number of workers to diff --git a/libgomp/config/gcn/target.c b/libgomp/config/gcn/target.c index 5ec57778772..db00551e695 100644 --- a/libgomp/config/gcn/target.c +++ b/libgomp/config/gcn/target.c @@ -47,3 +47,21 @@ GOMP_teams (unsigned int num_teams, unsigned int thread_limit) } gomp_num_teams_var = num_teams - 1; } + +int +omp_pause_resource (omp_pause_resource_t kind, int device_num) +{ + (void) kind; + (void) device_num; + return -1; +} + +int +omp_pause_resource_all (omp_pause_resource_t kind) +{ + (void) kind; + return -1; +} + +ialias (omp_pause_resource) +ialias (omp_pause_resource_all) -- 2.22.0 ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH] [og9] OpenACC profiling support for AMD GCN 2019-09-06 16:02 [PATCH] [og9] Add omp_pause_resource{,_all} for AMD GCN Julian Brown @ 2019-09-06 16:02 ` Julian Brown 2023-03-10 14:07 ` Fix OpenACC/GCN 'acc_ev_enqueue_launch_end' position (was: [PATCH] [og9] OpenACC profiling support for AMD GCN) Thomas Schwinge 2019-09-06 16:02 ` [PATCH] [og9] Use more appropriate var in localize_reductions call Julian Brown 1 sibling, 1 reply; 4+ messages in thread From: Julian Brown @ 2019-09-06 16:02 UTC (permalink / raw) To: gcc-patches; +Cc: Thomas Schwinge, Andrew Stubbs This patch adds profiling support to the AMD GCN libgomp plugin, modeled after the equivalent support in the NVPTX plugin. This gives a positive test delta in AMD GCN offload testing. I will apply to the openacc-gcc-9-branch shortly. Julian 2019-09-06 Julian Brown <julian@codesourcery.com> libgomp/ * plugin/plugin-gcn.c (GOMP_OFFLOAD_alloc_by_agent, GOMP_OFFLOAD_free, gcn_exec): Add profiling support. * testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Add GCN support. * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c: Likewise. --- libgomp/ChangeLog.openacc | 9 ++ libgomp/plugin/plugin-gcn.c | 96 +++++++++++++++++++ .../acc_prof-init-1.c | 2 + .../acc_prof-kernels-1.c | 4 + .../acc_prof-parallel-1.c | 12 +++ 5 files changed, 123 insertions(+) diff --git a/libgomp/ChangeLog.openacc b/libgomp/ChangeLog.openacc index d7a4c7a5f8a..8ed0a10a589 100644 --- a/libgomp/ChangeLog.openacc +++ b/libgomp/ChangeLog.openacc @@ -1,3 +1,12 @@ +2019-09-06 Julian Brown <julian@codesourcery.com> + + * plugin/plugin-gcn.c (GOMP_OFFLOAD_alloc_by_agent, + GOMP_OFFLOAD_free, gcn_exec): Add profiling support. + * testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Add GCN + support. + * testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Likewise. + * testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c: Likewise. + 2019-09-06 Julian Brown <julian@codesourcery.com> * config/gcn/target.c (omp_pause_resource, omp_pause_resource_all): New diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c index f0b22ebc3d7..2f273967bad 100644 --- a/libgomp/plugin/plugin-gcn.c +++ b/libgomp/plugin/plugin-gcn.c @@ -3024,6 +3024,35 @@ GOMP_OFFLOAD_alloc_by_agent (struct agent_info *agent, size_t size) return NULL; } + struct goacc_thread *thr = GOMP_PLUGIN_goacc_thread (); + bool profiling_dispatch_p + = __builtin_expect (thr != NULL && thr->prof_info != NULL, false); + if (profiling_dispatch_p) + { + acc_prof_info *prof_info = thr->prof_info; + acc_event_info data_event_info; + acc_api_info *api_info = thr->api_info; + + prof_info->event_type = acc_ev_alloc; + + data_event_info.data_event.event_type = prof_info->event_type; + data_event_info.data_event.valid_bytes + = _ACC_DATA_EVENT_INFO_VALID_BYTES; + data_event_info.data_event.parent_construct + = acc_construct_parallel; + data_event_info.data_event.implicit = 1; + data_event_info.data_event.tool_info = NULL; + data_event_info.data_event.var_name = NULL; + data_event_info.data_event.bytes = size; + data_event_info.data_event.host_ptr = NULL; + data_event_info.data_event.device_ptr = (void *) ptr; + + api_info->device_api = acc_device_api_other; + + GOMP_PLUGIN_goacc_profiling_dispatch (prof_info, &data_event_info, + api_info); + } + return ptr; } @@ -3046,6 +3075,35 @@ GOMP_OFFLOAD_free (int device, void *ptr) return false; } + struct goacc_thread *thr = GOMP_PLUGIN_goacc_thread (); + bool profiling_dispatch_p + = __builtin_expect (thr != NULL && thr->prof_info != NULL, false); + if (profiling_dispatch_p) + { + acc_prof_info *prof_info = thr->prof_info; + acc_event_info data_event_info; + acc_api_info *api_info = thr->api_info; + + prof_info->event_type = acc_ev_free; + + data_event_info.data_event.event_type = prof_info->event_type; + data_event_info.data_event.valid_bytes + = _ACC_DATA_EVENT_INFO_VALID_BYTES; + data_event_info.data_event.parent_construct + = acc_construct_parallel; + data_event_info.data_event.implicit = 1; + data_event_info.data_event.tool_info = NULL; + data_event_info.data_event.var_name = NULL; + data_event_info.data_event.bytes = 0; + data_event_info.data_event.host_ptr = NULL; + data_event_info.data_event.device_ptr = (void *) ptr; + + api_info->device_api = acc_device_api_other; + + GOMP_PLUGIN_goacc_profiling_dispatch (prof_info, &data_event_info, + api_info); + } + return true; } @@ -3276,6 +3334,35 @@ gcn_exec (struct kernel_info *kernel, size_t mapnum, void **hostaddrs, {1, 64, 16} }; + struct goacc_thread *thr = GOMP_PLUGIN_goacc_thread (); + acc_prof_info *prof_info = thr->prof_info; + acc_event_info enqueue_launch_event_info; + acc_api_info *api_info = thr->api_info; + bool profiling_dispatch_p = __builtin_expect (prof_info != NULL, false); + if (profiling_dispatch_p) + { + prof_info->event_type = acc_ev_enqueue_launch_start; + + enqueue_launch_event_info.launch_event.event_type + = prof_info->event_type; + enqueue_launch_event_info.launch_event.valid_bytes + = _ACC_LAUNCH_EVENT_INFO_VALID_BYTES; + enqueue_launch_event_info.launch_event.parent_construct + = acc_construct_parallel; + enqueue_launch_event_info.launch_event.implicit = 1; + enqueue_launch_event_info.launch_event.tool_info = NULL; + enqueue_launch_event_info.launch_event.kernel_name + = (char *) kernel->name; + enqueue_launch_event_info.launch_event.num_gangs = kla.gdims[0]; + enqueue_launch_event_info.launch_event.num_workers = kla.gdims[2]; + enqueue_launch_event_info.launch_event.vector_length = kla.gdims[1]; + + api_info->device_api = acc_device_api_other; + + GOMP_PLUGIN_goacc_profiling_dispatch (prof_info, + &enqueue_launch_event_info, api_info); + } + if (!async) { run_kernel (kernel, ind_da, &kla, NULL, false); @@ -3289,6 +3376,15 @@ gcn_exec (struct kernel_info *kernel, size_t mapnum, void **hostaddrs, aq->agent->device_id, aq->id, ind_da); queue_push_callback (aq, gomp_offload_free, ind_da); } + + if (profiling_dispatch_p) + { + prof_info->event_type = acc_ev_enqueue_launch_end; + enqueue_launch_event_info.launch_event.event_type = prof_info->event_type; + GOMP_PLUGIN_goacc_profiling_dispatch (prof_info, + &enqueue_launch_event_info, + api_info); + } } void diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c index 6a44e8ffb6a..cf980f1baec 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c @@ -224,6 +224,8 @@ static void cb_compute_construct_end (acc_prof_info *prof_info, acc_event_info * if (acc_device_type == acc_device_host) assert (api_info->device_api == acc_device_api_none); + else if (acc_device_type == acc_device_gcn) + assert (api_info->device_api == acc_device_api_other); else assert (api_info->device_api == acc_device_api_cuda); assert (api_info->valid_bytes == _ACC_API_INFO_VALID_BYTES); diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c index 269b4398478..9c1cfbe292c 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c @@ -107,6 +107,8 @@ static void cb_enqueue_launch_start (acc_prof_info *prof_info, acc_event_info *e assert (event_info->launch_event.vector_length >= 1); else if (acc_device_type == acc_device_nvidia) /* ... is special. */ assert (event_info->launch_event.vector_length == 32); + else if (acc_device_type == acc_device_gcn) /* ...and so is this. */ + assert (event_info->launch_event.vector_length == 64); else { #ifdef __OPTIMIZE__ @@ -119,6 +121,8 @@ static void cb_enqueue_launch_start (acc_prof_info *prof_info, acc_event_info *e if (acc_device_type == acc_device_host) assert (api_info->device_api == acc_device_api_none); + else if (acc_device_type == acc_device_gcn) + assert (api_info->device_api == acc_device_api_other); else assert (api_info->device_api == acc_device_api_cuda); assert (api_info->valid_bytes == _ACC_API_INFO_VALID_BYTES); diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c index 116b9b538a6..5d392511592 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c @@ -265,6 +265,8 @@ static void cb_enter_data_end (acc_prof_info *prof_info, acc_event_info *event_i if (acc_device_type == acc_device_host) assert (api_info->device_api == acc_device_api_none); + else if (acc_device_type == acc_device_gcn) + assert (api_info->device_api == acc_device_api_other); else assert (api_info->device_api == acc_device_api_cuda); assert (api_info->valid_bytes == _ACC_API_INFO_VALID_BYTES); @@ -319,6 +321,8 @@ static void cb_exit_data_start (acc_prof_info *prof_info, acc_event_info *event_ if (acc_device_type == acc_device_host) assert (api_info->device_api == acc_device_api_none); + else if (acc_device_type == acc_device_gcn) + assert (api_info->device_api == acc_device_api_other); else assert (api_info->device_api == acc_device_api_cuda); assert (api_info->valid_bytes == _ACC_API_INFO_VALID_BYTES); @@ -371,6 +375,8 @@ static void cb_exit_data_end (acc_prof_info *prof_info, acc_event_info *event_in if (acc_device_type == acc_device_host) assert (api_info->device_api == acc_device_api_none); + else if (acc_device_type == acc_device_gcn) + assert (api_info->device_api == acc_device_api_other); else assert (api_info->device_api == acc_device_api_cuda); assert (api_info->valid_bytes == _ACC_API_INFO_VALID_BYTES); @@ -510,6 +516,8 @@ static void cb_compute_construct_end (acc_prof_info *prof_info, acc_event_info * if (acc_device_type == acc_device_host) assert (api_info->device_api == acc_device_api_none); + else if (acc_device_type == acc_device_gcn) + assert (api_info->device_api == acc_device_api_other); else assert (api_info->device_api == acc_device_api_cuda); assert (api_info->valid_bytes == _ACC_API_INFO_VALID_BYTES); @@ -573,6 +581,8 @@ static void cb_enqueue_launch_start (acc_prof_info *prof_info, acc_event_info *e if (acc_device_type == acc_device_host) assert (api_info->device_api == acc_device_api_none); + else if (acc_device_type == acc_device_gcn) + assert (api_info->device_api == acc_device_api_other); else assert (api_info->device_api == acc_device_api_cuda); assert (api_info->valid_bytes == _ACC_API_INFO_VALID_BYTES); @@ -637,6 +647,8 @@ static void cb_enqueue_launch_end (acc_prof_info *prof_info, acc_event_info *eve if (acc_device_type == acc_device_host) assert (api_info->device_api == acc_device_api_none); + else if (acc_device_type == acc_device_gcn) + assert (api_info->device_api == acc_device_api_other); else assert (api_info->device_api == acc_device_api_cuda); assert (api_info->valid_bytes == _ACC_API_INFO_VALID_BYTES); -- 2.22.0 ^ permalink raw reply [flat|nested] 4+ messages in thread
* Fix OpenACC/GCN 'acc_ev_enqueue_launch_end' position (was: [PATCH] [og9] OpenACC profiling support for AMD GCN) 2019-09-06 16:02 ` [PATCH] [og9] OpenACC profiling support " Julian Brown @ 2023-03-10 14:07 ` Thomas Schwinge 0 siblings, 0 replies; 4+ messages in thread From: Thomas Schwinge @ 2023-03-10 14:07 UTC (permalink / raw) To: Julian Brown, gcc-patches; +Cc: Andrew Stubbs [-- Attachment #1: Type: text/plain, Size: 3604 bytes --] Hi! On 2019-09-06T09:02:13-0700, Julian Brown <julian@codesourcery.com> wrote: > This patch adds profiling support to the AMD GCN libgomp plugin, modeled > after the equivalent support in the NVPTX plugin. This gives a positive > test delta in AMD GCN offload testing. Yay! \o/ > I will apply to the openacc-gcc-9-branch shortly. ..., and later these changes got into master branch, via integration into "[PATCH 7/7 libgomp,amdgcn] GCN Libgomp Plugin". > --- a/libgomp/plugin/plugin-gcn.c > +++ b/libgomp/plugin/plugin-gcn.c | static void | gomp_offload_free (void *ptr) | { | GCN_DEBUG ("Async thread ?:?: Freeing %p\n", ptr); | GOMP_OFFLOAD_free (0, ptr); | } > @@ -3046,6 +3075,35 @@ GOMP_OFFLOAD_free (int device, void *ptr) > return false; > } > > + struct goacc_thread *thr = GOMP_PLUGIN_goacc_thread (); > + bool profiling_dispatch_p > + = __builtin_expect (thr != NULL && thr->prof_info != NULL, false); > + if (profiling_dispatch_p) > + { > + [...] > + prof_info->event_type = acc_ev_free; > + > + [...] > + GOMP_PLUGIN_goacc_profiling_dispatch (prof_info, &data_event_info, > + api_info); > + } > + > return true; > } > > @@ -3276,6 +3334,35 @@ gcn_exec (struct kernel_info *kernel, size_t mapnum, void **hostaddrs, > {1, 64, 16} > }; > > + struct goacc_thread *thr = GOMP_PLUGIN_goacc_thread (); > + acc_prof_info *prof_info = thr->prof_info; > + acc_event_info enqueue_launch_event_info; > + acc_api_info *api_info = thr->api_info; > + bool profiling_dispatch_p = __builtin_expect (prof_info != NULL, false); > + if (profiling_dispatch_p) > + { > + prof_info->event_type = acc_ev_enqueue_launch_start; > + > + [...] > + GOMP_PLUGIN_goacc_profiling_dispatch (prof_info, > + &enqueue_launch_event_info, api_info); > + } > + > if (!async) > { > run_kernel (kernel, ind_da, &kla, NULL, false); | gomp_offload_free (ind_da); | } | else | { | queue_push_launch (aq, kernel, ind_da, &kla); | if (DEBUG_QUEUES) | GCN_DEBUG ("queue_push_callback %d:%d gomp_offload_free, %p\n", > aq->agent->device_id, aq->id, ind_da); > queue_push_callback (aq, gomp_offload_free, ind_da); > } > + > + if (profiling_dispatch_p) > + { > + prof_info->event_type = acc_ev_enqueue_launch_end; > + enqueue_launch_event_info.launch_event.event_type = prof_info->event_type; > + GOMP_PLUGIN_goacc_profiling_dispatch (prof_info, > + &enqueue_launch_event_info, > + api_info); > + } > } Per that, we've currently got: - [...] - acc_ev_enqueue_launch_start - launch kernel - free memory - acc_ev_free - acc_ev_enqueue_launch_end This confused another thing that I'm working on, so I adjusted that to: - [...] - acc_ev_enqueue_launch_start - launch kernel - acc_ev_enqueue_launch_end - free memory - acc_ev_free Pushed to master branch commit 649f1939baf11f45fd3579b8b9601c7840a097b3 "Fix OpenACC/GCN 'acc_ev_enqueue_launch_end' position", see attached. Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-Fix-OpenACC-GCN-acc_ev_enqueue_launch_end-position.patch --] [-- Type: text/x-diff, Size: 12617 bytes --] From 649f1939baf11f45fd3579b8b9601c7840a097b3 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <thomas@codesourcery.com> Date: Thu, 2 Mar 2023 10:39:09 +0100 Subject: [PATCH] Fix OpenACC/GCN 'acc_ev_enqueue_launch_end' position For an OpenACC compute construct, we've currently got: - [...] - acc_ev_enqueue_launch_start - launch kernel - free memory - acc_ev_free - acc_ev_enqueue_launch_end This confused another thing that I'm working on, so I adjusted that to: - [...] - acc_ev_enqueue_launch_start - launch kernel - acc_ev_enqueue_launch_end - free memory - acc_ev_free Correspondingly, verify 'acc_ev_alloc', 'acc_ev_free' in 'libgomp.oacc-c-c++-common/acc_prof-parallel-1.c'. libgomp/ * plugin/plugin-gcn.c (gcn_exec): Fix 'acc_ev_enqueue_launch_end' position. * testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c: Verify 'acc_ev_alloc', 'acc_ev_free'. --- libgomp/plugin/plugin-gcn.c | 23 +- .../acc_prof-parallel-1.c | 202 ++++++++++++++++-- 2 files changed, 195 insertions(+), 30 deletions(-) diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c index 11ce6b0fa8d..96920a48d2e 100644 --- a/libgomp/plugin/plugin-gcn.c +++ b/libgomp/plugin/plugin-gcn.c @@ -3192,18 +3192,9 @@ gcn_exec (struct kernel_info *kernel, size_t mapnum, void **hostaddrs, } if (!async) - { - run_kernel (kernel, ind_da, &kla, NULL, false); - gomp_offload_free (ind_da); - } + run_kernel (kernel, ind_da, &kla, NULL, false); else - { - queue_push_launch (aq, kernel, ind_da, &kla); - if (DEBUG_QUEUES) - GCN_DEBUG ("queue_push_callback %d:%d gomp_offload_free, %p\n", - aq->agent->device_id, aq->id, ind_da); - queue_push_callback (aq, gomp_offload_free, ind_da); - } + queue_push_launch (aq, kernel, ind_da, &kla); if (profiling_dispatch_p) { @@ -3213,6 +3204,16 @@ gcn_exec (struct kernel_info *kernel, size_t mapnum, void **hostaddrs, &enqueue_launch_event_info, api_info); } + + if (!async) + gomp_offload_free (ind_da); + else + { + if (DEBUG_QUEUES) + GCN_DEBUG ("queue_push_callback %d:%d gomp_offload_free, %p\n", + aq->agent->device_id, aq->id, ind_da); + queue_push_callback (aq, gomp_offload_free, ind_da); + } } /* }}} */ diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c index 1f503861cb6..cbf23d7d83b 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c @@ -195,6 +195,139 @@ static void cb_device_init_end (acc_prof_info *prof_info, acc_event_info *event_ #endif } +static void cb_alloc (acc_prof_info *prof_info, acc_event_info *event_info, acc_api_info *api_info) +{ + DEBUG_printf ("%s\n", __FUNCTION__); + +#if DEVICE_INIT_INSIDE_COMPUTE_CONSTRUCT +# error TODO +#else + assert (state == 4 + || state == 6 + || state == 104 + || state == 106); + STATE_OP (state, ++); + + if (state == 5 + || state == 105) + { + assert (tool_info != NULL); + assert (tool_info->event_info.other_event.event_type == acc_ev_compute_construct_start); + assert (tool_info->nested != NULL); + assert (tool_info->nested->event_info.other_event.event_type == acc_ev_enter_data_start); + assert (tool_info->nested->nested == NULL); + } + else if (state == 7 + || state == 107) + { + assert (tool_info != NULL); + assert (tool_info->event_info.other_event.event_type == acc_ev_compute_construct_start); + assert (tool_info->nested == NULL); + } + else + abort (); +#endif + + assert (prof_info->event_type == acc_ev_alloc); + assert (prof_info->valid_bytes == _ACC_PROF_INFO_VALID_BYTES); + assert (prof_info->version == _ACC_PROF_INFO_VERSION); + assert (prof_info->device_type == acc_device_type); + assert (prof_info->device_number == acc_device_num); + assert (prof_info->thread_id == -1); + assert (prof_info->async == acc_async); + assert (prof_info->async_queue == prof_info->async); + assert (prof_info->src_file == NULL); + assert (prof_info->func_name == NULL); + assert (prof_info->line_no == -1); + assert (prof_info->end_line_no == -1); + assert (prof_info->func_line_no == -1); + assert (prof_info->func_end_line_no == -1); + + assert (event_info->data_event.event_type == prof_info->event_type); + assert (event_info->data_event.valid_bytes == _ACC_DATA_EVENT_INFO_VALID_BYTES); + assert (event_info->data_event.parent_construct == acc_construct_parallel); + assert (event_info->data_event.implicit == 1); + assert (event_info->data_event.tool_info == NULL); + assert (event_info->data_event.var_name == NULL); + assert (event_info->data_event.bytes != 0); + assert (event_info->data_event.host_ptr == NULL); + assert (event_info->data_event.device_ptr != NULL); + + assert (api_info->valid_bytes == _ACC_API_INFO_VALID_BYTES); + assert (api_info->device_type == prof_info->device_type); + assert (api_info->vendor == -1); + assert (api_info->device_handle == NULL); + assert (api_info->context_handle == NULL); + assert (api_info->async_handle == NULL); +} + +static void cb_free (acc_prof_info *prof_info, acc_event_info *event_info, acc_api_info *api_info) +{ + DEBUG_printf ("%s\n", __FUNCTION__); + +#if DEVICE_INIT_INSIDE_COMPUTE_CONSTRUCT +# error TODO +#else + assert (state == 9 + || state == 11); + STATE_OP (state, ++); + + if (state == 10) + { + assert (tool_info != NULL); + assert (tool_info->event_info.other_event.event_type == acc_ev_compute_construct_start); + assert (tool_info->nested == NULL); + } + else if (state == 12) + { + assert (tool_info != NULL); + assert (tool_info->event_info.other_event.event_type == acc_ev_compute_construct_start); + assert (tool_info->nested != NULL); + assert (tool_info->nested->event_info.other_event.event_type == acc_ev_exit_data_start); + assert (tool_info->nested->nested == NULL); + } + else + abort (); +#endif + + assert (prof_info->event_type == acc_ev_free); + assert (prof_info->valid_bytes == _ACC_PROF_INFO_VALID_BYTES); + assert (prof_info->version == _ACC_PROF_INFO_VERSION); + assert (prof_info->device_type == acc_device_type); + assert (prof_info->device_number == acc_device_num); + assert (prof_info->thread_id == -1); + assert (prof_info->async == acc_async); + assert (prof_info->async_queue == prof_info->async); + assert (prof_info->src_file == NULL); + assert (prof_info->func_name == NULL); + assert (prof_info->line_no == -1); + assert (prof_info->end_line_no == -1); + assert (prof_info->func_line_no == -1); + assert (prof_info->func_end_line_no == -1); + + assert (event_info->data_event.event_type == prof_info->event_type); + assert (event_info->data_event.valid_bytes == _ACC_DATA_EVENT_INFO_VALID_BYTES); + assert (event_info->data_event.parent_construct == acc_construct_parallel); + assert (event_info->data_event.implicit == 1); + assert (event_info->data_event.tool_info == NULL); + assert (event_info->data_event.var_name == NULL); + if (acc_device_type == acc_device_nvidia) + assert (event_info->data_event.bytes == (size_t) -1); + else if (acc_device_type == acc_device_radeon) + assert (event_info->data_event.bytes == 0); + else + abort (); + assert (event_info->data_event.host_ptr == NULL); + assert (event_info->data_event.device_ptr != NULL); + + assert (api_info->valid_bytes == _ACC_API_INFO_VALID_BYTES); + assert (api_info->device_type == prof_info->device_type); + assert (api_info->vendor == -1); + assert (api_info->device_handle == NULL); + assert (api_info->context_handle == NULL); + assert (api_info->async_handle == NULL); +} + static void cb_enter_data_start (acc_prof_info *prof_info, acc_event_info *event_info, acc_api_info *api_info) { DEBUG_printf ("%s\n", __FUNCTION__); @@ -246,8 +379,8 @@ static void cb_enter_data_end (acc_prof_info *prof_info, acc_event_info *event_i { DEBUG_printf ("%s\n", __FUNCTION__); - assert (state == 4 - || state == 104); + assert (state == 5 + || state == 105); #if defined COPYIN /* Conceptually, 'acc_ev_enter_data_end' marks the end of data copying, before 'acc_ev_enqueue_launch_start' marks invoking the compute region. @@ -316,9 +449,19 @@ static void cb_exit_data_start (acc_prof_info *prof_info, acc_event_info *event_ { DEBUG_printf ("%s\n", __FUNCTION__); - assert (state == 7 #if ASYNC_EXIT_DATA - || state == 107 + if (acc_async != acc_async_sync) + { + /* Compensate for the deferred 'acc_ev_free'. */ + state += 1; + } +#else +# error TODO +#endif + + assert (state == 10 +#if ASYNC_EXIT_DATA + || state == 110 #endif ); STATE_OP (state, ++); @@ -366,15 +509,25 @@ static void cb_exit_data_start (acc_prof_info *prof_info, acc_event_info *event_ tool_info->nested->event_info.other_event.event_type = event_info->other_event.event_type; event_info->other_event.tool_info = tool_info->nested; + +#if ASYNC_EXIT_DATA + if (acc_async != acc_async_sync) + { + /* Compensate for the deferred 'acc_ev_free'. */ + state += 1; + } +#else +# error TODO +#endif } static void cb_exit_data_end (acc_prof_info *prof_info, acc_event_info *event_info, acc_api_info *api_info) { DEBUG_printf ("%s\n", __FUNCTION__); - assert (state == 8 + assert (state == 12 #if ASYNC_EXIT_DATA - || state == 108 + || state == 112 #endif ); STATE_OP (state, ++); @@ -488,6 +641,8 @@ static void cb_compute_construct_start (acc_prof_info *prof_info, acc_event_info { /* Compensate for the missing 'acc_ev_enter_data_start'. */ state += 1; + /* Compensate for the missing 'acc_ev_alloc'. */ + state += 1; } } @@ -499,12 +654,19 @@ static void cb_compute_construct_end (acc_prof_info *prof_info, acc_event_info * { /* Compensate for the missing 'acc_ev_enter_data_end'. */ state += 1; + /* Compensate for the missing 'acc_ev_alloc'. */ + state += 1; /* Compensate for the missing 'acc_ev_enqueue_launch_start' and 'acc_ev_enqueue_launch_end'. */ state += 2; - /* Compensate for the missing 'acc_ev_exit_data_start' and - 'acc_ev_exit_data_end'. */ - state += 2; + /* Compensate for the missing 'acc_ev_free'. */ + state += 1; + /* Compensate for the missing 'acc_ev_exit_data_start'. */ + state += 1; + /* Compensate for the missing 'acc_ev_free'. */ + state += 1; + /* Compensate for the missing 'acc_ev_exit_data_end'. */ + state += 1; } #if !ASYNC_EXIT_DATA else if (acc_async != acc_async_sync) @@ -514,8 +676,8 @@ static void cb_compute_construct_end (acc_prof_info *prof_info, acc_event_info * state += 2; } #endif - assert (state == 9 - || state == 109); + assert (state == 13 + || state == 113); STATE_OP (state, ++); assert (tool_info != NULL); @@ -569,8 +731,8 @@ static void cb_enqueue_launch_start (acc_prof_info *prof_info, acc_event_info *e assert (acc_device_type != acc_device_host); - assert (state == 5 - || state == 105); + assert (state == 7 + || state == 107); STATE_OP (state, ++); assert (tool_info != NULL); @@ -638,8 +800,8 @@ static void cb_enqueue_launch_end (acc_prof_info *prof_info, acc_event_info *eve assert (acc_device_type != acc_device_host); - assert (state == 6 - || state == 106); + assert (state == 8 + || state == 108); STATE_OP (state, ++); assert (tool_info != NULL); @@ -703,6 +865,8 @@ int main() STATE_OP (state, = 0); reg (acc_ev_device_init_start, cb_device_init_start, acc_reg); reg (acc_ev_device_init_end, cb_device_init_end, acc_reg); + reg (acc_ev_alloc, cb_alloc, acc_reg); + reg (acc_ev_free, cb_free, acc_reg); reg (acc_ev_enter_data_start, cb_enter_data_start, acc_reg); reg (acc_ev_enter_data_end, cb_enter_data_end, acc_reg); reg (acc_ev_exit_data_start, cb_exit_data_start, acc_reg); @@ -725,9 +889,9 @@ int main() state_init = state; } - assert (state_init == 4); + assert (state_init == 5); } - assert (state == 10); + assert (state == 14); STATE_OP (state, = 100); @@ -742,9 +906,9 @@ int main() } acc_async = acc_async_sync; #pragma acc wait - assert (state_init == 104); + assert (state_init == 105); } - assert (state == 110); + assert (state == 114); return 0; } -- 2.25.1 ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH] [og9] Use more appropriate var in localize_reductions call 2019-09-06 16:02 [PATCH] [og9] Add omp_pause_resource{,_all} for AMD GCN Julian Brown 2019-09-06 16:02 ` [PATCH] [og9] OpenACC profiling support " Julian Brown @ 2019-09-06 16:02 ` Julian Brown 1 sibling, 0 replies; 4+ messages in thread From: Julian Brown @ 2019-09-06 16:02 UTC (permalink / raw) To: gcc-patches; +Cc: Thomas Schwinge, Andrew Stubbs This patch uses a more appropriate local variable in the call to localize_reductions in gimplify_omp_for, for better self-documentation. Tested with offloading to AMD GCN. I will apply to the openacc-gcc-9-branch shortly. Julian ChangeLog gcc/ * gimplify.c (gimplify_omp_for): Use for_stmt in call to localize_reductions. --- gcc/ChangeLog.openacc | 5 +++++ gcc/gimplify.c | 3 ++- 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/gcc/ChangeLog.openacc b/gcc/ChangeLog.openacc index ffe19bc5809..02667a2aed4 100644 --- a/gcc/ChangeLog.openacc +++ b/gcc/ChangeLog.openacc @@ -1,3 +1,8 @@ +2019-09-06 Julian Brown <julian@codesourcery.com> + + * gimplify.c (gimplify_omp_for): Use for_stmt in call to + localize_reductions. + 2019-09-06 Julian Brown <julian@codesourcery.com> * config/nvptx/nvptx.c (omp-sese.h): Include. diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 3d869447d70..60761504a5e 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -11082,7 +11082,8 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) here, because the code to remove reductions in kernels regions cannot handle that. */ if (outer && outer->region_type == ORT_ACC_PARALLEL) - localize_reductions (OMP_FOR_CLAUSES (*expr_p), OMP_FOR_BODY (*expr_p)); + localize_reductions (OMP_FOR_CLAUSES (for_stmt), + OMP_FOR_BODY (for_stmt)); } /* Set OMP_CLAUSE_LINEAR_NO_COPYIN flag on explicit linear -- 2.22.0 ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-03-10 14:08 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-09-06 16:02 [PATCH] [og9] Add omp_pause_resource{,_all} for AMD GCN Julian Brown 2019-09-06 16:02 ` [PATCH] [og9] OpenACC profiling support " Julian Brown 2023-03-10 14:07 ` Fix OpenACC/GCN 'acc_ev_enqueue_launch_end' position (was: [PATCH] [og9] OpenACC profiling support for AMD GCN) Thomas Schwinge 2019-09-06 16:02 ` [PATCH] [og9] Use more appropriate var in localize_reductions call Julian Brown
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).