[ was: Re: [RFC PATCH] Coalesce host to device transfers in libgomp ]

On 10/25/2017 01:38 PM, Jakub Jelinek wrote:
> And we don't really have the async target implemented yet for NVPTX:(,
> guess that should be the highest priority after this optimization.

Hi,

how about this approach:
1 - Move async_run from plugin-hsa.c to default_async_run
2 - Implement omp async support for nvptx
?

The first patch moves the GOMP_OFFLOAD_async_run implementation from
plugin-hsa.c to target.c, making it the default implementation if the
plugin does not define the GOMP_OFFLOAD_async_run symbol.

The second patch removes the GOMP_OFFLOAD_async_run symbol from the nvptx
plugin, activating the default implementation, and makes sure
GOMP_OFFLOAD_run can be called from a fresh thread.

I've tested this with libgomp.c/c.exp and the previously failing
target-33.c and target-34.c are now passing, and there are no regressions.

OK for trunk after complete testing (and adding function comment for
default_async_run)?

Thanks,
- Tom
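
For illustration only, the fallback described for the first patch could be
sketched roughly as below. This is not the libgomp code: the function table
(plugin_ops), the completion callback, the dispatcher (device_async_run) and
the demo helpers are all hypothetical, and the signature given here to
default_async_run is an assumption, since the real plugin hooks take
different arguments. The idea shown is just: if the plugin provides no
async_run hook, spawn a fresh thread that calls the synchronous run hook and
then reports completion.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

typedef void (*run_fn) (void *kernel_args);
typedef void (*done_fn) (void *done_arg);

/* Hypothetical plugin function table; async_run is optional (may be NULL).  */
struct plugin_ops
{
  run_fn run;
  void (*async_run) (void *kernel_args, done_fn done, void *done_arg);
};

struct async_job
{
  run_fn run;
  void *kernel_args;
  done_fn done;
  void *done_arg;
};

/* Thread body: run the kernel synchronously, then signal completion.  */
static void *
async_job_thread (void *p)
{
  struct async_job job = *(struct async_job *) p;
  free (p);
  job.run (job.kernel_args);
  job.done (job.done_arg);
  return NULL;
}

/* Assumed shape of the default: wrap the synchronous hook in a fresh,
   detached thread.  Returns 0 on success, -1 on failure.  */
static int
default_async_run (const struct plugin_ops *ops, void *kernel_args,
                   done_fn done, void *done_arg)
{
  struct async_job *job = malloc (sizeof *job);
  if (job == NULL)
    return -1;
  job->run = ops->run;
  job->kernel_args = kernel_args;
  job->done = done;
  job->done_arg = done_arg;

  pthread_t thread;
  if (pthread_create (&thread, NULL, async_job_thread, job) != 0)
    {
      free (job);
      return -1;
    }
  pthread_detach (thread);
  return 0;
}

/* Hypothetical dispatcher: prefer the plugin's hook, else use the default.  */
int
device_async_run (const struct plugin_ops *ops, void *kernel_args,
                  done_fn done, void *done_arg)
{
  assert (ops != NULL && ops->run != NULL);
  if (ops->async_run != NULL)
    {
      ops->async_run (kernel_args, done, done_arg);
      return 0;
    }
  return default_async_run (ops, kernel_args, done, done_arg);
}

/* Demo helpers (hypothetical): a "kernel" plus a completion flag.  */
struct completion
{
  pthread_mutex_t lock;
  pthread_cond_t cond;
  int finished;
};

static void
demo_kernel (void *args)
{
  *(int *) args += 41;
}

static void
demo_done (void *arg)
{
  struct completion *c = arg;
  pthread_mutex_lock (&c->lock);
  c->finished = 1;
  pthread_cond_signal (&c->cond);
  pthread_mutex_unlock (&c->lock);
}

/* Launch demo_kernel through a plugin without async_run and wait for it.  */
int
demo_offload (void)
{
  struct plugin_ops ops = { demo_kernel, NULL };
  struct completion c;
  int value = 1;

  pthread_mutex_init (&c.lock, NULL);
  pthread_cond_init (&c.cond, NULL);
  c.finished = 0;
  if (device_async_run (&ops, &value, demo_done, &c) != 0)
    return -1;
  pthread_mutex_lock (&c.lock);
  while (!c.finished)
    pthread_cond_wait (&c.cond, &c.lock);
  pthread_mutex_unlock (&c.lock);
  pthread_cond_destroy (&c.cond);
  pthread_mutex_destroy (&c.lock);
  return value;
}
```

In this sketch a plugin opts out of the default simply by filling in the
async_run slot, which loosely mirrors the "symbol not defined" check
described above. It also shows why the synchronous run hook has to be
callable from a fresh thread: the default path never calls it from the
thread that issued the request.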