* [hsa] Add missing guard in OMP gridification
@ 2017-10-27 13:19 Martin Jambor
2017-10-27 13:48 ` Jakub Jelinek
From: Martin Jambor @ 2017-10-27 13:19 UTC (permalink / raw)
To: GCC Patches; +Cc: Jakub Jelinek, Pekka Jääskeläinen
Hi,
rather embarrassingly, I found out that there is a missing condition to
make sure that HSA grid size is zero when the OpenMP loop bounds
should preclude the loop from executing at all. I do not know whether
I lost it somewhere when preparing patches for trunk or whether I
forgot about it from the beginning. In any case, the patch below adds
it where it should be.
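
For illustration only, here is a plain-C model of what the guard changes.
It is a sketch, not the GCC code: the helper names are hypothetical, the
loop bounds are ordinary longs rather than trees, and the unguarded formula
is the usual OpenMP trip-count expression, which is assumed here because
the patch context shows only the final division by step.

```c
#include <stdbool.h>

/* Hypothetical model, not part of GCC.  Trip count of
     for (i = n1; i < n2; i += step)   when ascending (step > 0), or
     for (i = n1; i > n2; i += step)   when descending (step < 0),
   computed without checking that the loop runs at all.  The formula is
   an assumption; the patch shows only the division by step.  */
long
grid_size_unguarded (long n1, long n2, long step, bool ascending)
{
  long adj = ascending ? -1 : 1;
  return (step + adj + n2 - n1) / step;
}

/* Same computation with the guard the patch adds: if the loop condition
   is false on entry, the grid size is forced to zero.  This mirrors
   COND_EXPR (cond, t, 0) from the patch.  */
long
grid_size_guarded (long n1, long n2, long step, bool ascending)
{
  bool cond = ascending ? (n1 < n2) : (n1 > n2);
  return cond ? grid_size_unguarded (n1, n2, step, ascending) : 0;
}
```

For a loop such as for (i = 10; i < 0; i++) the unguarded model yields a
negative count, which would be a nonsensical grid size; the guarded version
yields zero, so no work-items are launched.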
This popped up during my libgomp testsuite runs as a consequence of
Jakub's revision 253395 after which HSAIL was apparently generated for a
few more kernels and libgomp.c/for-5.c started to fail (taking the
whole machine GPGPU subsystem with it). So there is already a testcase
for this.
My long term plan for gridification is to replace it with the approach
that our nvidia offloading uses once we have simpler (and better
supported) function pointers in HSA and/or, better yet, a full blown
GCN BE. It does not currently work well but I still try to avoid any
regressions (this one took a long time to fix because the bug started
happening when I changed some unrelated things on the APU machine and I
suspected those changes).
Bootstrapped with hsa enabled on an x86_64-linux and tested on an HSA
capable APU, OK for trunk?
Thanks,
Martin
2017-10-10 Martin Jambor <mjambor@suse.cz>
* omp-grid.c (grid_attempt_target_gridification): Also insert a
condition whether loop should be executed at all.
---
gcc/omp-grid.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/gcc/omp-grid.c b/gcc/omp-grid.c
index a7b6f60aeaf..121c96ebe39 100644
--- a/gcc/omp-grid.c
+++ b/gcc/omp-grid.c
@@ -1315,6 +1315,7 @@ grid_attempt_target_gridification (gomp_target *target,
n1 = fold_convert (itype, n1);
n2 = fold_convert (itype, n2);
+ tree cond = fold_build2 (cond_code, boolean_type_node, n1, n2);
tree step
= omp_get_for_step_from_incr (loc, gimple_omp_for_incr (inner_loop, i));
@@ -1328,6 +1329,7 @@ grid_attempt_target_gridification (gomp_target *target,
fold_build1 (NEGATE_EXPR, itype, step));
else
t = fold_build2 (TRUNC_DIV_EXPR, itype, t, step);
+ t = fold_build3 (COND_EXPR, itype, cond, t, build_zero_cst (itype));
if (grid.tiling)
{
if (cond_code == GT_EXPR)
--
2.14.2
* Re: [hsa] Add missing guard in OMP gridification
2017-10-27 13:19 [hsa] Add missing guard in OMP gridification Martin Jambor
@ 2017-10-27 13:48 ` Jakub Jelinek
From: Jakub Jelinek @ 2017-10-27 13:48 UTC (permalink / raw)
To: GCC Patches, Pekka Jääskeläinen
On Fri, Oct 27, 2017 at 03:19:05PM +0200, Martin Jambor wrote:
> 2017-10-10 Martin Jambor <mjambor@suse.cz>
>
> * omp-grid.c (grid_attempt_target_gridification): Also insert a
> condition whether loop should be executed at all.
Ok, thanks.
> --- a/gcc/omp-grid.c
> +++ b/gcc/omp-grid.c
> @@ -1315,6 +1315,7 @@ grid_attempt_target_gridification (gomp_target *target,
> n1 = fold_convert (itype, n1);
> n2 = fold_convert (itype, n2);
>
> + tree cond = fold_build2 (cond_code, boolean_type_node, n1, n2);
> tree step
> = omp_get_for_step_from_incr (loc, gimple_omp_for_incr (inner_loop, i));
>
> @@ -1328,6 +1329,7 @@ grid_attempt_target_gridification (gomp_target *target,
> fold_build1 (NEGATE_EXPR, itype, step));
> else
> t = fold_build2 (TRUNC_DIV_EXPR, itype, t, step);
> + t = fold_build3 (COND_EXPR, itype, cond, t, build_zero_cst (itype));
> if (grid.tiling)
> {
> if (cond_code == GT_EXPR)
> --
> 2.14.2
Jakub