* [gomp4] useless reduction locks and other bug fixes
@ 2015-09-01 23:21 Cesar Philippidis
2015-09-01 23:23 ` Cesar Philippidis
0 siblings, 1 reply; 2+ messages in thread
From: Cesar Philippidis @ 2015-09-01 23:21 UTC (permalink / raw)
To: gcc-patches; +Cc: Nathan Sidwell
This patch teaches lower_oacc_reductions not to generate calls to
GOACC_{UN}LOCK if there aren't any reductions. That situation can happen
when there is a fake gang reduction on a private variable.
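To illustrate the reordering, here is a minimal standalone sketch of the idea, not the actual omp-low.c code. It assumes a hypothetical `struct seq` text buffer in place of a gimple_seq and a hypothetical `needs_write_back` flag array in place of the real clause checks. Reductions are collected in a side sequence first, and the lock/unlock pair is emitted only if that sequence ended up non-empty.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a gimple_seq: a text buffer of emitted calls.  */
struct seq { char buf[256]; };

/* Append STMT to S, truncating rather than overflowing the buffer.  */
static void seq_add(struct seq *s, const char *stmt)
{
    assert(s != NULL && stmt != NULL);
    strncat(s->buf, stmt, sizeof s->buf - strlen(s->buf) - 1);
}

/* Model of the reordered lowering.  NEEDS_WRITE_BACK[i] is an assumed flag
   that is nonzero when clause I produces a real reduction call.  Returns
   the number of reductions emitted into OUT.  */
static int lower_reductions(struct seq *out, const int *needs_write_back,
                            int n)
{
    struct seq red_seq = { "" };
    int num_reductions = 0;

    /* Collect the reduction calls in a side sequence first.  */
    for (int i = 0; i < n; i++) {
        if (!needs_write_back[i])
            continue;
        seq_add(&red_seq, "REDUCTION_FINI;");
        num_reductions++;
    }

    /* Only wrap the calls in LOCK/UNLOCK if there is something to protect.  */
    if (num_reductions) {
        seq_add(out, "LOCK;");
        seq_add(out, red_seq.buf);
        seq_add(out, "UNLOCK;");
    }
    return num_reductions;
}
```

With every flag zero, nothing is appended to the output at all, which corresponds to the fake gang reduction case described above.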
I also found a bug where lower_rec_input_clauses expects there to be
a data mapping for the reduction variable when there isn't, e.g. for
private/local reduction variables. And I made the nvptx backend aware of
the fact that the lhs of a call to REDUCTION_TEARDOWN may have been
optimized away for worker reductions too. I have a couple of test cases
for these bugs, but I'll include them with my upcoming auto-independent
loop patch.
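The mapping fix follows a "try the lookup, fall back to the original" pattern. A minimal standalone sketch, assuming a hypothetical `struct mapping` table with string names in place of the real decl map and trees (the helper names `maybe_lookup` and `reduction_var` are made up for this example):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry in a context's decl map.  */
struct mapping { const char *orig; const char *mapped; };

/* Return the mapped name for VAR, or NULL if VAR has no mapping.  This
   mirrors the "may fail" contract the patch relies on when it compares
   the result of maybe_lookup_decl against NULL.  */
static const char *maybe_lookup(const struct mapping *map, size_t n,
                                const char *var)
{
    assert(var != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(map[i].orig, var) == 0)
            return map[i].mapped;
    return NULL;
}

/* Private/local reduction variables may have no mapping, so fall back to
   the original variable instead of assuming the lookup succeeds.  */
static const char *reduction_var(const struct mapping *map, size_t n,
                                 const char *var)
{
    const char *mapped = maybe_lookup(map, n, var);
    return mapped ? mapped : var;
}
```

A variable with an entry in the table resolves to its mapped name, and one without an entry resolves to itself.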
This patch has been committed to gomp-4_0-branch.
Cesar
* Re: [gomp4] useless reduction locks and other bug fixes
2015-09-01 23:21 [gomp4] useless reduction locks and other bug fixes Cesar Philippidis
@ 2015-09-01 23:23 ` Cesar Philippidis
0 siblings, 0 replies; 2+ messages in thread
From: Cesar Philippidis @ 2015-09-01 23:23 UTC (permalink / raw)
To: gcc-patches; +Cc: Nathan Sidwell
[-- Attachment #1: Type: text/plain, Size: 811 bytes --]
[Attaching patch this time.]
On 09/01/2015 04:21 PM, Cesar Philippidis wrote:
> This patch teaches lower_oacc_reductions not to generate calls to
> GOACC_{UN}LOCK if there aren't any reductions. That situation can happen
> when there is a fake gang reduction on a private variable.
>
> I also found a bug where lower_rec_input_clauses expects there to be
> a data mapping for the reduction variable when there isn't, e.g. for
> private/local reduction variables. And I made the nvptx backend aware of
> the fact that the lhs of a call to REDUCTION_TEARDOWN may have been
> optimized away for worker reductions too. I have a couple of test cases
> for these bugs, but I'll include them with my upcoming auto-independent
> loop patch.
>
> This patch has been committed to gomp-4_0-branch.
>
> Cesar
>
[-- Attachment #2: reduction-useless-locks.diff --]
[-- Type: text/x-patch, Size: 4256 bytes --]
2015-09-01  Cesar Philippidis  <cesar@codesourcery.com>

	gcc/
	* config/nvptx/nvptx.c (nvptx_goacc_reduction_teardown): Allow
	lhs to be NULL for worker reductions too.
	* omp-low.c (lower_rec_input_clauses): Bail out on OpenACC reductions.
	(lower_oacc_reductions): Use maybe_lookup_decl for private reductions.
	Don't emit locks for fake private gang reductions.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 1b85892..51f2893 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4229,14 +4229,19 @@ nvptx_goacc_reduction_teardown (gimple call)
tree rid = gimple_call_arg (call, 5);
gimple_seq seq = NULL;
+ if (v == NULL)
+ {
+ gsi_remove (&gsi, true);
+ return false;
+ }
+
push_gimplify_context (true);
switch (loop_dim)
{
case GOMP_DIM_GANG:
case GOMP_DIM_VECTOR:
- if (v)
- gimplify_assign (v, local_var, &seq);
+ gimplify_assign (v, local_var, &seq);
break;
case GOMP_DIM_WORKER:
{
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index fdca880..bfef298 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -3892,7 +3892,14 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist,
new_var = var = OMP_CLAUSE_DECL (c);
if (c_kind != OMP_CLAUSE_COPYIN)
- new_var = lookup_decl (var, ctx);
+ {
+ /* Not all OpenACC reductions require new mappings. */
+ if (is_gimple_omp_oacc (ctx->stmt)
+ && (new_var = maybe_lookup_decl (var, ctx)) == NULL)
+ new_var = var;
+ else
+ new_var = lookup_decl (var, ctx);
+ }
if (c_kind == OMP_CLAUSE_SHARED || c_kind == OMP_CLAUSE_COPYIN)
{
@@ -4724,6 +4731,8 @@ lower_oacc_reductions (enum internal_fn ifn, int loop_dim, tree clauses,
tree c, tcode, gwv, rid, lid = build_int_cst (integer_type_node, oacc_lid);
int oacc_rid, i;
unsigned mask = extract_oacc_loop_mask (ctx);
+ gimple_seq red_seq = NULL;
+ int num_reductions = 0;
enum tree_code rcode;
/* Remove the outer-most level of parallelism from the loop. */
@@ -4753,14 +4762,6 @@ lower_oacc_reductions (enum internal_fn ifn, int loop_dim, tree clauses,
gimplify_and_add (call, ilist);
}
- /* Call GOACC_LOCK. */
- if (ifn == IFN_GOACC_REDUCTION_FINI && write_back)
- {
- call = build_call_expr_internal_loc (UNKNOWN_LOCATION, IFN_GOACC_LOCK,
- void_type_node, 2, dim, lid);
- gimplify_and_add (call, ilist);
- }
-
for (c = clauses, oacc_rid = 0;
c && write_back;
c = OMP_CLAUSE_CHAIN (c), oacc_rid++)
@@ -4776,7 +4777,9 @@ lower_oacc_reductions (enum internal_fn ifn, int loop_dim, tree clauses,
var = OMP_CLAUSE_REDUCTION_PRIVATE_DECL (c);
if (var == NULL_TREE)
- var = lookup_decl (orig, ctx);
+ var = maybe_lookup_decl (orig, ctx);
+ if (var == NULL_TREE)
+ var = orig;
res = build_outer_var_ref (orig, ctx);
@@ -4811,16 +4814,32 @@ lower_oacc_reductions (enum internal_fn ifn, int loop_dim, tree clauses,
call = build_call_expr_internal_loc (UNKNOWN_LOCATION, ifn,
TREE_TYPE (var), 6, ref_to_res,
var, gwv, tcode, lid, rid);
- gimplify_assign (var, call, ilist);
+ gimplify_assign (var, call, &red_seq);
+ num_reductions++;
}
- /* Call GOACC_UNLOCK. */
- if (ifn == IFN_GOACC_REDUCTION_FINI && write_back)
+ if (num_reductions)
{
- dim = build_int_cst (integer_type_node, loop_dim);
- call = build_call_expr_internal_loc (UNKNOWN_LOCATION, IFN_GOACC_UNLOCK,
- void_type_node, 2, dim, lid);
- gimplify_and_add (call, ilist);
+ /* Call GOACC_LOCK. */
+ if (ifn == IFN_GOACC_REDUCTION_FINI && write_back)
+ {
+ call = build_call_expr_internal_loc (UNKNOWN_LOCATION,
+ IFN_GOACC_LOCK, void_type_node,
+ 2, dim, lid);
+ gimplify_and_add (call, ilist);
+ }
+
+ gimple_seq_add_seq (ilist, red_seq);
+
+ /* Call GOACC_UNLOCK. */
+ if (ifn == IFN_GOACC_REDUCTION_FINI && write_back)
+ {
+ dim = build_int_cst (integer_type_node, loop_dim);
+ call = build_call_expr_internal_loc (UNKNOWN_LOCATION,
+ IFN_GOACC_UNLOCK,
+ void_type_node, 2, dim, lid);
+ gimplify_and_add (call, ilist);
+ }
}
}