public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, OpenACC] Make reduction arguments addressable
@ 2016-05-30 18:42 Chung-Lin Tang
  2016-05-30 21:43 ` Jakub Jelinek
  0 siblings, 1 reply; 11+ messages in thread
From: Chung-Lin Tang @ 2016-05-30 18:42 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jakub Jelinek, Cesar Philippidis

[-- Attachment #1: Type: text/plain, Size: 1038 bytes --]

Hi, a previous patch of Cesar's has made the middle-end omp-lowering
automatically create and insert a tofrom (i.e. present_or_copy) map for
parallel reductions.  This allowed the user to not need explicit
clauses to copy out the reduction result, but because reduction arguments
are not marked addressable, async does not work as expected,
i.e. the asynchronous copy-out results are not used in the compiler generated code.

This patch fixes this in the front-ends, I've tested this patch without
new regressions, and fixes some C++ OpenACC tests that regressed after
my last OpenACC async patch.  Is this okay for trunk?

Thanks,
Chung-Lin

2016-05-30  Chung-Lin Tang  <cltang@codesourcery.com>

	c/
	* c-typeck.c (c_finish_omp_clauses): Mark OpenACC reduction
	arguments as addressable.

	cp/
	* semantics.c (finish_omp_clauses): Mark OpenACC reduction
	arguments as addressable.

	fortran/
	* trans-openmp.c (gfc_trans_oacc_construct): Mark OpenACC reduction
	arguments as addressable.
	(gfc_trans_oacc_combined_directive): Likewise.

[-- Attachment #2: a.diff --]
[-- Type: text/plain, Size: 2055 bytes --]

Index: c/c-typeck.c
===================================================================
--- c/c-typeck.c	(revision 236845)
+++ c/c-typeck.c	(working copy)
@@ -12575,6 +12575,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_reg
 	      remove = true;
 	      break;
 	    }
+	  if (ort & C_ORT_ACC)
+	    c_mark_addressable (t);
 	  type = TREE_TYPE (t);
 	  if (TREE_CODE (t) == MEM_REF)
 	    type = TREE_TYPE (type);
Index: cp/semantics.c
===================================================================
--- cp/semantics.c	(revision 236845)
+++ cp/semantics.c	(working copy)
@@ -5827,6 +5827,8 @@ finish_omp_clauses (tree clauses, enum c_omp_regio
 		t = n;
 	      goto check_dup_generic_t;
 	    }
+	  if (ort & C_ORT_ACC)
+	    cxx_mark_addressable (t);
 	  goto check_dup_generic;
 	case OMP_CLAUSE_COPYPRIVATE:
 	  copyprivate_seen = true;
Index: fortran/trans-openmp.c
===================================================================
--- fortran/trans-openmp.c	(revision 236845)
+++ fortran/trans-openmp.c	(working copy)
@@ -2704,6 +2704,10 @@ gfc_trans_oacc_construct (gfc_code *code)
   gfc_start_block (&block);
   oacc_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses,
 					code->loc);
+  for (tree c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c))
+    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
+	&& DECL_P (OMP_CLAUSE_DECL (c)))
+      TREE_ADDRESSABLE (OMP_CLAUSE_DECL (c)) = 1;
   stmt = gfc_trans_omp_code (code->block->next, true);
   stmt = build2_loc (input_location, construct_code, void_type_node, stmt,
 		     oacc_clauses);
@@ -3501,6 +3505,10 @@ gfc_trans_oacc_combined_directive (gfc_code *code)
 	construct_clauses.lists[OMP_LIST_REDUCTION] = NULL;
       oacc_clauses = gfc_trans_omp_clauses (&block, &construct_clauses,
 					    code->loc);
+      for (tree c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c))
+	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
+	    && DECL_P (OMP_CLAUSE_DECL (c)))
+	  TREE_ADDRESSABLE (OMP_CLAUSE_DECL (c)) = 1;
     }
   if (!loop_clauses.seq)
     pblock = &block;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH, OpenACC] Make reduction arguments addressable
  2016-05-30 18:42 [PATCH, OpenACC] Make reduction arguments addressable Chung-Lin Tang
@ 2016-05-30 21:43 ` Jakub Jelinek
  2016-05-31  9:59   ` Thomas Schwinge
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Jelinek @ 2016-05-30 21:43 UTC (permalink / raw)
  To: Chung-Lin Tang; +Cc: gcc-patches, Cesar Philippidis

On Mon, May 30, 2016 at 10:38:59PM +0800, Chung-Lin Tang wrote:
> Hi, a previous patch of Cesar's has made the middle-end omp-lowering
> automatically create and insert a tofrom (i.e. present_or_copy) map for
> parallel reductions.  This allowed the user to not need explicit
> clauses to copy out the reduction result, but because reduction arguments
> are not marked addressable, async does not work as expected,
> i.e. the asynchronous copy-out results are not used in the compiler generated code.

If you need it only for async parallel/kernels? regions, can't you do that
only for those and not for others?

> This patch fixes this in the front-ends, I've tested this patch without
> new regressions, and fixes some C++ OpenACC tests that regressed after
> my last OpenACC async patch.  Is this okay for trunk?

Testcases in the testsuite or others?  If the latter, we should add them.

	Jakub

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH, OpenACC] Make reduction arguments addressable
  2016-05-30 21:43 ` Jakub Jelinek
@ 2016-05-31  9:59   ` Thomas Schwinge
  2016-05-31 11:10     ` Chung-Lin Tang
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Schwinge @ 2016-05-31  9:59 UTC (permalink / raw)
  To: Jakub Jelinek, Chung-Lin Tang, Cesar Philippidis; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 3066 bytes --]

Hi!

On Mon, 30 May 2016 18:53:41 +0200, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, May 30, 2016 at 10:38:59PM +0800, Chung-Lin Tang wrote:
> > Hi, a previous patch of Cesar's has made the middle-end omp-lowering
> > automatically create and insert a tofrom (i.e. present_or_copy) map for
> > parallel reductions.  This allowed the user to not need explicit
> > clauses to copy out the reduction result, but because reduction arguments
> > are not marked addressable, async does not work as expected,
> > i.e. the asynchronous copy-out results are not used in the compiler generated code.
> 
> If you need it only for async parallel/kernels? regions, can't you do that
> only for those and not for others?

Also, please add comments to the source code to document the need for
such special handling.

> > This patch fixes this in the front-ends, I've tested this patch without
> > new regressions, and fixes some C++ OpenACC tests that regressed after
> > my last OpenACC async patch.  Is this okay for trunk?
> 
> Testcases in the testsuite or others?  If the latter, we should add them.

The r236772 commit "[PATCH, libgomp] Rewire OpenACC async",
<http://news.gmane.org/find-root.php?message_id=%3C56FA4F69.1060101%40codesourcery.com%3E>
regressed (or, triggered/exposed the existing wrong behavior?)
libgomp.oacc-c++/template-reduction.C execution testing for nvptx
offloading.  (Please always send email about such known regressions, and
XFAIL them with your commit -- that would have saved me an hour
yesterday, when I bisected recent changes to figure out why that test
suddenly fails.)

For reference, here is a test case, a reduced C version of
libgomp/testsuite/libgomp.oacc-c++/template-reduction.C.  This test case
works (without Chung-Lin's "[PATCH, OpenACC] Make reduction arguments
addressable" patch) if I enable "POCs", which surprises me a bit, because
I thought after Cesar's recent changes, the gimplifier is doing the same
thing of adding a data clause next to the reduction clause.  Probably
it's not doing the exactly same thing, though.  Should it?  Cesar, do you
have any comments on this?  For example (just guessing), should
TREE_ADDRESSABLE be set where the gimplifier does its work, instead of in
the three front ends?

    // Reduced C version of libgomp/testsuite/libgomp.oacc-c++/template-reduction.C.
    
    const int n = 100;
    
    // Check present and async and an explicit firstprivate
    
    int
    async_sum (int c)
    {
      int s = 0;
    
    #define POCs //present_or_copy(s)
    #pragma acc parallel loop num_gangs (10) gang reduction (+:s) POCs firstprivate (c) async
      for (int i = 0; i < n; i++)
        s += i+c;
    
    #pragma acc wait
    
      return s;
    }
    
    int
    main()
    {
      int result = 0;
    
      for (int i = 0; i < n; i++)
        {
          result += i+1;
        }
    
      if (async_sum (1) != result)
        __builtin_abort ();
    
      return 0;
    }


Grüße
 Thomas

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH, OpenACC] Make reduction arguments addressable
  2016-05-31  9:59   ` Thomas Schwinge
@ 2016-05-31 11:10     ` Chung-Lin Tang
  2016-06-01 13:32       ` Chung-Lin Tang
  0 siblings, 1 reply; 11+ messages in thread
From: Chung-Lin Tang @ 2016-05-31 11:10 UTC (permalink / raw)
  To: Thomas Schwinge, Jakub Jelinek, Cesar Philippidis; +Cc: gcc-patches

On 2016/5/31 3:28 PM, Thomas Schwinge wrote:
> Hi!
> 
> On Mon, 30 May 2016 18:53:41 +0200, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Mon, May 30, 2016 at 10:38:59PM +0800, Chung-Lin Tang wrote:
>>> Hi, a previous patch of Cesar's has made the middle-end omp-lowering
>>> automatically create and insert a tofrom (i.e. present_or_copy) map for
>>> parallel reductions.  This allowed the user to not need explicit
>>> clauses to copy out the reduction result, but because reduction arguments
>>> are not marked addressable, async does not work as expected,
>>> i.e. the asynchronous copy-out results are not used in the compiler generated code.
>>
>> If you need it only for async parallel/kernels? regions, can't you do that
>> only for those and not for others?

That is achievable, but not in line with how we currently treat all other
data clause OMP_CLAUSE_MAPs, which are all marked addressable. Is this special
case handling really better here?

> Also, please add comments to the source code to document the need for
> such special handling.
> 
>>> This patch fixes this in the front-ends, I've tested this patch without
>>> new regressions, and fixes some C++ OpenACC tests that regressed after
>>> my last OpenACC async patch.  Is this okay for trunk?
>>
>> Testcases in the testsuite or others?  If the latter, we should add them.
> 
> The r236772 commit "[PATCH, libgomp] Rewire OpenACC async",
> <http://news.gmane.org/find-root.php?message_id=%3C56FA4F69.1060101%40codesourcery.com%3E>
> regressed (or, triggered/exposed the existing wrong behavior?)
> libgomp.oacc-c++/template-reduction.C execution testing for nvptx
> offloading.  (Please always send email about such known regressions, and
> XFAIL them with your commit -- that would have saved me an hour
> yesterday, when I bisected recent changes to figure out why that test
> suddenly fails.)

Sorry, Thomas. I was going to quickly send this follow-up patch, so glossed over XFAILing.

> For reference, here is a test case, a reduced C version of
> libgomp/testsuite/libgomp.oacc-c++/template-reduction.C.  This test case
> works (without Chung-Lin's "[PATCH, OpenACC] Make reduction arguments
> addressable" patch) if I enable "POCs", which surprises me a bit, because
> I thought after Cesar's recent changes, the gimplifier is doing the same
> thing of adding a data clause next to the reduction clause.  Probably
> it's not doing the exactly same thing, though.  Should it?  Cesar, do you
> have any comments on this?  For example (just guessing), should
> TREE_ADDRESSABLE be set where the gimplifier does its work, instead of in
> the three front ends?

There's really nothing wrong about Cesar's patch. The marking addressable needs
to be done earlier, or it may be too late during gimplification. I already
tried to fix this in gimplify.c before, but didn't completely work.

I'll add more testcases for this before I commit any final patches.

Thanks,
Chung-Lin

> 
>     // Reduced C version of libgomp/testsuite/libgomp.oacc-c++/template-reduction.C.
>     
>     const int n = 100;
>     
>     // Check present and async and an explicit firstprivate
>     
>     int
>     async_sum (int c)
>     {
>       int s = 0;
>     
>     #define POCs //present_or_copy(s)
>     #pragma acc parallel loop num_gangs (10) gang reduction (+:s) POCs firstprivate (c) async
>       for (int i = 0; i < n; i++)
>         s += i+c;
>     
>     #pragma acc wait
>     
>       return s;
>     }
>     
>     int
>     main()
>     {
>       int result = 0;
>     
>       for (int i = 0; i < n; i++)
>         {
>           result += i+1;
>         }
>     
>       if (async_sum (1) != result)
>         __builtin_abort ();
>     
>       return 0;
>     }
> 
> 
> Grüße
>  Thomas
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH, OpenACC] Make reduction arguments addressable
  2016-05-31 11:10     ` Chung-Lin Tang
@ 2016-06-01 13:32       ` Chung-Lin Tang
  2016-06-01 13:39         ` Jakub Jelinek
  0 siblings, 1 reply; 11+ messages in thread
From: Chung-Lin Tang @ 2016-06-01 13:32 UTC (permalink / raw)
  To: Thomas Schwinge, Jakub Jelinek, Cesar Philippidis; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1952 bytes --]

On 2016/5/31 05:51 PM, Chung-Lin Tang wrote:
> On 2016/5/31 3:28 PM, Thomas Schwinge wrote:
>> > Hi!
>> > 
>> > On Mon, 30 May 2016 18:53:41 +0200, Jakub Jelinek <jakub@redhat.com> wrote:
>>> >> On Mon, May 30, 2016 at 10:38:59PM +0800, Chung-Lin Tang wrote:
>>>> >>> Hi, a previous patch of Cesar's has made the middle-end omp-lowering
>>>> >>> automatically create and insert a tofrom (i.e. present_or_copy) map for
>>>> >>> parallel reductions.  This allowed the user to not need explicit
>>>> >>> clauses to copy out the reduction result, but because reduction arguments
>>>> >>> are not marked addressable, async does not work as expected,
>>>> >>> i.e. the asynchronous copy-out results are not used in the compiler generated code.
>>> >>
>>> >> If you need it only for async parallel/kernels? regions, can't you do that
>>> >> only for those and not for others?
> That is achievable, but not in line with how we currently treat all other
> data clause OMP_CLAUSE_MAPs, which are all marked addressable. Is this special
> case handling really better here?
> 

Hi Jakub, here's a version of the patch with the addressable marking restricted
to when there's an async clause. Tests re-ran to ensure no regressions.  Please inform
if this way seems better.  Also attached are some new testcases.

Thanks,
Chung-Lin

2016-06-01  Chung-Lin Tang  <cltang@codesourcery.com>

	c/
	* c-typeck.c (c_finish_omp_clauses): Mark OpenACC reduction
	arguments as addressable when async clause exists.

	cp/
	* semantics.c (finish_omp_clauses): Mark OpenACC reduction
	arguments as addressable when async clause exists.

	fortran/
	* trans-openmp.c (gfc_trans_oacc_construct): Mark OpenACC reduction
	arguments as addressable. when async clause exists.
	(gfc_trans_oacc_combined_directive): Likewise.

	libgomp/testsuite/
	* libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-8.c: New test.
	* libgomp/testsuite/libgomp.oacc-fortran/reduction-8.f90: New test.


[-- Attachment #2: async-reduction.patch --]
[-- Type: text/x-patch, Size: 3852 bytes --]

Index: c/c-typeck.c
===================================================================
--- c/c-typeck.c	(revision 236845)
+++ c/c-typeck.c	(working copy)
@@ -12529,6 +12529,7 @@ c_finish_omp_clauses (tree clauses, enum c_omp_reg
   tree *nowait_clause = NULL;
   bool ordered_seen = false;
   tree schedule_clause = NULL_TREE;
+  bool oacc_async = false;
 
   bitmap_obstack_initialize (NULL);
   bitmap_initialize (&generic_head, &bitmap_default_obstack);
@@ -12539,6 +12540,14 @@ c_finish_omp_clauses (tree clauses, enum c_omp_reg
   bitmap_initialize (&map_field_head, &bitmap_default_obstack);
   bitmap_initialize (&oacc_reduction_head, &bitmap_default_obstack);
 
+  if (ort & C_ORT_ACC)
+    for (c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_ASYNC)
+	{
+	  oacc_async = true;
+	  break;
+	}
+	  
   for (pc = &clauses, c = clauses; c ; c = *pc)
     {
       bool remove = false;
@@ -12575,6 +12584,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_reg
 	      remove = true;
 	      break;
 	    }
+	  if (oacc_async)
+	    c_mark_addressable (t);
 	  type = TREE_TYPE (t);
 	  if (TREE_CODE (t) == MEM_REF)
 	    type = TREE_TYPE (type);
Index: cp/semantics.c
===================================================================
--- cp/semantics.c	(revision 236845)
+++ cp/semantics.c	(working copy)
@@ -5774,6 +5774,7 @@ finish_omp_clauses (tree clauses, enum c_omp_regio
   bool branch_seen = false;
   bool copyprivate_seen = false;
   bool ordered_seen = false;
+  bool oacc_async = false;
 
   bitmap_obstack_initialize (NULL);
   bitmap_initialize (&generic_head, &bitmap_default_obstack);
@@ -5784,6 +5785,14 @@ finish_omp_clauses (tree clauses, enum c_omp_regio
   bitmap_initialize (&map_field_head, &bitmap_default_obstack);
   bitmap_initialize (&oacc_reduction_head, &bitmap_default_obstack);
 
+  if (ort & C_ORT_ACC)
+    for (c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_ASYNC)
+	{
+	  oacc_async = true;
+	  break;
+	}
+
   for (pc = &clauses, c = clauses; c ; c = *pc)
     {
       bool remove = false;
@@ -5827,6 +5836,8 @@ finish_omp_clauses (tree clauses, enum c_omp_regio
 		t = n;
 	      goto check_dup_generic_t;
 	    }
+	  if (oacc_async)
+	    cxx_mark_addressable (t);
 	  goto check_dup_generic;
 	case OMP_CLAUSE_COPYPRIVATE:
 	  copyprivate_seen = true;
Index: fortran/trans-openmp.c
===================================================================
--- fortran/trans-openmp.c	(revision 236845)
+++ fortran/trans-openmp.c	(working copy)
@@ -2704,6 +2704,15 @@ gfc_trans_oacc_construct (gfc_code *code)
   gfc_start_block (&block);
   oacc_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses,
 					code->loc);
+  for (tree c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c))
+    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_ASYNC)
+      {
+	for (c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c))
+	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
+	      && DECL_P (OMP_CLAUSE_DECL (c)))
+	    TREE_ADDRESSABLE (OMP_CLAUSE_DECL (c)) = 1;
+	break;
+      }
   stmt = gfc_trans_omp_code (code->block->next, true);
   stmt = build2_loc (input_location, construct_code, void_type_node, stmt,
 		     oacc_clauses);
@@ -3501,6 +3510,15 @@ gfc_trans_oacc_combined_directive (gfc_code *code)
 	construct_clauses.lists[OMP_LIST_REDUCTION] = NULL;
       oacc_clauses = gfc_trans_omp_clauses (&block, &construct_clauses,
 					    code->loc);
+      for (tree c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c))
+	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_ASYNC)
+	  {
+	    for (c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c))
+	      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
+		  && DECL_P (OMP_CLAUSE_DECL (c)))
+		TREE_ADDRESSABLE (OMP_CLAUSE_DECL (c)) = 1;
+	    break;
+	  }
     }
   if (!loop_clauses.seq)
     pblock = &block;

[-- Attachment #3: async-reduction-tests.patch --]
[-- Type: text/x-patch, Size: 1786 bytes --]

Index: libgomp/testsuite/libgomp.oacc-fortran/reduction-8.f90
===================================================================
--- libgomp/testsuite/libgomp.oacc-fortran/reduction-8.f90	(revision 0)
+++ libgomp/testsuite/libgomp.oacc-fortran/reduction-8.f90	(revision 0)
@@ -0,0 +1,41 @@
+! { dg-do run }
+
+program reduction
+  implicit none
+  integer, parameter    :: n = 100
+  integer               :: i, h1, h2, s1, s2, a1, a2
+
+  h1 = 0
+  h2 = 0
+  do i = 1, n
+     h1 = h1 + 1
+     h2 = h2 + 2
+  end do
+
+  s1 = 0
+  s2 = 0
+  !$acc parallel loop reduction(+:s1, s2)
+  do i = 1, n
+     s1 = s1 + 1
+     s2 = s2 + 2
+  end do
+  !$acc end parallel loop
+
+  a1 = 0
+  a2 = 0
+  !$acc parallel loop reduction(+:a1, a2) async(1)
+  do i = 1, n
+     a1 = a1 + 1
+     a2 = a2 + 2
+  end do
+  !$acc end parallel loop
+
+  if (h1 .ne. s1) call abort ()
+  if (h2 .ne. s2) call abort ()
+
+  !$acc wait(1)
+
+  if (h1 .ne. a1) call abort ()
+  if (h2 .ne. a2) call abort ()
+
+end program reduction
Index: libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-8.c
===================================================================
--- libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-8.c	(revision 0)
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-8.c	(revision 0)
@@ -0,0 +1,30 @@
+const int n = 100;
+    
+// Check async over parallel construct with reduction
+    
+int
+async_sum (int c)
+{
+  int s = 0;
+    
+#pragma acc parallel loop num_gangs (10) gang reduction (+:s) async
+  for (int i = 0; i < n; i++)
+    s += i+c;
+    
+#pragma acc wait
+  return s;
+}
+    
+int
+main()
+{
+  int result = 0;
+    
+  for (int i = 0; i < n; i++)
+    result += i+1;
+    
+  if (async_sum (1) != result)
+    __builtin_abort ();
+    
+  return 0;
+}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH, OpenACC] Make reduction arguments addressable
  2016-06-01 13:32       ` Chung-Lin Tang
@ 2016-06-01 13:39         ` Jakub Jelinek
  2016-06-02 13:55           ` Chung-Lin Tang
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Jelinek @ 2016-06-01 13:39 UTC (permalink / raw)
  To: Chung-Lin Tang; +Cc: Thomas Schwinge, Cesar Philippidis, gcc-patches

On Wed, Jun 01, 2016 at 09:32:26PM +0800, Chung-Lin Tang wrote:
> 2016-06-01  Chung-Lin Tang  <cltang@codesourcery.com>
> 
> 	c/
> 	* c-typeck.c (c_finish_omp_clauses): Mark OpenACC reduction
> 	arguments as addressable when async clause exists.
> 
> 	cp/
> 	* semantics.c (finish_omp_clauses): Mark OpenACC reduction
> 	arguments as addressable when async clause exists.

This LGTM.
> 
> 	fortran/
> 	* trans-openmp.c (gfc_trans_oacc_construct): Mark OpenACC reduction
> 	arguments as addressable. when async clause exists.
> 	(gfc_trans_oacc_combined_directive): Likewise.

> --- fortran/trans-openmp.c	(revision 236845)
> +++ fortran/trans-openmp.c	(working copy)
> @@ -2704,6 +2704,15 @@ gfc_trans_oacc_construct (gfc_code *code)
>    gfc_start_block (&block);
>    oacc_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses,
>  					code->loc);
> +  for (tree c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c))
> +    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_ASYNC)
> +      {
> +	for (c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c))
> +	  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
> +	      && DECL_P (OMP_CLAUSE_DECL (c)))
> +	    TREE_ADDRESSABLE (OMP_CLAUSE_DECL (c)) = 1;
> +	break;
> +      }
>    stmt = gfc_trans_omp_code (code->block->next, true);
>    stmt = build2_loc (input_location, construct_code, void_type_node, stmt,
>  		     oacc_clauses);
> @@ -3501,6 +3510,15 @@ gfc_trans_oacc_combined_directive (gfc_code *code)
>  	construct_clauses.lists[OMP_LIST_REDUCTION] = NULL;
>        oacc_clauses = gfc_trans_omp_clauses (&block, &construct_clauses,
>  					    code->loc);
> +      for (tree c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c))
> +	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_ASYNC)
> +	  {
> +	    for (c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c))
> +	      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
> +		  && DECL_P (OMP_CLAUSE_DECL (c)))
> +		TREE_ADDRESSABLE (OMP_CLAUSE_DECL (c)) = 1;
> +	    break;
> +	  }
>      }

These 2 look wrong to me.  1) you really don't need to walk all the clauses
to find if there is OMP_CLAUSE_ASYNC, you can just test the
async field of struct gfc_omp_clauses.  And, 2) is there any reason why you
can't just do this in gfc_trans_omp_clauses instead, when crating
OMP_CLAUSE_REDUCTION if clauses->async is set?  Or are there some cases
where on OpenACC constructs you don't want to do this?

	Jakub

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH, OpenACC] Make reduction arguments addressable
  2016-06-01 13:39         ` Jakub Jelinek
@ 2016-06-02 13:55           ` Chung-Lin Tang
  2016-06-02 14:00             ` Jakub Jelinek
  0 siblings, 1 reply; 11+ messages in thread
From: Chung-Lin Tang @ 2016-06-02 13:55 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Thomas Schwinge, Cesar Philippidis, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1461 bytes --]

On 2016/6/1 09:38 PM, Jakub Jelinek wrote:
>>  	construct_clauses.lists[OMP_LIST_REDUCTION] = NULL;
>> >        oacc_clauses = gfc_trans_omp_clauses (&block, &construct_clauses,
>> >  					    code->loc);
>> > +      for (tree c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c))
>> > +	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_ASYNC)
>> > +	  {
>> > +	    for (c = oacc_clauses; c; c = OMP_CLAUSE_CHAIN (c))
>> > +	      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
>> > +		  && DECL_P (OMP_CLAUSE_DECL (c)))
>> > +		TREE_ADDRESSABLE (OMP_CLAUSE_DECL (c)) = 1;
>> > +	    break;
>> > +	  }
>> >      }
> These 2 look wrong to me.  1) you really don't need to walk all the clauses
> to find if there is OMP_CLAUSE_ASYNC, you can just test the
> async field of struct gfc_omp_clauses.  And, 2) is there any reason why you
> can't just do this in gfc_trans_omp_clauses instead, when crating
> OMP_CLAUSE_REDUCTION if clauses->async is set?  Or are there some cases
> where on OpenACC constructs you don't want to do this?

Thanks for reminding, I didn't notice there was such a gfc_omp_clauses->async field.

Here's a much more succinct patch that does it inside gfc_trans_omp_clauses.
Again re-tested gfortran and libgomp without regressions.
Is this and the C/C++ patches (and the new testsuite cases) okay for trunk?

Thanks,
Chung-Lin

	fortran/
	* trans-openmp.c (gfc_trans_omp_clauses): Mark OpenACC reduction
	arguments as addressable when async clause exists.


[-- Attachment #2: a.diff --]
[-- Type: text/plain, Size: 715 bytes --]

Index: trans-openmp.c
===================================================================
--- trans-openmp.c	(revision 236845)
+++ trans-openmp.c	(working copy)
@@ -1748,6 +1748,12 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp
 	{
 	case OMP_LIST_REDUCTION:
 	  omp_clauses = gfc_trans_omp_reduction_list (n, omp_clauses, where);
+	  /* An OpenACC async clause indicates the need to set reduction
+	     arguments addressable, to allow asynchronous copy-out.  */
+	  if (clauses->async)
+	    for (c = omp_clauses; c; c = OMP_CLAUSE_CHAIN (c))
+	      if (DECL_P (OMP_CLAUSE_DECL (c)))
+		TREE_ADDRESSABLE (OMP_CLAUSE_DECL (c)) = 1;
 	  break;
 	case OMP_LIST_PRIVATE:
 	  clause_code = OMP_CLAUSE_PRIVATE;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH, OpenACC] Make reduction arguments addressable
  2016-06-02 13:55           ` Chung-Lin Tang
@ 2016-06-02 14:00             ` Jakub Jelinek
  2016-06-03  7:13               ` Chung-Lin Tang
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Jelinek @ 2016-06-02 14:00 UTC (permalink / raw)
  To: Chung-Lin Tang; +Cc: Thomas Schwinge, Cesar Philippidis, gcc-patches

On Thu, Jun 02, 2016 at 09:55:05PM +0800, Chung-Lin Tang wrote:
> 	fortran/
> 	* trans-openmp.c (gfc_trans_omp_clauses): Mark OpenACC reduction
> 	arguments as addressable when async clause exists.

Wouldn't it be better to pass either a bool openacc_async flag, or
whole clauses, down to gfc_trans_omp_reduction_list and handle it there
instead of walking the list after the fact?

> Index: trans-openmp.c
> ===================================================================
> --- trans-openmp.c	(revision 236845)
> +++ trans-openmp.c	(working copy)
> @@ -1748,6 +1748,12 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp
>  	{
>  	case OMP_LIST_REDUCTION:
>  	  omp_clauses = gfc_trans_omp_reduction_list (n, omp_clauses, where);
> +	  /* An OpenACC async clause indicates the need to set reduction
> +	     arguments addressable, to allow asynchronous copy-out.  */
> +	  if (clauses->async)
> +	    for (c = omp_clauses; c; c = OMP_CLAUSE_CHAIN (c))
> +	      if (DECL_P (OMP_CLAUSE_DECL (c)))
> +		TREE_ADDRESSABLE (OMP_CLAUSE_DECL (c)) = 1;
>  	  break;
>  	case OMP_LIST_PRIVATE:
>  	  clause_code = OMP_CLAUSE_PRIVATE;


	Jakub

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH, OpenACC] Make reduction arguments addressable
  2016-06-02 14:00             ` Jakub Jelinek
@ 2016-06-03  7:13               ` Chung-Lin Tang
  2016-06-03  7:15                 ` Jakub Jelinek
  2016-06-10 10:22                 ` Thomas Schwinge
  0 siblings, 2 replies; 11+ messages in thread
From: Chung-Lin Tang @ 2016-06-03  7:13 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Thomas Schwinge, Cesar Philippidis, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 617 bytes --]

On 2016/6/2 10:00 PM, Jakub Jelinek wrote:
> Wouldn't it be better to pass either a bool openacc_async flag, or
> whole clauses, down to gfc_trans_omp_reduction_list and handle it there
> instead of walking the list after the fact?

You mean this style? (patch attached)
Tested again with no regressions.

Thanks,
Chung-Lin

        * trans-openmp.c (gfc_trans_omp_reduction_list): Add mark_addressable
        bool parameter, set reduction clause DECLs as addressable when true.
        (gfc_trans_omp_clauses): Pass clauses->async to
        gfc_trans_omp_reduction_list, add comment describing OpenACC situation.


[-- Attachment #2: a.diff --]
[-- Type: text/plain, Size: 1332 bytes --]

Index: trans-openmp.c
===================================================================
--- trans-openmp.c	(revision 236845)
+++ trans-openmp.c	(working copy)
@@ -1646,7 +1646,7 @@ gfc_trans_omp_array_reduction_or_udr (tree c, gfc_
 
 static tree
 gfc_trans_omp_reduction_list (gfc_omp_namelist *namelist, tree list,
-			      locus where)
+			      locus where, bool mark_addressable)
 {
   for (; namelist != NULL; namelist = namelist->next)
     if (namelist->sym->attr.referenced)
@@ -1657,6 +1657,8 @@ gfc_trans_omp_reduction_list (gfc_omp_namelist *na
 	    tree node = build_omp_clause (where.lb->location,
 					  OMP_CLAUSE_REDUCTION);
 	    OMP_CLAUSE_DECL (node) = t;
+	    if (mark_addressable)
+	      TREE_ADDRESSABLE (t) = 1;
 	    switch (namelist->u.reduction_op)
 	      {
 	      case OMP_REDUCTION_PLUS:
@@ -1747,7 +1749,10 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp
       switch (list)
 	{
 	case OMP_LIST_REDUCTION:
-	  omp_clauses = gfc_trans_omp_reduction_list (n, omp_clauses, where);
+	  /* An OpenACC async clause indicates the need to set reduction
+	     arguments addressable, to allow asynchronous copy-out.  */
+	  omp_clauses = gfc_trans_omp_reduction_list (n, omp_clauses, where,
+						      clauses->async);
 	  break;
 	case OMP_LIST_PRIVATE:
 	  clause_code = OMP_CLAUSE_PRIVATE;

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH, OpenACC] Make reduction arguments addressable
  2016-06-03  7:13               ` Chung-Lin Tang
@ 2016-06-03  7:15                 ` Jakub Jelinek
  2016-06-10 10:22                 ` Thomas Schwinge
  1 sibling, 0 replies; 11+ messages in thread
From: Jakub Jelinek @ 2016-06-03  7:15 UTC (permalink / raw)
  To: Chung-Lin Tang; +Cc: Thomas Schwinge, Cesar Philippidis, gcc-patches

On Fri, Jun 03, 2016 at 03:13:40PM +0800, Chung-Lin Tang wrote:
> On 2016/6/2 10:00 PM, Jakub Jelinek wrote:
> > Wouldn't it be better to pass either a bool openacc_async flag, or
> > whole clauses, down to gfc_trans_omp_reduction_list and handle it there
> > instead of walking the list after the fact?
> 
> You mean this style? (patch attached)
> Tested again with no regressions.
> 
> Thanks,
> Chung-Lin
> 
>         * trans-openmp.c (gfc_trans_omp_reduction_list): Add mark_addressable
>         bool parameter, set reduction clause DECLs as addressable when true.
>         (gfc_trans_omp_clauses): Pass clauses->async to
>         gfc_trans_omp_reduction_list, add comment describing OpenACC situation.

Yep, thanks (and the C/C++ patch is ok too).

> Index: trans-openmp.c
> ===================================================================
> --- trans-openmp.c	(revision 236845)
> +++ trans-openmp.c	(working copy)
> @@ -1646,7 +1646,7 @@ gfc_trans_omp_array_reduction_or_udr (tree c, gfc_
>  
>  static tree
>  gfc_trans_omp_reduction_list (gfc_omp_namelist *namelist, tree list,
> -			      locus where)
> +			      locus where, bool mark_addressable)
>  {
>    for (; namelist != NULL; namelist = namelist->next)
>      if (namelist->sym->attr.referenced)
> @@ -1657,6 +1657,8 @@ gfc_trans_omp_reduction_list (gfc_omp_namelist *na
>  	    tree node = build_omp_clause (where.lb->location,
>  					  OMP_CLAUSE_REDUCTION);
>  	    OMP_CLAUSE_DECL (node) = t;
> +	    if (mark_addressable)
> +	      TREE_ADDRESSABLE (t) = 1;
>  	    switch (namelist->u.reduction_op)
>  	      {
>  	      case OMP_REDUCTION_PLUS:
> @@ -1747,7 +1749,10 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp
>        switch (list)
>  	{
>  	case OMP_LIST_REDUCTION:
> -	  omp_clauses = gfc_trans_omp_reduction_list (n, omp_clauses, where);
> +	  /* An OpenACC async clause indicates the need to set reduction
> +	     arguments addressable, to allow asynchronous copy-out.  */
> +	  omp_clauses = gfc_trans_omp_reduction_list (n, omp_clauses, where,
> +						      clauses->async);
>  	  break;
>  	case OMP_LIST_PRIVATE:
>  	  clause_code = OMP_CLAUSE_PRIVATE;


	Jakub

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH, OpenACC] Make reduction arguments addressable
  2016-06-03  7:13               ` Chung-Lin Tang
  2016-06-03  7:15                 ` Jakub Jelinek
@ 2016-06-10 10:22                 ` Thomas Schwinge
  1 sibling, 0 replies; 11+ messages in thread
From: Thomas Schwinge @ 2016-06-10 10:22 UTC (permalink / raw)
  To: Chung-Lin Tang, gcc-patches; +Cc: Cesar Philippidis, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 1905 bytes --]

Hi!

I'd still have preferred if that patch described (adding source code
comments) why that special handling is required, but oh well...

On Fri, 3 Jun 2016 15:13:40 +0800, Chung-Lin Tang <cltang@codesourcery.com> wrote:
> Tested again with no regressions.

Your gomp-4_0-branch commit of this patch, r237210, forgot to clean up
XFAILs; now committed to gomp-4_0-branch in r237298:

commit f26e600bdf37b2d68bd479c1e7d6a33866da5acd
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Fri Jun 10 10:12:09 2016 +0000

    Clean up libgomp.oacc-c++/template-reduction.C XFAIL
    
    	libgomp/
    	* testsuite/libgomp.oacc-c++/template-reduction.C: Remove XFAIL.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@237298 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog.gomp                                  | 4 ++++
 libgomp/testsuite/libgomp.oacc-c++/template-reduction.C | 4 ----
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git libgomp/ChangeLog.gomp libgomp/ChangeLog.gomp
index fc1decf..ccee4d1c 100644
--- libgomp/ChangeLog.gomp
+++ libgomp/ChangeLog.gomp
@@ -1,3 +1,7 @@
+2016-06-10  Thomas Schwinge  <thomas@codesourcery.com>
+
+	* testsuite/libgomp.oacc-c++/template-reduction.C: Remove XFAIL.
+
 2016-06-08  Chung-Lin Tang  <cltang@codesourcery.com>
 
 	Backport from trunk r237070:
diff --git libgomp/testsuite/libgomp.oacc-c++/template-reduction.C libgomp/testsuite/libgomp.oacc-c++/template-reduction.C
index e2a0ca0..6c85fba 100644
--- libgomp/testsuite/libgomp.oacc-c++/template-reduction.C
+++ libgomp/testsuite/libgomp.oacc-c++/template-reduction.C
@@ -1,7 +1,3 @@
-// TODO: async_sum is currently failing on nvptx.
-
-// { dg-xfail-run-if "TODO" { openacc_nvidia_accel_selected } }
-
 const int n = 100;
 
 // Check explicit template copy map


Grüße
 Thomas

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-06-10 10:22 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-30 18:42 [PATCH, OpenACC] Make reduction arguments addressable Chung-Lin Tang
2016-05-30 21:43 ` Jakub Jelinek
2016-05-31  9:59   ` Thomas Schwinge
2016-05-31 11:10     ` Chung-Lin Tang
2016-06-01 13:32       ` Chung-Lin Tang
2016-06-01 13:39         ` Jakub Jelinek
2016-06-02 13:55           ` Chung-Lin Tang
2016-06-02 14:00             ` Jakub Jelinek
2016-06-03  7:13               ` Chung-Lin Tang
2016-06-03  7:15                 ` Jakub Jelinek
2016-06-10 10:22                 ` Thomas Schwinge

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).