public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PING][PR67476] Add param parloops-schedule
@ 2015-09-22  7:47 Tom de Vries
  2015-09-22 11:25 ` Bernd Schmidt
  0 siblings, 1 reply; 19+ messages in thread
From: Tom de Vries @ 2015-09-22  7:47 UTC (permalink / raw)
  To: gcc-patches

Hi,

These two patches:
- https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00938.html
- https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00940.html
add a param parloop-schedule=<static|dynamic|guided|auto|runtime>.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PING][PR67476] Add param parloops-schedule
  2015-09-22  7:47 [PING][PR67476] Add param parloops-schedule Tom de Vries
@ 2015-09-22 11:25 ` Bernd Schmidt
  2015-10-04 15:37   ` Tom de Vries
  0 siblings, 1 reply; 19+ messages in thread
From: Bernd Schmidt @ 2015-09-22 11:25 UTC (permalink / raw)
  To: Tom de Vries, gcc-patches

On 09/22/2015 09:19 AM, Tom de Vries wrote:
> These two patches:
> - https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00938.html
> - https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00940.html
> add a param parloop-schedule=<static|dynamic|guided|auto|runtime>.

The problem I have when trying to review them is that the second patch 
does quite a bit more, and there's no description of these changes in 
your mail or in the comments. Please explain what the non-obvious parts 
of the patch do.

For example - this thing is entirely unexplained:
> +  if (gimple_in_ssa_p (cfun))
> +    {
> +      gphi_iterator psi;
> +
> +      for (psi = gsi_start_phis (l3_bb); !gsi_end_p (psi); gsi_next (&psi))
> +	{
> +	  source_location locus;
> +	  gphi *nphi;
> +	  gphi *exit_phi = psi.phi ();
> +
> +	  edge l2_to_l3 = find_edge (l2_bb, l3_bb);
> +	  tree exit_res = PHI_ARG_DEF_FROM_EDGE (exit_phi, l2_to_l3);
>
> +	  basic_block latch = BRANCH_EDGE (cont_bb)->dest;
> +	  edge latch_to_l1 = find_edge (latch, l1_bb);
> +	  gphi *inner_phi = find_phi_with_arg_on_edge (exit_res, latch_to_l1);
> +
> +	  tree t = gimple_phi_result (exit_phi);
> +	  tree new_res = copy_ssa_name (t, NULL);
> +	  nphi = create_phi_node (new_res, l0_bb);
> +
> +	  edge l0_to_l1 = find_edge (l0_bb, l1_bb);
> +	  t = PHI_ARG_DEF_FROM_EDGE (inner_phi, l0_to_l1);
> +	  locus = gimple_phi_arg_location_from_edge (inner_phi, l0_to_l1);
> +	  edge entry_to_l0 = find_edge (entry_bb, l0_bb);
> +	  add_phi_arg (nphi, t, entry_to_l0, locus);
> +
> +	  edge l2_to_l0 = find_edge (l2_bb, l0_bb);
> +	  add_phi_arg (nphi, exit_res, l2_to_l0, UNKNOWN_LOCATION);
> +
> +	  add_phi_arg (inner_phi, new_res, l0_to_l1, UNKNOWN_LOCATION);
> +	};
> +    }
> +

Also, it would be good to know why having this param is important. Is it 
a tool for development? Are users expected to use it?


Bernd

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PING][PR67476] Add param parloops-schedule
  2015-09-22 11:25 ` Bernd Schmidt
@ 2015-10-04 15:37   ` Tom de Vries
  2015-10-06  9:31     ` Bernd Schmidt
  0 siblings, 1 reply; 19+ messages in thread
From: Tom de Vries @ 2015-10-04 15:37 UTC (permalink / raw)
  To: Bernd Schmidt, gcc-patches

On 22/09/15 12:58, Bernd Schmidt wrote:
> On 09/22/2015 09:19 AM, Tom de Vries wrote:
>> These two patches:
>> - https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00938.html
>> - https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00940.html
>> add a param parloop-schedule=<static|dynamic|guided|auto|runtime>.
>
> The problem I have when trying to review them is that the second patch
> does quite a bit more, and there's no description of these changes in
> your mail or in the comments. Please explain what the non-obvious parts
> of the patch do.
>

Hi Bernd,

I'll try to give a bit of context:

The omp-expand machinery is used in two contexts:
1. when omp annotations are added to the source. In that case,
    omp-expand is used in non-ssa gimple context.
2. when parloops annotates a loop with omp annotations. In that case,
    omp-expand is used in ssa gimple context.

Until recently, parloops generated only
   #pragma omp for schedule (static)
annotations, which means the loop was expanded by 
expand_omp_for_static_nochunk, which has ssa support.

Recently (r227434), I've added --param parloops-chunk-size, which means 
parloops can also generate
   #pragma omp for schedule (static, chunk-size)
annotations, which means the loop is expanded by 
expand_omp_for_static_chunk. There was some ssa support in that function 
(I'm not sure when this was added), but it was incomplete and broken, so 
the patch series for the param contained patches to fix this (r227436 - 
r227438).

Similarly, this patch adds --params 
parloops-schedule<static|dynamic|guided|auto|runtime>, which means 
parloops can also generate
  #pragma omp for schedule (dynamic)
  #pragma omp for schedule (dynamic, chunk-size)
  #pragma omp for schedule (guided)
  #pragma omp for schedule (guided, chunk-size)
  #pragma omp for schedule (auto)
  #pragma omp for schedule (runtime)
annotations. While f.i. 'schedule (auto)' is mapped on 'schedule 
(static)' and uses expand_omp_for_static_nochunk, for f.i. 'schedule 
(runtime)' we use expand_omp_for_generic. Again, there was some ssa 
support in that function, but incomplete and broken, so this patch 
contains fixes for that.

In addition, I've recently (r226427) fixed parloops such that it no 
longer invalidates the loops structure and cancels the loop tree. At the 
parloops side, that involved adding an empty latch block, in order not 
to break the LOOPS_HAVE_SIMPLE_LATCHES property. At the omp-expand side, 
that meant handling the empty latch block, as well as updating the loops 
structure. In r227435, I've applied a similar fix for 
expand_omp_for_static_chunk.

In summary, since this patch means that expand_omp_for_generic is 
triggered in ssa mode by parloops, I needed to fix these 3 things in 
expand_omp_for_generic:
- fixing and completing ssa support
- handling added empty latch block
- updating loops structure

> For example - this thing is entirely unexplained:
>> +  if (gimple_in_ssa_p (cfun))
>> +    {
>> +      gphi_iterator psi;
>> +
>> +      for (psi = gsi_start_phis (l3_bb); !gsi_end_p (psi); gsi_next
>> (&psi))
>> +    {
>> +      source_location locus;
>> +      gphi *nphi;
>> +      gphi *exit_phi = psi.phi ();
>> +
>> +      edge l2_to_l3 = find_edge (l2_bb, l3_bb);
>> +      tree exit_res = PHI_ARG_DEF_FROM_EDGE (exit_phi, l2_to_l3);
>>
>> +      basic_block latch = BRANCH_EDGE (cont_bb)->dest;
>> +      edge latch_to_l1 = find_edge (latch, l1_bb);
>> +      gphi *inner_phi = find_phi_with_arg_on_edge (exit_res,
>> latch_to_l1);
>> +
>> +      tree t = gimple_phi_result (exit_phi);
>> +      tree new_res = copy_ssa_name (t, NULL);
>> +      nphi = create_phi_node (new_res, l0_bb);
>> +
>> +      edge l0_to_l1 = find_edge (l0_bb, l1_bb);
>> +      t = PHI_ARG_DEF_FROM_EDGE (inner_phi, l0_to_l1);
>> +      locus = gimple_phi_arg_location_from_edge (inner_phi, l0_to_l1);
>> +      edge entry_to_l0 = find_edge (entry_bb, l0_bb);
>> +      add_phi_arg (nphi, t, entry_to_l0, locus);
>> +
>> +      edge l2_to_l0 = find_edge (l2_bb, l0_bb);
>> +      add_phi_arg (nphi, exit_res, l2_to_l0, UNKNOWN_LOCATION);
>> +
>> +      add_phi_arg (inner_phi, new_res, l0_to_l1, UNKNOWN_LOCATION);
>> +    };
>> +    }
>> +
>

This bit is adding missing ssa support. In expand_omp_for_generic, we 
add a loop around the loop we're expanding. Since we're in ssa, that 
means we need to:
- add phis to the outer loop that connect to the phis in the inner,
   original loop, and
- move the loop entry value of the inner phi to the loop entry value of
   the outer phi.

> Also, it would be good to know why having this param is important. Is it
> a tool for development? Are users expected to use it?

It's a tool to try out diffent omp-schedules for parloops-generated 
loops. I'd expect users and developers with an interest in 
auto-parallelization to use it.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PING][PR67476] Add param parloops-schedule
  2015-10-04 15:37   ` Tom de Vries
@ 2015-10-06  9:31     ` Bernd Schmidt
  2015-10-10 11:07       ` Tom de Vries
  0 siblings, 1 reply; 19+ messages in thread
From: Bernd Schmidt @ 2015-10-06  9:31 UTC (permalink / raw)
  To: Tom de Vries, gcc-patches

On 10/04/2015 05:36 PM, Tom de Vries wrote:
> I'll try to give a bit of context:
>
> The omp-expand machinery is used in two contexts:
> 1. when omp annotations are added to the source. In that case,
>     omp-expand is used in non-ssa gimple context.
> 2. when parloops annotates a loop with omp annotations. In that case,
>     omp-expand is used in ssa gimple context.

This much I remembered. The rest is at least useful background information.

> In addition, I've recently (r226427) fixed parloops such that it no
> longer invalidates the loops structure and cancels the loop tree. At the
> parloops side, that involved adding an empty latch block, in order not
> to break the LOOPS_HAVE_SIMPLE_LATCHES property. At the omp-expand side,
> that meant handling the empty latch block, as well as updating the loops
> structure. In r227435, I've applied a similar fix for
> expand_omp_for_static_chunk.

Ok, I see similar pieces in that commit. Probably this means we're 
looking at some changes that are independent from each other and should 
be submitted as such. Is there a way to make these functions share code?

>> For example - this thing is entirely unexplained:
[...]
> This bit is adding missing ssa support. In expand_omp_for_generic, we
> add a loop around the loop we're expanding. Since we're in ssa, that
> means we need to:
> - add phis to the outer loop that connect to the phis in the inner,
>    original loop, and
> - move the loop entry value of the inner phi to the loop entry value of
>    the outer phi.

Explanations like this should go into a comment.

Also, you're using l2_bb in that block, but as far as I can tell it 
could be NULL (if broken_loop). Is there a reason why this can't happen?


Bernd

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PING][PR67476] Add param parloops-schedule
  2015-10-06  9:31     ` Bernd Schmidt
@ 2015-10-10 11:07       ` Tom de Vries
  2015-10-10 11:25         ` [PATCH, 1/5] Handle simple latch in expand_omp_for_generic Tom de Vries
                           ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: Tom de Vries @ 2015-10-10 11:07 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

On 06/10/15 11:30, Bernd Schmidt wrote:
> On 10/04/2015 05:36 PM, Tom de Vries wrote:
>> I'll try to give a bit of context:
>>
>> The omp-expand machinery is used in two contexts:
>> 1. when omp annotations are added to the source. In that case,
>>     omp-expand is used in non-ssa gimple context.
>> 2. when parloops annotates a loop with omp annotations. In that case,
>>     omp-expand is used in ssa gimple context.
>
> This much I remembered. The rest is at least useful background information.
>
>> In addition, I've recently (r226427) fixed parloops such that it no
>> longer invalidates the loops structure and cancels the loop tree. At the
>> parloops side, that involved adding an empty latch block, in order not
>> to break the LOOPS_HAVE_SIMPLE_LATCHES property. At the omp-expand side,
>> that meant handling the empty latch block, as well as updating the loops
>> structure. In r227435, I've applied a similar fix for
>> expand_omp_for_static_chunk.
>
> Ok, I see similar pieces in that commit. Probably this means we're
> looking at some changes that are independent from each other and should
> be submitted as such.

OK, I'll repost with the patch split up, as follows:

      1	Handle simple latch in expand_omp_for_generic
      2	Add missing phis in expand_omp_for_generic
      3	Handle original loop tree in expand_omp_for_generic
      4	Support DEFPARAMENUM in params.def
      5	Add param parloops-schedule

> Is there a way to make these functions share code?
>

I don't see possibilities for that.

>>> For example - this thing is entirely unexplained:
> [...]
>> This bit is adding missing ssa support. In expand_omp_for_generic, we
>> add a loop around the loop we're expanding. Since we're in ssa, that
>> means we need to:
>> - add phis to the outer loop that connect to the phis in the inner,
>>    original loop, and
>> - move the loop entry value of the inner phi to the loop entry value of
>>    the outer phi.
>
> Explanations like this should go into a comment.
>

Done.

> Also, you're using l2_bb in that block, but as far as I can tell it
> could be NULL (if broken_loop). Is there a reason why this can't happen?
>

Yes, broken_loop is assumed false for parloops-generated loops, as we 
can see from this assert in expand_omp_for_static_chunk:
...
   if (gimple_in_ssa_p (cfun))
     {
       ...
       gcc_assert (fd->collapse == 1 && !broken_loop);
...

So I've moved the ssa-support bit into the !broken_loop condition.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH, 1/5] Handle simple latch in expand_omp_for_generic
  2015-10-10 11:07       ` Tom de Vries
@ 2015-10-10 11:25         ` Tom de Vries
  2015-10-12 14:04           ` Bernd Schmidt
  2015-10-10 11:50         ` [PATCH, 2/5] Add missing phis " Tom de Vries
                           ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Tom de Vries @ 2015-10-10 11:25 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 694 bytes --]

On 10/10/15 13:06, Tom de Vries wrote:
> OK, I'll repost with the patch split up, as follows:
>
>       1    Handle simple latch in expand_omp_for_generic
>       2    Add missing phis in expand_omp_for_generic
>       3    Handle original loop tree in expand_omp_for_generic
>       4    Support DEFPARAMENUM in params.def
>       5    Add param parloops-schedule

this patch handles simple latches in expand_omp_for_generic.

This allows us to handle loops which have the LOOPS_HAVE_SIMPLE_LATCHES 
property.

A similar fix was done:
- in r226427 for expand_omp_for_static_nochunk (for PR66846)
- in r227435 for expand_omp_for_static_chunk (for
   --param parloops-chunk-size)

Thanks,
- Tom

[-- Attachment #2: 0001-Handle-simple-latch-in-expand_omp_for_generic.patch --]
[-- Type: text/x-patch, Size: 1207 bytes --]

Handle simple latch in expand_omp_for_generic

2015-09-10  Tom de Vries  <tom@codesourcery.com>

	PR tree-optimization/67476
	* omp-low.c (expand_omp_for_generic): Handle simple latch.
---
 gcc/omp-low.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index cdcf9d6..f59a6a4 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -6162,7 +6162,9 @@ expand_omp_for_generic (struct omp_region *region,
   if (!broken_loop)
     {
       l2_bb = create_empty_bb (cont_bb);
-      gcc_assert (BRANCH_EDGE (cont_bb)->dest == l1_bb);
+      gcc_assert (BRANCH_EDGE (cont_bb)->dest == l1_bb
+		  || (single_succ_edge (BRANCH_EDGE (cont_bb)->dest)->dest
+		      == l1_bb));
       gcc_assert (EDGE_COUNT (cont_bb->succs) == 2);
     }
   else
@@ -6438,6 +6440,11 @@ expand_omp_for_generic (struct omp_region *region,
       make_edge (cont_bb, l2_bb, EDGE_FALSE_VALUE);
       add_bb_to_loop (l2_bb, cont_bb->loop_father);
       e = find_edge (cont_bb, l1_bb);
+      if (e == NULL)
+	{
+	  e = BRANCH_EDGE (cont_bb);
+	  gcc_assert (single_succ (e->dest) == l1_bb);
+	}
       if (gimple_omp_for_combined_p (fd->for_stmt))
 	{
 	  remove_edge (e);
-- 
1.9.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH, 2/5] Add missing phis in expand_omp_for_generic
  2015-10-10 11:07       ` Tom de Vries
  2015-10-10 11:25         ` [PATCH, 1/5] Handle simple latch in expand_omp_for_generic Tom de Vries
@ 2015-10-10 11:50         ` Tom de Vries
  2015-10-12 14:06           ` Bernd Schmidt
  2015-10-10 11:59         ` [PATCH, 3/5] Handle original loop tree " Tom de Vries
                           ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: Tom de Vries @ 2015-10-10 11:50 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 674 bytes --]

On 10/10/15 13:06, Tom de Vries wrote:
> OK, I'll repost with the patch split up, as follows:
>
>       1    Handle simple latch in expand_omp_for_generic
>       2    Add missing phis in expand_omp_for_generic
>       3    Handle original loop tree in expand_omp_for_generic
>       4    Support DEFPARAMENUM in params.def
>       5    Add param parloops-schedule

Hi,

this patch adds missing phis in expand_omp_for_generic.

In expand_omp_for_generic, we add an outer loop around an inner loop. 
That means we need to:
- add the necessary phis on the outer loop, and
- move the loop entry value of the inner phi to the loop entry value of
   the outer phi

Thanks,
- Tom

[-- Attachment #2: 0002-Add-missing-phis-in-expand_omp_for_generic.patch --]
[-- Type: text/x-patch, Size: 2355 bytes --]

Add missing phis in expand_omp_for_generic

2015-09-10  Tom de Vries  <tom@codesourcery.com>

	PR tree-optimization/67476
	* omp-low.c (expand_omp_for_generic): Add missing phis.
---
 gcc/omp-low.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index f59a6a4..b2a93b9 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -238,6 +238,7 @@ static vec<omp_context *> taskreg_contexts;
 
 static void scan_omp (gimple_seq *, omp_context *);
 static tree scan_omp_1_op (tree *, int *, void *);
+static gphi *find_phi_with_arg_on_edge (tree, edge);
 
 #define WALK_SUBSTMTS  \
     case GIMPLE_BIND: \
@@ -6469,6 +6470,43 @@ expand_omp_for_generic (struct omp_region *region,
 	}
       make_edge (l2_bb, l0_bb, EDGE_TRUE_VALUE);
 
+      if (gimple_in_ssa_p (cfun))
+	{
+	  /* Add phis to the outer loop that connect to the phis in the inner,
+	     original loop, and move the loop entry value of the inner phi to
+	     the loop entry value of the outer phi.  */
+	  gphi_iterator psi;
+	  for (psi = gsi_start_phis (l3_bb); !gsi_end_p (psi); gsi_next (&psi))
+	    {
+	      source_location locus;
+	      gphi *nphi;
+	      gphi *exit_phi = psi.phi ();
+
+	      edge l2_to_l3 = find_edge (l2_bb, l3_bb);
+	      tree exit_res = PHI_ARG_DEF_FROM_EDGE (exit_phi, l2_to_l3);
+
+	      basic_block latch = BRANCH_EDGE (cont_bb)->dest;
+	      edge latch_to_l1 = find_edge (latch, l1_bb);
+	      gphi *inner_phi
+		= find_phi_with_arg_on_edge (exit_res, latch_to_l1);
+
+	      tree t = gimple_phi_result (exit_phi);
+	      tree new_res = copy_ssa_name (t, NULL);
+	      nphi = create_phi_node (new_res, l0_bb);
+
+	      edge l0_to_l1 = find_edge (l0_bb, l1_bb);
+	      t = PHI_ARG_DEF_FROM_EDGE (inner_phi, l0_to_l1);
+	      locus = gimple_phi_arg_location_from_edge (inner_phi, l0_to_l1);
+	      edge entry_to_l0 = find_edge (entry_bb, l0_bb);
+	      add_phi_arg (nphi, t, entry_to_l0, locus);
+
+	      edge l2_to_l0 = find_edge (l2_bb, l0_bb);
+	      add_phi_arg (nphi, exit_res, l2_to_l0, UNKNOWN_LOCATION);
+
+	      add_phi_arg (inner_phi, new_res, l0_to_l1, UNKNOWN_LOCATION);
+	    };
+	}
+
       set_immediate_dominator (CDI_DOMINATORS, l2_bb,
 			       recompute_dominator (CDI_DOMINATORS, l2_bb));
       set_immediate_dominator (CDI_DOMINATORS, l3_bb,
-- 
1.9.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH, 3/5] Handle original loop tree in expand_omp_for_generic
  2015-10-10 11:07       ` Tom de Vries
  2015-10-10 11:25         ` [PATCH, 1/5] Handle simple latch in expand_omp_for_generic Tom de Vries
  2015-10-10 11:50         ` [PATCH, 2/5] Add missing phis " Tom de Vries
@ 2015-10-10 11:59         ` Tom de Vries
  2015-10-12 14:11           ` Bernd Schmidt
  2015-10-10 12:06         ` [PATCH, 4/5] Support DEFPARAMENUM in params.def Tom de Vries
  2015-10-10 12:10         ` [PATCH, 5/5] Add param parloops-schedule Tom de Vries
  4 siblings, 1 reply; 19+ messages in thread
From: Tom de Vries @ 2015-10-10 11:59 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 668 bytes --]

On 10/10/15 13:06, Tom de Vries wrote:
> OK, I'll repost with the patch split up, as follows:
>
>       1    Handle simple latch in expand_omp_for_generic
>       2    Add missing phis in expand_omp_for_generic
>       3    Handle original loop tree in expand_omp_for_generic
>       4    Support DEFPARAMENUM in params.def
>       5    Add param parloops-schedule

this patch handles the case that we call expand_omp_for_generic with a 
loop having a corresponding loop struct entry.

A similar fix was done:
- in r226427 for expand_omp_for_static_nochunk (for PR66846)
- in r227435 for expand_omp_for_static_chunk (for
   --param parloops-chunk-size)

Thanks,
- Tom

[-- Attachment #2: 0003-Handle-original-loop-tree-in-expand_omp_for_generic.patch --]
[-- Type: text/x-patch, Size: 1517 bytes --]

Handle original loop tree in expand_omp_for_generic

2015-09-10  Tom de Vries  <tom@codesourcery.com>

	PR tree-optimization/67476
	* omp-low.c (expand_omp_for_generic): Handle original loop tree.
---
 gcc/omp-low.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index b2a93b9..15a004e 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -6439,7 +6439,6 @@ expand_omp_for_generic (struct omp_region *region,
       remove_edge (e);
 
       make_edge (cont_bb, l2_bb, EDGE_FALSE_VALUE);
-      add_bb_to_loop (l2_bb, cont_bb->loop_father);
       e = find_edge (cont_bb, l1_bb);
       if (e == NULL)
 	{
@@ -6516,14 +6515,22 @@ expand_omp_for_generic (struct omp_region *region,
       set_immediate_dominator (CDI_DOMINATORS, l1_bb,
 			       recompute_dominator (CDI_DOMINATORS, l1_bb));
 
+      struct loop *loop = l1_bb->loop_father;
+      add_bb_to_loop (l2_bb, entry_bb->loop_father);
+
       struct loop *outer_loop = alloc_loop ();
       outer_loop->header = l0_bb;
       outer_loop->latch = l2_bb;
       add_loop (outer_loop, l0_bb->loop_father);
 
+      /* If we already have a loop struct for the inner loop, don't allocate a
+	 new one.  */
+      if (loop != entry_bb->loop_father)
+	return;
+
       if (!gimple_omp_for_combined_p (fd->for_stmt))
 	{
-	  struct loop *loop = alloc_loop ();
+	  loop = alloc_loop ();
 	  loop->header = l1_bb;
 	  /* The loop may have multiple latches.  */
 	  add_loop (loop, outer_loop);
-- 
1.9.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH, 4/5] Support DEFPARAMENUM in params.def
  2015-10-10 11:07       ` Tom de Vries
                           ` (2 preceding siblings ...)
  2015-10-10 11:59         ` [PATCH, 3/5] Handle original loop tree " Tom de Vries
@ 2015-10-10 12:06         ` Tom de Vries
  2015-10-12 14:12           ` Bernd Schmidt
  2015-10-10 12:10         ` [PATCH, 5/5] Add param parloops-schedule Tom de Vries
  4 siblings, 1 reply; 19+ messages in thread
From: Tom de Vries @ 2015-10-10 12:06 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 927 bytes --]

On 10/10/15 13:06, Tom de Vries wrote:
> OK, I'll repost with the patch split up, as follows:
>
>       1    Handle simple latch in expand_omp_for_generic
>       2    Add missing phis in expand_omp_for_generic
>       3    Handle original loop tree in expand_omp_for_generic
>       4    Support DEFPARAMENUM in params.def
>       5    Add param parloops-schedule
>

this patch adds support for DEFPARAMENUM in params.def.

Using this support, we're able to define a param parloops-schedule with 
values names "static, dynamic, guided, auto, runtime" and default value 
"static" like this in params.def:
...
DEFPARAMENUM5 (PARAM_PARLOOPS_SCHEDULE,
               "parloops-schedule",
               "Schedule type of omp schedule for loops parallelized by "
               "parloops (static, dynamic, guided, auto, runtime)",
               static,
               static, dynamic, guided, auto, runtime)
...

Thanks,
- Tom




[-- Attachment #2: 0004-Support-DEFPARAMENUM-in-params.def.patch --]
[-- Type: text/x-patch, Size: 10307 bytes --]

Support DEFPARAMENUM in params.def

2015-09-11  Tom de Vries  <tom@codesourcery.com>

	* Makefile.in (PARAMS_H, PLUGIN_HEADERS): Add params-enum.h.
	* params-enum.h: New file.
	* opts.c (handle_param): Handle case that param arg is a string.
	* params-list.h: Handle DEFPARAMENUM5 in params.def.
	* params.c (find_param): New function, factored out of ...
	(set_param_value): ... here.
	(param_string_value_p): New function.
	* params.h (struct param_info): Add value_names field.
	(find_param, param_string_value_p): Declare.
---
 gcc/Makefile.in   |  6 ++--
 gcc/opts.c        | 19 +++++++----
 gcc/params-enum.h | 39 ++++++++++++++++++++++
 gcc/params-list.h |  3 ++
 gcc/params.c      | 97 ++++++++++++++++++++++++++++++++++++++++++-------------
 gcc/params.h      |  6 ++++
 6 files changed, 138 insertions(+), 32 deletions(-)
 create mode 100644 gcc/params-enum.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 009c745..617c503 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -890,7 +890,7 @@ RTL_BASE_H = coretypes.h rtl.h rtl.def $(MACHMODE_H) reg-notes.def \
 FIXED_VALUE_H = fixed-value.h $(MACHMODE_H) double-int.h
 RTL_H = $(RTL_BASE_H) $(FLAGS_H) genrtl.h
 READ_MD_H = $(OBSTACK_H) $(HASHTAB_H) read-md.h
-PARAMS_H = params.h params.def
+PARAMS_H = params.h params-enum.h params.def
 BUILTINS_DEF = builtins.def sync-builtins.def omp-builtins.def \
 	gtm-builtins.def sanitizer.def cilkplus.def cilk-builtins.def
 INTERNAL_FN_DEF = internal-fn.def
@@ -3259,8 +3259,8 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
   tree-iterator.h $(PLUGIN_H) $(TREE_SSA_H) langhooks.h incpath.h debug.h \
   $(EXCEPT_H) tree-ssa-sccvn.h real.h output.h $(IPA_UTILS_H) \
   $(C_PRAGMA_H)  $(CPPLIB_H)  $(FUNCTION_H) \
-  cppdefault.h flags.h $(MD5_H) params.def params.h prefix.h tree-inline.h \
-  $(GIMPLE_PRETTY_PRINT_H) realmpfr.h \
+  cppdefault.h flags.h $(MD5_H) params.def params.h params-enum.h \
+  prefix.h tree-inline.h $(GIMPLE_PRETTY_PRINT_H) realmpfr.h \
   $(IPA_PROP_H) $(TARGET_H) $(RTL_H) $(TM_P_H) $(CFGLOOP_H) $(EMIT_RTL_H) \
   version.h stringpool.h gimplify.h gimple-iterator.h gimple-ssa.h \
   fold-const.h tree-cfg.h tree-into-ssa.h tree-ssanames.h print-tree.h \
diff --git a/gcc/opts.c b/gcc/opts.c
index 896875b..89b3e79 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -2128,14 +2128,21 @@ handle_param (struct gcc_options *opts, struct gcc_options *opts_set,
 	      arg);
   else
     {
-      value = integral_argument (equal + 1);
-      if (value == -1)
-	error_at (loc, "invalid --param value %qs", equal + 1);
+      *equal = '\0';
+
+      enum compiler_param index;
+      if (!find_param (arg, &index))
+	error_at (loc, "invalid --param name %qs", arg);
       else
 	{
-	  *equal = '\0';
-	  set_param_value (arg, value,
-			   opts->x_param_values, opts_set->x_param_values);
+	  if (!param_string_value_p (index, equal + 1, &value))
+	    value = integral_argument (equal + 1);
+
+	  if (value == -1)
+	    error_at (loc, "invalid --param value %qs", equal + 1);
+	  else
+	    set_param_value (arg, value,
+			     opts->x_param_values, opts_set->x_param_values);
 	}
     }
 
diff --git a/gcc/params-enum.h b/gcc/params-enum.h
new file mode 100644
index 0000000..a95e16c
--- /dev/null
+++ b/gcc/params-enum.h
@@ -0,0 +1,39 @@
+/* params-enums.h - Run-time parameter enums.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#define DEFPARAM(ENUM, OPTION, HELP, DEFAULT, MIN, MAX)
+#define DEFPARAMENUMNAME(ENUM) ENUM ## _KIND
+#define DEFPARAMENUMVAL(ENUM, V) ENUM ## _KIND_ ## V
+#define DEFPARAMENUMTERM(ENUM) ENUM ## _KIND_ ## LAST
+#define DEFPARAMENUM5(ENUM, OPTION, HELP, DEFAULT, V0, V1, V2, V3, V4)	\
+  enum DEFPARAMENUMNAME (ENUM)					\
+  {								\
+    DEFPARAMENUMVAL (ENUM, V0),					\
+    DEFPARAMENUMVAL (ENUM, V1),					\
+    DEFPARAMENUMVAL (ENUM, V2),					\
+    DEFPARAMENUMVAL (ENUM, V3),					\
+    DEFPARAMENUMVAL (ENUM, V4),					\
+    DEFPARAMENUMTERM (ENUM)					\
+  };
+#include "params.def"
+#undef DEFPARAMENUM5
+#undef DEFPARAMENUMTERM
+#undef DEFPARAMENUMVAL
+#undef DEFPARAMENUMNAME
+#undef DEFPARAM
diff --git a/gcc/params-list.h b/gcc/params-list.h
index ee33ef5..fa76856 100644
--- a/gcc/params-list.h
+++ b/gcc/params-list.h
@@ -19,5 +19,8 @@ along with GCC; see the file COPYING3.  If not see
 
 #define DEFPARAM(enumerator, option, nocmsgid, default, min, max) \
   enumerator,
+#define DEFPARAMENUM5(enumerator, option, nocmsgid, default, \
+		      v0, v1, v2, v3, v4) enumerator,
 #include "params.def"
 #undef DEFPARAM
+#undef DEFPARAMENUM5
diff --git a/gcc/params.c b/gcc/params.c
index b0bc80b..d58d81e 100644
--- a/gcc/params.c
+++ b/gcc/params.c
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "common/common-target.h"
 #include "params.h"
+#include "params-enum.h"
 #include "diagnostic-core.h"
 
 /* An array containing the compiler parameters and their current
@@ -37,12 +38,23 @@ static size_t num_compiler_params;
    default values determined.  */
 static bool params_finished;
 
+#define DEFPARAM(ENUM, OPTION, HELP, DEFAULT, MIN, MAX)
+#define DEFPARAMENUM5(ENUM, OPTION, HELP, DEFAULT, V0, V1, V2, V3, V4)	\
+  static const char *values_ ## ENUM [] = { #V0, #V1, #V2, #V3, #V4, NULL };
+#include "params.def"
+#undef DEFPARAMENUM5
+#undef DEFPARAM
+
 static const param_info lang_independent_params[] = {
 #define DEFPARAM(ENUM, OPTION, HELP, DEFAULT, MIN, MAX) \
-  { OPTION, DEFAULT, MIN, MAX, HELP },
+  { OPTION, DEFAULT, MIN, MAX, HELP, NULL },
+#define DEFPARAMENUM5(ENUM, OPTION, HELP, DEFAULT,	     \
+		      V0, V1, V2, V3, V4)		     \
+  { OPTION, (int)ENUM ## _KIND_ ## DEFAULT, 0, 4, HELP, values_ ## ENUM },
 #include "params.def"
 #undef DEFPARAM
-  { NULL, 0, 0, 0, NULL }
+#undef DEFPARAMENUM5
+  { NULL, 0, 0, 0, NULL, NULL }
 };
 
 /* Add the N PARAMS to the current list of compiler parameters.  */
@@ -114,6 +126,45 @@ set_param_value_internal (compiler_param num, int value,
     params_set[i] = true;
 }
 
+/* Return true if it can find the matching entry for NAME in the parameter
+   table, and assign the entry index to INDEX.  Return false otherwise.  */
+
+bool
+find_param (const char *name, enum compiler_param *index)
+{
+  for (size_t i = 0; i < num_compiler_params; ++i)
+    if (strcmp (compiler_params[i].option, name) == 0)
+      {
+	*index = (enum compiler_param) i;
+	return true;
+      }
+
+  return false;
+}
+
+/* Return true if param with entry index INDEX should be defined using strings.
+   If so, return the value corresponding to VALUE_NAME in *VALUE_P.  */
+
+bool
+param_string_value_p (enum compiler_param index, const char *value_name,
+		      int *value_p)
+{
+  param_info *entry = &compiler_params[(int) index];
+  if (entry->value_names == NULL)
+    return false;
+
+  *value_p = -1;
+
+  for (int i = 0; entry->value_names[i] != NULL; ++i)
+    if (strcmp (entry->value_names[i], value_name) == 0)
+      {
+	*value_p = i;
+	return true;
+      }
+
+  return true;
+}
+
 /* Set the VALUE associated with the parameter given by NAME in PARAMS
    and PARAMS_SET.  */
 
@@ -126,27 +177,27 @@ set_param_value (const char *name, int value,
   /* Make sure nobody tries to set a parameter to an invalid value.  */
   gcc_assert (value != INVALID_PARAM_VAL);
 
-  /* Scan the parameter table to find a matching entry.  */
-  for (i = 0; i < num_compiler_params; ++i)
-    if (strcmp (compiler_params[i].option, name) == 0)
-      {
-	if (value < compiler_params[i].min_value)
-	  error ("minimum value of parameter %qs is %u",
-		 compiler_params[i].option,
-		 compiler_params[i].min_value);
-	else if (compiler_params[i].max_value > compiler_params[i].min_value
-		 && value > compiler_params[i].max_value)
-	  error ("maximum value of parameter %qs is %u",
-		 compiler_params[i].option,
-		 compiler_params[i].max_value);
-	else
-	  set_param_value_internal ((compiler_param) i, value,
-				    params, params_set, true);
-	return;
-      }
-
-  /* If we didn't find this parameter, issue an error message.  */
-  error ("invalid parameter %qs", name);
+  enum compiler_param index;
+  if (!find_param (name, &index))
+    {
+      /* If we didn't find this parameter, issue an error message.  */
+      error ("invalid parameter %qs", name);
+      return;
+    }
+  i = (size_t)index;
+
+  if (value < compiler_params[i].min_value)
+    error ("minimum value of parameter %qs is %u",
+	   compiler_params[i].option,
+	   compiler_params[i].min_value);
+  else if (compiler_params[i].max_value > compiler_params[i].min_value
+	   && value > compiler_params[i].max_value)
+    error ("maximum value of parameter %qs is %u",
+	   compiler_params[i].option,
+	   compiler_params[i].max_value);
+  else
+    set_param_value_internal ((compiler_param) i, value,
+			      params, params_set, true);
 }
 
 /* Set the value of the parameter given by NUM to VALUE in PARAMS and
diff --git a/gcc/params.h b/gcc/params.h
index 9f7618a..1090d00 100644
--- a/gcc/params.h
+++ b/gcc/params.h
@@ -55,6 +55,9 @@ struct param_info
 
   /* A short description of the option.  */
   const char *const help;
+
+  /* The optional names corresponding to the values.  */
+  const char **value_names;
 };
 
 /* An array containing the compiler parameters and their current
@@ -85,6 +88,9 @@ enum compiler_param
   LAST_PARAM
 };
 
+extern bool find_param (const char *, enum compiler_param *);
+extern bool param_string_value_p (enum compiler_param, const char *, int *);
+
 /* The value of the parameter given by ENUM.  Not an lvalue.  */
 #define PARAM_VALUE(ENUM) \
   ((int) global_options.x_param_values[(int) ENUM])
-- 
1.9.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH, 5/5] Add param parloops-schedule
  2015-10-10 11:07       ` Tom de Vries
                           ` (3 preceding siblings ...)
  2015-10-10 12:06         ` [PATCH, 4/5] Support DEFPARAMENUM in params.def Tom de Vries
@ 2015-10-10 12:10         ` Tom de Vries
  4 siblings, 0 replies; 19+ messages in thread
From: Tom de Vries @ 2015-10-10 12:10 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 526 bytes --]

On 10/10/15 13:06, Tom de Vries wrote:
> OK, I'll repost with the patch split up, as follows:
>
>       1    Handle simple latch in expand_omp_for_generic
>       2    Add missing phis in expand_omp_for_generic
>       3    Handle original loop tree in expand_omp_for_generic
>       4    Support DEFPARAMENUM in params.def
>       5    Add param parloops-schedule

this patch adds a param 
parloop-schedule=<static|dynamic|guided|auto|runtime>, which sets the 
omp schedule for loops parallelized by parloops.

Thanks,
- Tom

[-- Attachment #2: 0005-Add-param-parloops-schedule.patch --]
[-- Type: text/x-patch, Size: 7518 bytes --]

Add param parloops-schedule

2015-09-10  Tom de Vries  <tom@codesourcery.com>

	PR tree-optimization/67476
	* doc/invoke.texi (@item parloops-schedule): New item.
	* params.def (PARAM_PARLOOPS_SCHEDULE): New DEFPARAMENUM5.
	* tree-parloops.c: Include params-enum.h.
	(create_parallel_loop): Handle PARAM_PARLOOPS_SCHEDULE.

	* testsuite/libgomp.c/autopar-3.c: New test.
	* testsuite/libgomp.c/autopar-4.c: New test.
	* testsuite/libgomp.c/autopar-5.c: New test.
	* testsuite/libgomp.c/autopar-6.c: New test.
	* testsuite/libgomp.c/autopar-7.c: New test.
	* testsuite/libgomp.c/autopar-8.c: New test.
---
 gcc/doc/invoke.texi                     |  4 ++++
 gcc/params.def                          | 12 ++++++++++++
 gcc/tree-parloops.c                     | 26 +++++++++++++++++++++++++-
 libgomp/testsuite/libgomp.c/autopar-3.c |  4 ++++
 libgomp/testsuite/libgomp.c/autopar-4.c |  4 ++++
 libgomp/testsuite/libgomp.c/autopar-5.c |  4 ++++
 libgomp/testsuite/libgomp.c/autopar-6.c |  4 ++++
 libgomp/testsuite/libgomp.c/autopar-7.c |  4 ++++
 libgomp/testsuite/libgomp.c/autopar-8.c |  4 ++++
 9 files changed, 65 insertions(+), 1 deletion(-)
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-3.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-4.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-5.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-6.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-7.c
 create mode 100644 libgomp/testsuite/libgomp.c/autopar-8.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1b44b44..dea29d8 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11132,6 +11132,10 @@ automaton.  The default is 50.
 Chunk size of omp schedule for loops parallelized by parloops.  The default
 is 0.
 
+@item parloops-schedule
+Schedule type of omp schedule for loops parallelized by parloops (static,
+dynamic, guided, auto, runtime).  The default is static.
+
 @item max-ssa-name-query-depth
 Maximum depth of recursion when querying properties of SSA names in things
 like fold routines.  One level of recursion corresponds to following a
diff --git a/gcc/params.def b/gcc/params.def
index 8b8b9fe..dd07301 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -35,6 +35,11 @@ along with GCC; see the file COPYING3.  If not see
      - The maximum acceptable value for the parameter (if greater than
      the minimum).
 
+   The DEFPARAMENUM<N> macro is similar, but instead of the minumum and maximum
+   arguments, it contains a list of <N> allowed strings, corresponding to
+   integer values 0..<N>-1.  Note that the default argument needs to be
+   specified as one of the allowed strings, rather than an integer value.
+
    Be sure to add an entry to invoke.texi summarizing the parameter.  */
 
 /* When branch is predicted to be taken with probability lower than this
@@ -1153,6 +1158,13 @@ DEFPARAM (PARAM_PARLOOPS_CHUNK_SIZE,
 	  "Chunk size of omp schedule for loops parallelized by parloops",
 	  0, 0, 0)
 
+DEFPARAMENUM5 (PARAM_PARLOOPS_SCHEDULE,
+	       "parloops-schedule",
+	       "Schedule type of omp schedule for loops parallelized by "
+	       "parloops (static, dynamic, guided, auto, runtime)",
+	       static,
+	       static, dynamic, guided, auto, runtime)
+
 DEFPARAM (PARAM_MAX_SSA_NAME_QUERY_DEPTH,
 	  "max-ssa-name-query-depth",
 	  "Maximum recursion depth allowed when querying a property of an"
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 741392b..62e561c 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cgraph.h"
 #include "tree-ssa.h"
 #include "params.h"
+#include "params-enum.h"
 
 /* This pass tries to distribute iterations of loops into several threads.
    The implementation is straightforward -- for each loop we test whether its
@@ -2086,8 +2087,31 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
   gimple_cond_set_lhs (cond_stmt, cvar_base);
   type = TREE_TYPE (cvar);
   t = build_omp_clause (loc, OMP_CLAUSE_SCHEDULE);
-  OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_STATIC;
   int chunk_size = PARAM_VALUE (PARAM_PARLOOPS_CHUNK_SIZE);
+  enum PARAM_PARLOOPS_SCHEDULE_KIND schedule_type \
+    = (enum PARAM_PARLOOPS_SCHEDULE_KIND) PARAM_VALUE (PARAM_PARLOOPS_SCHEDULE);
+  switch (schedule_type)
+    {
+    case PARAM_PARLOOPS_SCHEDULE_KIND_static:
+      OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_STATIC;
+      break;
+    case PARAM_PARLOOPS_SCHEDULE_KIND_dynamic:
+      OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_DYNAMIC;
+      break;
+    case PARAM_PARLOOPS_SCHEDULE_KIND_guided:
+      OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_GUIDED;
+      break;
+    case PARAM_PARLOOPS_SCHEDULE_KIND_auto:
+      OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_AUTO;
+      chunk_size = 0;
+      break;
+    case PARAM_PARLOOPS_SCHEDULE_KIND_runtime:
+      OMP_CLAUSE_SCHEDULE_KIND (t) = OMP_CLAUSE_SCHEDULE_RUNTIME;
+      chunk_size = 0;
+      break;
+    default:
+      gcc_unreachable ();
+    }
   if (chunk_size != 0)
     OMP_CLAUSE_SCHEDULE_CHUNK_EXPR (t)
       = build_int_cst (integer_type_node, chunk_size);
diff --git a/libgomp/testsuite/libgomp.c/autopar-3.c b/libgomp/testsuite/libgomp.c/autopar-3.c
new file mode 100644
index 0000000..1c25a44
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/autopar-3.c
@@ -0,0 +1,4 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=4 -ffast-math --param parloops-schedule=dynamic" } */
+
+#include "autopar-1.c"
diff --git a/libgomp/testsuite/libgomp.c/autopar-4.c b/libgomp/testsuite/libgomp.c/autopar-4.c
new file mode 100644
index 0000000..78c77d9
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/autopar-4.c
@@ -0,0 +1,4 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=4 -ffast-math --param parloops-schedule=dynamic --param parloops-chunk-size=100" } */
+
+#include "autopar-1.c"
diff --git a/libgomp/testsuite/libgomp.c/autopar-5.c b/libgomp/testsuite/libgomp.c/autopar-5.c
new file mode 100644
index 0000000..f0acb20
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/autopar-5.c
@@ -0,0 +1,4 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=4 -ffast-math --param parloops-schedule=guided" } */
+
+#include "autopar-1.c"
diff --git a/libgomp/testsuite/libgomp.c/autopar-6.c b/libgomp/testsuite/libgomp.c/autopar-6.c
new file mode 100644
index 0000000..f6e723e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/autopar-6.c
@@ -0,0 +1,4 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=4 -ffast-math --param parloops-schedule=guided --param parloops-chunk-size=100" } */
+
+#include "autopar-1.c"
diff --git a/libgomp/testsuite/libgomp.c/autopar-7.c b/libgomp/testsuite/libgomp.c/autopar-7.c
new file mode 100644
index 0000000..5f15508f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/autopar-7.c
@@ -0,0 +1,4 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=4 -ffast-math --param parloops-schedule=auto" } */
+
+#include "autopar-1.c"
diff --git a/libgomp/testsuite/libgomp.c/autopar-8.c b/libgomp/testsuite/libgomp.c/autopar-8.c
new file mode 100644
index 0000000..6099e9f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/autopar-8.c
@@ -0,0 +1,4 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-parallelize-loops=4 -ffast-math --param parloops-schedule=runtime" } */
+
+#include "autopar-1.c"
-- 
1.9.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH, 1/5] Handle simple latch in expand_omp_for_generic
  2015-10-10 11:25         ` [PATCH, 1/5] Handle simple latch in expand_omp_for_generic Tom de Vries
@ 2015-10-12 14:04           ` Bernd Schmidt
  0 siblings, 0 replies; 19+ messages in thread
From: Bernd Schmidt @ 2015-10-12 14:04 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gcc-patches

On 10/10/2015 01:24 PM, Tom de Vries wrote:
> On 10/10/15 13:06, Tom de Vries wrote:
>> OK, I'll repost with the patch split up, as follows:
>>
>>       1    Handle simple latch in expand_omp_for_generic
>>       2    Add missing phis in expand_omp_for_generic
>>       3    Handle original loop tree in expand_omp_for_generic
>>       4    Support DEFPARAMENUM in params.def
>>       5    Add param parloops-schedule
>
> this patch handles simple latches in expand_omp_for_generic.
>
> This allows us to handle loops which have the LOOPS_HAVE_SIMPLE_LATCHES
> property.
>
> A similar fix was done:
> - in r226427 for expand_omp_for_static_nochunk (for PR66846)
> - in r227435 for expand_omp_for_static_chunk (for
>    --param parloops-chunk-size)

This looks ok.


Bernd

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH, 2/5] Add missing phis in expand_omp_for_generic
  2015-10-10 11:50         ` [PATCH, 2/5] Add missing phis " Tom de Vries
@ 2015-10-12 14:06           ` Bernd Schmidt
  2015-10-12 14:12             ` Tom de Vries
  0 siblings, 1 reply; 19+ messages in thread
From: Bernd Schmidt @ 2015-10-12 14:06 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gcc-patches

On 10/10/2015 01:49 PM, Tom de Vries wrote:
> On 10/10/15 13:06, Tom de Vries wrote:
>> OK, I'll repost with the patch split up, as follows:
>>
>>       1    Handle simple latch in expand_omp_for_generic
>>       2    Add missing phis in expand_omp_for_generic
>>       3    Handle original loop tree in expand_omp_for_generic
>>       4    Support DEFPARAMENUM in params.def
>>       5    Add param parloops-schedule
>
> Hi,
>
> this patch adds missing phis in expand_omp_for_generic.
>
> In expand_omp_for_generic, we add an outer loop around an inner loop.
> That means we need to:
> - add the necessary phis on the outer loop, and
> - move the loop entry value of the inner phi to the loop entry value of
>    the outer phi

Also ok, I think. This seems to be slightly different from the one 
originally submitted?


Bernd

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH, 3/5] Handle original loop tree in expand_omp_for_generic
  2015-10-10 11:59         ` [PATCH, 3/5] Handle original loop tree " Tom de Vries
@ 2015-10-12 14:11           ` Bernd Schmidt
  2015-10-12 16:57             ` Tom de Vries
  0 siblings, 1 reply; 19+ messages in thread
From: Bernd Schmidt @ 2015-10-12 14:11 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gcc-patches

On 10/10/2015 01:58 PM, Tom de Vries wrote:

> Handle original loop tree in expand_omp_for_generic
>
> 2015-09-10  Tom de Vries<tom@codesourcery.com>
>
> 	PR tree-optimization/67476
> 	* omp-low.c (expand_omp_for_generic): Handle original loop tree.

This one I find slightly confusing.

> -      add_bb_to_loop (l2_bb, cont_bb->loop_father);
> +      struct loop *loop = l1_bb->loop_father;
> +      add_bb_to_loop (l2_bb, entry_bb->loop_father);
>         add_loop (outer_loop, l0_bb->loop_father);

Looks like a lot of bb's loop_father is being looked at. Are all or some 
of these supposed to be the same? I think I'd like one (appropriately 
named) struct loop * variable for each loop that's involved here. 
There's a comment suggesting that there can be different situations, it 
would be good to expand that to explain how they can arise.

> -	  struct loop *loop = alloc_loop ();
> +	  loop = alloc_loop ();

Also, I think it would be preferrable to not reuse that loop variable 
but make a new one instead.


Bernd

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH, 4/5] Support DEFPARAMENUM in params.def
  2015-10-10 12:06         ` [PATCH, 4/5] Support DEFPARAMENUM in params.def Tom de Vries
@ 2015-10-12 14:12           ` Bernd Schmidt
  0 siblings, 0 replies; 19+ messages in thread
From: Bernd Schmidt @ 2015-10-12 14:12 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gcc-patches

On 10/10/2015 02:05 PM, Tom de Vries wrote:
> On 10/10/15 13:06, Tom de Vries wrote:
>> OK, I'll repost with the patch split up, as follows:
>>
>>       1    Handle simple latch in expand_omp_for_generic
>>       2    Add missing phis in expand_omp_for_generic
>>       3    Handle original loop tree in expand_omp_for_generic
>>       4    Support DEFPARAMENUM in params.def
>>       5    Add param parloops-schedule
>>
>
> this patch adds support for DEFPARAMENUM in params.def.
>
> Using this support, we're able to define a param parloops-schedule with
> values names "static, dynamic, guided, auto, runtime" and default value
> "static" like this in params.def:

This and the next one are ok once all the prerequisites are in.


Bernd

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH, 2/5] Add missing phis in expand_omp_for_generic
  2015-10-12 14:06           ` Bernd Schmidt
@ 2015-10-12 14:12             ` Tom de Vries
  0 siblings, 0 replies; 19+ messages in thread
From: Tom de Vries @ 2015-10-12 14:12 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

On 12/10/15 16:05, Bernd Schmidt wrote:
> This seems to be slightly different from the one originally submitted?

Hi Bernd,

As I mentioned here  ( 
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01043.html ),  I've moved 
the ssa-support bit into the !broken_loop condition. I think that's the 
only difference.

Thanks,
- Tom

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH, 3/5] Handle original loop tree in expand_omp_for_generic
  2015-10-12 14:11           ` Bernd Schmidt
@ 2015-10-12 16:57             ` Tom de Vries
  2015-10-12 17:24               ` Bernd Schmidt
  2015-10-13 21:49               ` Thomas Schwinge
  0 siblings, 2 replies; 19+ messages in thread
From: Tom de Vries @ 2015-10-12 16:57 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1157 bytes --]

On 12/10/15 16:11, Bernd Schmidt wrote:
> On 10/10/2015 01:58 PM, Tom de Vries wrote:
>
>> Handle original loop tree in expand_omp_for_generic
>>
>> 2015-09-10  Tom de Vries<tom@codesourcery.com>
>>
>>     PR tree-optimization/67476
>>     * omp-low.c (expand_omp_for_generic): Handle original loop tree.
>
> This one I find slightly confusing.
>
>> -      add_bb_to_loop (l2_bb, cont_bb->loop_father);
>> +      struct loop *loop = l1_bb->loop_father;
>> +      add_bb_to_loop (l2_bb, entry_bb->loop_father);
>>         add_loop (outer_loop, l0_bb->loop_father);
>
> Looks like a lot of bb's loop_father is being looked at. Are all or some
> of these supposed to be the same? I think I'd like one (appropriately
> named) struct loop * variable for each loop that's involved here.

Done.

> There's a comment suggesting that there can be different situations, it
> would be good to expand that to explain how they can arise.
>
>> -      struct loop *loop = alloc_loop ();
>> +      loop = alloc_loop ();
>
> Also, I think it would be preferrable to not reuse that loop variable
> but make a new one instead.
>

Does this version look better?

Thanks,
- Tom

[-- Attachment #2: 0001-Handle-original-loop-tree-in-expand_omp_for_generic.patch --]
[-- Type: text/x-patch, Size: 2193 bytes --]

Handle original loop tree in expand_omp_for_generic

2015-09-12  Tom de Vries  <tom@codesourcery.com>

	PR tree-optimization/67476
	* omp-low.c (expand_omp_for_generic): Handle original loop tree.
---
 gcc/omp-low.c | 30 +++++++++++++++++++++---------
 1 file changed, 21 insertions(+), 9 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index b2a93b9..b957428 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -6439,7 +6439,6 @@ expand_omp_for_generic (struct omp_region *region,
       remove_edge (e);
 
       make_edge (cont_bb, l2_bb, EDGE_FALSE_VALUE);
-      add_bb_to_loop (l2_bb, cont_bb->loop_father);
       e = find_edge (cont_bb, l1_bb);
       if (e == NULL)
 	{
@@ -6516,17 +6515,30 @@ expand_omp_for_generic (struct omp_region *region,
       set_immediate_dominator (CDI_DOMINATORS, l1_bb,
 			       recompute_dominator (CDI_DOMINATORS, l1_bb));
 
-      struct loop *outer_loop = alloc_loop ();
-      outer_loop->header = l0_bb;
-      outer_loop->latch = l2_bb;
-      add_loop (outer_loop, l0_bb->loop_father);
+      /* We enter expand_omp_for_generic with a loop.  This original loop may
+	 have its own loop struct, or it may be part of an outer loop struct
+	 (which may be the fake loop).  */
+      struct loop *outer_loop = entry_bb->loop_father;
+      bool orig_loop_has_loop_struct = l1_bb->loop_father != outer_loop;
 
-      if (!gimple_omp_for_combined_p (fd->for_stmt))
+      add_bb_to_loop (l2_bb, outer_loop);
+
+      /* We've added a new loop around the original loop.  Allocate the
+	 corresponding loop struct.  */
+      struct loop *new_loop = alloc_loop ();
+      new_loop->header = l0_bb;
+      new_loop->latch = l2_bb;
+      add_loop (new_loop, outer_loop);
+
+      if (/* If we already have a loop struct for the original loop, don't
+	     allocate a new one.  */
+	  !orig_loop_has_loop_struct
+	  && !gimple_omp_for_combined_p (fd->for_stmt))
 	{
-	  struct loop *loop = alloc_loop ();
-	  loop->header = l1_bb;
+	  struct loop *orig_loop = alloc_loop ();
+	  orig_loop->header = l1_bb;
 	  /* The loop may have multiple latches.  */
-	  add_loop (loop, outer_loop);
+	  add_loop (orig_loop, new_loop);
 	}
     }
 }
-- 
1.9.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH, 3/5] Handle original loop tree in expand_omp_for_generic
  2015-10-12 16:57             ` Tom de Vries
@ 2015-10-12 17:24               ` Bernd Schmidt
  2015-10-13 21:49               ` Thomas Schwinge
  1 sibling, 0 replies; 19+ messages in thread
From: Bernd Schmidt @ 2015-10-12 17:24 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gcc-patches

> Does this version look better?

In terms of clarity, yes. Only one thing:

> +      if (/* If we already have a loop struct for the original loop, don't
> +	     allocate a new one.  */
> +	  !orig_loop_has_loop_struct

Don't really like the formatting with this comment. I'd pull it in front 
of the if statement, and change it to
  /* Allocate a loop structure for the original loop unless we already
     had one.  */

Ok with that change.


Bernd

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH, 3/5] Handle original loop tree in expand_omp_for_generic
  2015-10-12 16:57             ` Tom de Vries
  2015-10-12 17:24               ` Bernd Schmidt
@ 2015-10-13 21:49               ` Thomas Schwinge
  2015-10-14 14:54                 ` [gomp4, committed] Backported param parloops-schedule Tom de Vries
  1 sibling, 1 reply; 19+ messages in thread
From: Thomas Schwinge @ 2015-10-13 21:49 UTC (permalink / raw)
  To: Tom de Vries; +Cc: gcc-patches, Chung-Lin Tang

[-- Attachment #1: Type: text/plain, Size: 13415 bytes --]

Hi Tom!

On Mon, 12 Oct 2015 18:56:29 +0200, Tom de Vries <Tom_deVries@mentor.com> wrote:
> Handle original loop tree in expand_omp_for_generic
> 
> 2015-09-12  Tom de Vries  <tom@codesourcery.com>
> 
> 	PR tree-optimization/67476
> 	* omp-low.c (expand_omp_for_generic): Handle original loop tree.

Working on a merge from trunk into gomp-4_0-branch, I'm seeing your
change (trunk r228754) conflict with code Chung-Lin changed
(gomp-4_0-branch r224505).  So, would you two please cherry-pick/merge
trunk r228754 into gomp-4_0-branch?  Thanks!  (I'm assuming you can
easily tell what needs to be done here; it's been a long time that
Chung-Lin touched this code, so CCing him just in case.)  Thanks!


Chung-Lin's gomp-4_0-branch r224505:

commit 5f9849b7f0723d06fcd18a18e0880d4df75da92a
Author: cltang <cltang@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Tue Jun 16 08:59:01 2015 +0000

    2015-06-16  Chung-Lin Tang  <cltang@codesourcery.com>
    
    	* omp-low.c (struct omp_region): Add inside_kernels_p field.
    	(expand_omp_for_generic): Adjust to generate a 'sequential' loop
    	when GOMP builtin arguments are BUILT_IN_NONE.
    	(expand_omp_for): Use expand_omp_for_generic() to generate a
    	non-parallelized loop for OMP_FORs inside OpenACC kernels regions.
    	(expand_omp): Mark inside_kernels_p field true for regions
    	nested inside OpenACC kernels constructs.
    
    
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@224505 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index be09b0f..6fa08da 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,13 @@
+2015-06-16  Chung-Lin Tang  <cltang@codesourcery.com>
+
+	* omp-low.c (struct omp_region): Add inside_kernels_p field.
+	(expand_omp_for_generic): Adjust to generate a 'sequential' loop
+	when GOMP builtin arguments are BUILT_IN_NONE.
+	(expand_omp_for): Use expand_omp_for_generic() to generate a
+	non-parallelized loop for OMP_FORs inside OpenACC kernels regions.
+	(expand_omp): Mark inside_kernels_p field true for regions
+	nested inside OpenACC kernels constructs.
+
 2015-06-15  Cesar Philippidis  <cesar@codesourcery.com>
 
 	* omp-low.c (expand_omp_for_static_nochunk): Update entry_bb after
diff --git gcc/omp-low.c gcc/omp-low.c
index c7451c9..a3dab12 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -161,6 +161,9 @@ struct omp_region
   /* True if this is a combined parallel+workshare region.  */
   bool is_combined_parallel;
 
+  /* True if this is nested inside an OpenACC kernels construct.  */
+  bool inside_kernels_p;
+
   /* For an OpenACC loop, the level of parallelism requested.  */
   int gwv_this;
 
@@ -6862,6 +6865,7 @@ expand_omp_for_generic (struct omp_region *region,
   gassign *assign_stmt;
   bool in_combined_parallel = is_combined_parallel (region);
   bool broken_loop = region->cont == NULL;
+  bool seq_loop = (!start_fn || !next_fn);
   edge e, ne;
   tree *counts = NULL;
   int i;
@@ -6949,7 +6953,20 @@ expand_omp_for_generic (struct omp_region *region,
 							    zero_iter_bb));
 	}
     }
-  if (in_combined_parallel)
+  if (seq_loop)
+    {
+      tree n1 = fold_convert (fd->iter_type, fd->loop.n1);
+      tree n2 = fold_convert (fd->iter_type, fd->loop.n2);
+
+      assign_stmt = gimple_build_assign (istart0, n1);
+      gsi_insert_before (&gsi, assign_stmt, GSI_SAME_STMT);
+
+      assign_stmt = gimple_build_assign (iend0, n2);
+      gsi_insert_before (&gsi, assign_stmt, GSI_SAME_STMT);
+
+      t = fold_build2 (NE_EXPR, boolean_type_node, istart0, iend0);
+    }
+  else if (in_combined_parallel)
     {
       /* In a combined parallel loop, emit a call to
 	 GOMP_loop_foo_next.  */
@@ -7135,32 +7152,38 @@ expand_omp_for_generic (struct omp_region *region,
 	collapse_bb = extract_omp_for_update_vars (fd, cont_bb, l1_bb);
 
       /* Emit code to get the next parallel iteration in L2_BB.  */
-      gsi = gsi_start_bb (l2_bb);
+      if (!seq_loop)
+	{
+	  gsi = gsi_start_bb (l2_bb);
 
-      t = build_call_expr (builtin_decl_explicit (next_fn), 2,
-			   build_fold_addr_expr (istart0),
-			   build_fold_addr_expr (iend0));
-      t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
-				    false, GSI_CONTINUE_LINKING);
-      if (TREE_TYPE (t) != boolean_type_node)
-	t = fold_build2 (NE_EXPR, boolean_type_node,
-			 t, build_int_cst (TREE_TYPE (t), 0));
-      gcond *cond_stmt = gimple_build_cond_empty (t);
-      gsi_insert_after (&gsi, cond_stmt, GSI_CONTINUE_LINKING);
+	  t = build_call_expr (builtin_decl_explicit (next_fn), 2,
+			       build_fold_addr_expr (istart0),
+			       build_fold_addr_expr (iend0));
+	  t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
+					false, GSI_CONTINUE_LINKING);
+	  if (TREE_TYPE (t) != boolean_type_node)
+	    t = fold_build2 (NE_EXPR, boolean_type_node,
+			     t, build_int_cst (TREE_TYPE (t), 0));
+	  gcond *cond_stmt = gimple_build_cond_empty (t);
+	  gsi_insert_after (&gsi, cond_stmt, GSI_CONTINUE_LINKING);
+	}
     }
 
   /* Add the loop cleanup function.  */
   gsi = gsi_last_bb (exit_bb);
-  if (gimple_omp_return_nowait_p (gsi_stmt (gsi)))
-    t = builtin_decl_explicit (BUILT_IN_GOMP_LOOP_END_NOWAIT);
-  else if (gimple_omp_return_lhs (gsi_stmt (gsi)))
-    t = builtin_decl_explicit (BUILT_IN_GOMP_LOOP_END_CANCEL);
-  else
-    t = builtin_decl_explicit (BUILT_IN_GOMP_LOOP_END);
-  gcall *call_stmt = gimple_build_call (t, 0);
-  if (gimple_omp_return_lhs (gsi_stmt (gsi)))
-    gimple_call_set_lhs (call_stmt, gimple_omp_return_lhs (gsi_stmt (gsi)));
-  gsi_insert_after (&gsi, call_stmt, GSI_SAME_STMT);
+  if (!seq_loop)
+    {
+      if (gimple_omp_return_nowait_p (gsi_stmt (gsi)))
+	t = builtin_decl_explicit (BUILT_IN_GOMP_LOOP_END_NOWAIT);
+      else if (gimple_omp_return_lhs (gsi_stmt (gsi)))
+	t = builtin_decl_explicit (BUILT_IN_GOMP_LOOP_END_CANCEL);
+      else
+	t = builtin_decl_explicit (BUILT_IN_GOMP_LOOP_END);
+      gcall *call_stmt = gimple_build_call (t, 0);
+      if (gimple_omp_return_lhs (gsi_stmt (gsi)))
+	gimple_call_set_lhs (call_stmt, gimple_omp_return_lhs (gsi_stmt (gsi)));
+      gsi_insert_after (&gsi, call_stmt, GSI_SAME_STMT);
+    }
   gsi_remove (&gsi, true);
 
   /* Connect the new blocks.  */
@@ -7172,7 +7195,7 @@ expand_omp_for_generic (struct omp_region *region,
       gimple_seq phis;
 
       e = find_edge (cont_bb, l3_bb);
-      ne = make_edge (l2_bb, l3_bb, EDGE_FALSE_VALUE);
+      ne = make_edge (l2_bb, l3_bb, seq_loop ? EDGE_FALLTHRU : EDGE_FALSE_VALUE);
 
       phis = phi_nodes (l3_bb);
       for (gsi = gsi_start (phis); !gsi_end_p (gsi); gsi_next (&gsi))
@@ -7208,7 +7231,8 @@ expand_omp_for_generic (struct omp_region *region,
 	  e = find_edge (cont_bb, l2_bb);
 	  e->flags = EDGE_FALLTHRU;
 	}
-      make_edge (l2_bb, l0_bb, EDGE_TRUE_VALUE);
+      if (!seq_loop)
+	make_edge (l2_bb, l0_bb, EDGE_TRUE_VALUE);
 
       set_immediate_dominator (CDI_DOMINATORS, l2_bb,
 			       recompute_dominator (CDI_DOMINATORS, l2_bb));
@@ -7219,10 +7243,16 @@ expand_omp_for_generic (struct omp_region *region,
       set_immediate_dominator (CDI_DOMINATORS, l1_bb,
 			       recompute_dominator (CDI_DOMINATORS, l1_bb));
 
-      struct loop *outer_loop = alloc_loop ();
-      outer_loop->header = l0_bb;
-      outer_loop->latch = l2_bb;
-      add_loop (outer_loop, l0_bb->loop_father);
+      struct loop *outer_loop;
+      if (seq_loop)
+	outer_loop = l0_bb->loop_father;
+      else
+	{
+	  outer_loop = alloc_loop ();
+	  outer_loop->header = l0_bb;
+	  outer_loop->latch = l2_bb;
+	  add_loop (outer_loop, l0_bb->loop_father);
+	}
 
       if (!gimple_omp_for_combined_p (fd->for_stmt))
 	{
@@ -8704,7 +8734,10 @@ expand_omp_for (struct omp_region *region, gimple inner_stmt)
        original loops from being detected.  Fix that up.  */
     loops_state_set (LOOPS_NEED_FIXUP);
 
-  if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_SIMD)
+  if (region->inside_kernels_p)
+    expand_omp_for_generic (region, &fd, BUILT_IN_NONE, BUILT_IN_NONE,
+			    inner_stmt);
+  else if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_SIMD)
     expand_omp_simd (region, &fd);
   else if (gimple_omp_for_kind (fd.for_stmt) == GF_OMP_FOR_KIND_CILKFOR)
     expand_cilk_for (region, &fd);
@@ -10296,6 +10329,14 @@ expand_omp (struct omp_region *region)
       if (region->type == GIMPLE_OMP_PARALLEL)
 	determine_parallel_type (region);
 
+      if (region->type == GIMPLE_OMP_TARGET && region->inner)
+	{
+	  gomp_target *entry = as_a <gomp_target *> (last_stmt (region->entry));
+	  if (region->inside_kernels_p
+	      || gimple_omp_target_kind (entry) == GF_OMP_TARGET_KIND_OACC_KERNELS)
+	    region->inner->inside_kernels_p = true;
+	}
+
       if (region->type == GIMPLE_OMP_FOR
 	  && gimple_omp_for_combined_p (last_stmt (region->entry)))
 	inner_stmt = last_stmt (region->inner->entry);


Tom's trunk r228754:

commit 1c6a437bd44020c37452b7fb4f565f7e7f94d56b
Author: vries <vries@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Tue Oct 13 10:08:40 2015 +0000

    Handle original loop tree in expand_omp_for_generic
    
    2015-10-13  Tom de Vries  <tom@codesourcery.com>
    
    	PR tree-optimization/67476
    	* omp-low.c (expand_omp_for_generic): Handle original loop tree.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@228754 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git gcc/ChangeLog gcc/ChangeLog
index e5ede0b..4632387 100644
--- gcc/ChangeLog
+++ gcc/ChangeLog
@@ -1,3 +1,8 @@
+2015-10-13  Tom de Vries  <tom@codesourcery.com>
+
+	PR tree-optimization/67476
+	* omp-low.c (expand_omp_for_generic): Handle original loop tree.
+
 2015-10-13  Richard Biener  <rguenther@suse.de>
 
 	* tree-vect-data-refs.c (vect_analyze_data_ref_dependences): Allocate
diff --git gcc/omp-low.c gcc/omp-low.c
index b2a93b9..7e894e4 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -6439,7 +6439,6 @@ expand_omp_for_generic (struct omp_region *region,
       remove_edge (e);
 
       make_edge (cont_bb, l2_bb, EDGE_FALSE_VALUE);
-      add_bb_to_loop (l2_bb, cont_bb->loop_father);
       e = find_edge (cont_bb, l1_bb);
       if (e == NULL)
 	{
@@ -6516,17 +6515,30 @@ expand_omp_for_generic (struct omp_region *region,
       set_immediate_dominator (CDI_DOMINATORS, l1_bb,
 			       recompute_dominator (CDI_DOMINATORS, l1_bb));
 
-      struct loop *outer_loop = alloc_loop ();
-      outer_loop->header = l0_bb;
-      outer_loop->latch = l2_bb;
-      add_loop (outer_loop, l0_bb->loop_father);
+      /* We enter expand_omp_for_generic with a loop.  This original loop may
+	 have its own loop struct, or it may be part of an outer loop struct
+	 (which may be the fake loop).  */
+      struct loop *outer_loop = entry_bb->loop_father;
+      bool orig_loop_has_loop_struct = l1_bb->loop_father != outer_loop;
 
-      if (!gimple_omp_for_combined_p (fd->for_stmt))
+      add_bb_to_loop (l2_bb, outer_loop);
+
+      /* We've added a new loop around the original loop.  Allocate the
+	 corresponding loop struct.  */
+      struct loop *new_loop = alloc_loop ();
+      new_loop->header = l0_bb;
+      new_loop->latch = l2_bb;
+      add_loop (new_loop, outer_loop);
+
+      /* Allocate a loop structure for the original loop unless we already
+	 had one.  */
+      if (!orig_loop_has_loop_struct
+	  && !gimple_omp_for_combined_p (fd->for_stmt))
 	{
-	  struct loop *loop = alloc_loop ();
-	  loop->header = l1_bb;
+	  struct loop *orig_loop = alloc_loop ();
+	  orig_loop->header = l1_bb;
 	  /* The loop may have multiple latches.  */
-	  add_loop (loop, outer_loop);
+	  add_loop (orig_loop, new_loop);
 	}
     }
 }


The merge conflict looks as follows:

      set_immediate_dominator (CDI_DOMINATORS, l1_bb,
			       recompute_dominator (CDI_DOMINATORS, l1_bb));

<<<<<<< HEAD
      struct loop *outer_loop;
      if (seq_loop)
	outer_loop = l0_bb->loop_father;
      else
	{
	  outer_loop = alloc_loop ();
	  outer_loop->header = l0_bb;
	  outer_loop->latch = l2_bb;
	  add_loop (outer_loop, l0_bb->loop_father);
	}
=======
      /* We enter expand_omp_for_generic with a loop.  This original loop may
	 have its own loop struct, or it may be part of an outer loop struct
	 (which may be the fake loop).  */
      struct loop *outer_loop = entry_bb->loop_father;
      bool orig_loop_has_loop_struct = l1_bb->loop_father != outer_loop;
>>>>>>> e2c514f0507fb1864c4eed5d691e47156be57b5b

      add_bb_to_loop (l2_bb, outer_loop);

      /* We've added a new loop around the original loop.  Allocate the
	 corresponding loop struct.  */
      struct loop *new_loop = alloc_loop ();
      new_loop->header = l0_bb;
      new_loop->latch = l2_bb;
      add_loop (new_loop, outer_loop);

      /* Allocate a loop structure for the original loop unless we already
	 had one.  */
      if (!orig_loop_has_loop_struct
	  && !gimple_omp_for_combined_p (fd->for_stmt))
	{
	  struct loop *orig_loop = alloc_loop ();
	  orig_loop->header = l1_bb;
	  /* The loop may have multiple latches.  */
	  add_loop (orig_loop, new_loop);
	}
    }
}


Grüße,
 Thomas

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [gomp4, committed] Backported param parloops-schedule
  2015-10-13 21:49               ` Thomas Schwinge
@ 2015-10-14 14:54                 ` Tom de Vries
  0 siblings, 0 replies; 19+ messages in thread
From: Tom de Vries @ 2015-10-14 14:54 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: gcc-patches, Chung-Lin Tang

[-- Attachment #1: Type: text/plain, Size: 1280 bytes --]

[ was: Re: [PATCH, 3/5] Handle original loop tree in 
expand_omp_for_generic ]

On 13/10/15 23:48, Thomas Schwinge wrote:
> Hi Tom!
>
> On Mon, 12 Oct 2015 18:56:29 +0200, Tom de Vries<Tom_deVries@mentor.com>  wrote:
>> >Handle original loop tree in expand_omp_for_generic
>> >
>> >2015-09-12  Tom de Vries<tom@codesourcery.com>
>> >
>> >	PR tree-optimization/67476
>> >	* omp-low.c (expand_omp_for_generic): Handle original loop tree.
> Working on a merge from trunk into gomp-4_0-branch, I'm seeing your
> change (trunk r228754) conflict with code Chung-Lin changed
> (gomp-4_0-branch r224505).  So, would you two please cherry-pick/merge
> trunk r228754 into gomp-4_0-branch?  Thanks!  (I'm assuming you can
> easily tell what needs to be done here; it's been a long time that
> Chung-Lin touched this code, so CCing him just in case.)  Thanks!

Hi Thomas,

I've backport the whole patch series:
      1    Handle simple latch in expand_omp_for_generic
      2    Add missing phis in expand_omp_for_generic
      3    Handle original loop tree in expand_omp_for_generic
      4    Support DEFPARAMENUM in params.def
      5    Add param parloops-schedule
and committed to gomp-4_0-branch.

I'm only posting patch nr. 3, the only one with a non-trivial conflict.

Thanks,
- Tom

[-- Attachment #2: 0003-Handle-original-loop-tree-in-expand_omp_for_generic.patch --]
[-- Type: text/x-patch, Size: 2429 bytes --]

Handle original loop tree in expand_omp_for_generic

2015-10-14  Tom de Vries  <tom@codesourcery.com>

	backport from trunk:
	2015-10-13  Tom de Vries  <tom@codesourcery.com>

	PR tree-optimization/67476
	* omp-low.c (expand_omp_for_generic): Handle original loop tree.
---
 gcc/omp-low.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 473e2e7..dde3e1b 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -6924,7 +6924,6 @@ expand_omp_for_generic (struct omp_region *region,
       remove_edge (e);
 
       make_edge (cont_bb, l2_bb, EDGE_FALSE_VALUE);
-      add_bb_to_loop (l2_bb, cont_bb->loop_father);
       e = find_edge (cont_bb, l1_bb);
       if (e == NULL)
 	{
@@ -7002,23 +7001,36 @@ expand_omp_for_generic (struct omp_region *region,
       set_immediate_dominator (CDI_DOMINATORS, l1_bb,
 			       recompute_dominator (CDI_DOMINATORS, l1_bb));
 
-      struct loop *outer_loop;
-      if (seq_loop)
-	outer_loop = l0_bb->loop_father;
-      else
+      /* We enter expand_omp_for_generic with a loop.  This original loop may
+	 have its own loop struct, or it may be part of an outer loop struct
+	 (which may be the fake loop).  */
+      struct loop *outer_loop = entry_bb->loop_father;
+      bool orig_loop_has_loop_struct = l1_bb->loop_father != outer_loop;
+
+      add_bb_to_loop (l2_bb, outer_loop);
+
+      struct loop *new_loop = NULL;
+      if (!seq_loop)
 	{
-	  outer_loop = alloc_loop ();
-	  outer_loop->header = l0_bb;
-	  outer_loop->latch = l2_bb;
-	  add_loop (outer_loop, l0_bb->loop_father);
+	  /* We've added a new loop around the original loop.  Allocate the
+	     corresponding loop struct.  */
+	  new_loop = alloc_loop ();
+	  new_loop->header = l0_bb;
+	  new_loop->latch = l2_bb;
+	  add_loop (new_loop, outer_loop);
 	}
 
-      if (!gimple_omp_for_combined_p (fd->for_stmt))
+      /* Allocate a loop structure for the original loop unless we already
+	 had one.  */
+      if (!orig_loop_has_loop_struct
+	  && !gimple_omp_for_combined_p (fd->for_stmt))
 	{
-	  struct loop *loop = alloc_loop ();
-	  loop->header = l1_bb;
+	  struct loop *orig_loop = alloc_loop ();
+	  orig_loop->header = l1_bb;
 	  /* The loop may have multiple latches.  */
-	  add_loop (loop, outer_loop);
+	  add_loop (orig_loop, (new_loop != NULL
+				? new_loop
+				: outer_loop));
 	}
     }
 }
-- 
1.9.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-10-14 14:54 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-22  7:47 [PING][PR67476] Add param parloops-schedule Tom de Vries
2015-09-22 11:25 ` Bernd Schmidt
2015-10-04 15:37   ` Tom de Vries
2015-10-06  9:31     ` Bernd Schmidt
2015-10-10 11:07       ` Tom de Vries
2015-10-10 11:25         ` [PATCH, 1/5] Handle simple latch in expand_omp_for_generic Tom de Vries
2015-10-12 14:04           ` Bernd Schmidt
2015-10-10 11:50         ` [PATCH, 2/5] Add missing phis " Tom de Vries
2015-10-12 14:06           ` Bernd Schmidt
2015-10-12 14:12             ` Tom de Vries
2015-10-10 11:59         ` [PATCH, 3/5] Handle original loop tree " Tom de Vries
2015-10-12 14:11           ` Bernd Schmidt
2015-10-12 16:57             ` Tom de Vries
2015-10-12 17:24               ` Bernd Schmidt
2015-10-13 21:49               ` Thomas Schwinge
2015-10-14 14:54                 ` [gomp4, committed] Backported param parloops-schedule Tom de Vries
2015-10-10 12:06         ` [PATCH, 4/5] Support DEFPARAMENUM in params.def Tom de Vries
2015-10-12 14:12           ` Bernd Schmidt
2015-10-10 12:10         ` [PATCH, 5/5] Add param parloops-schedule Tom de Vries

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).