public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Fix the LOOP_BRANCH prediction
@ 2012-07-31  3:29 Dehao Chen
  2012-07-31  9:20 ` Richard Guenther
  0 siblings, 1 reply; 13+ messages in thread
From: Dehao Chen @ 2012-07-31  3:29 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jan Hubicka, David Li

Hi,

This patch fixed the problem when a LOOP_EXIT edge for the inner loop
happened to target at the LOOP_LATCH of the outer loop. As the outer
loop is processed first, the LOOP_BRANCH heuristic is honored
(first_match), thus the inner loop's trip count is 0. (The attached
unittest demonstrates this).

Bootstrapped and passed gcc regression test.

Is it ok for trunk?

Thanks,
Dehao

gcc/ChangeLog

2012-07-30  Dehao Chen  <dehao@google.com>

	* predict.c (predict_loops): Fix the prediction of LOOP_BRANCH.

gcc/testsuite/ChangeLog

2012-07-31  Dehao Chen  <dehao@google.com>

	* gcc.dg/predict-7.c: New test.

Index: gcc/testsuite/gcc.dg/predict-7.c
===================================================================
--- gcc/testsuite/gcc.dg/predict-7.c	(revision 0)
+++ gcc/testsuite/gcc.dg/predict-7.c	(revision 0)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
+
+extern int global;
+
+int bar (int);
+
+void foo (int base)
+{
+  int i;
+  while (global < 10)
+    for (i = base; i < 10; i++)
+      bar (i);
+}
+
+/* { dg-final { scan-tree-dump-times "loop branch heuristics" 0
"profile_estimate"} } */
+/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
Index: gcc/predict.c
===================================================================
--- gcc/predict.c	(revision 189835)
+++ gcc/predict.c	(working copy)
@@ -1404,7 +1404,7 @@

 	  /* Loop branch heuristics - predict an edge back to a
 	     loop's head as taken.  */
-	  if (bb == loop->latch)
+	  if (bb == loop->latch && bb->loop_father == loop)
 	    {
 	      e = find_edge (loop->latch, loop->header);
 	      if (e)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix the LOOP_BRANCH prediction
  2012-07-31  3:29 [PATCH] Fix the LOOP_BRANCH prediction Dehao Chen
@ 2012-07-31  9:20 ` Richard Guenther
  2012-07-31 10:24   ` Dehao Chen
  0 siblings, 1 reply; 13+ messages in thread
From: Richard Guenther @ 2012-07-31  9:20 UTC (permalink / raw)
  To: Dehao Chen; +Cc: gcc-patches, Jan Hubicka, David Li

On Tue, Jul 31, 2012 at 5:17 AM, Dehao Chen <dehao@google.com> wrote:
> Hi,
>
> This patch fixed the problem when a LOOP_EXIT edge for the inner loop
> happened to target at the LOOP_LATCH of the outer loop. As the outer
> loop is processed first, the LOOP_BRANCH heuristic is honored
> (first_match), thus the inner loop's trip count is 0. (The attached
> unittest demonstrates this).
>
> Bootstrapped and passed gcc regression test.
>
> Is it ok for trunk?
>
> Thanks,
> Dehao
>
> gcc/ChangeLog
>
> 2012-07-30  Dehao Chen  <dehao@google.com>
>
>         * predict.c (predict_loops): Fix the prediction of LOOP_BRANCH.
>
> gcc/testsuite/ChangeLog
>
> 2012-07-31  Dehao Chen  <dehao@google.com>
>
>         * gcc.dg/predict-7.c: New test.
>
> Index: gcc/testsuite/gcc.dg/predict-7.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/predict-7.c    (revision 0)
> +++ gcc/testsuite/gcc.dg/predict-7.c    (revision 0)
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
> +
> +extern int global;
> +
> +int bar (int);
> +
> +void foo (int base)
> +{
> +  int i;
> +  while (global < 10)
> +    for (i = base; i < 10; i++)
> +      bar (i);
> +}
> +
> +/* { dg-final { scan-tree-dump-times "loop branch heuristics" 0
> "profile_estimate"} } */
> +/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
> Index: gcc/predict.c
> ===================================================================
> --- gcc/predict.c       (revision 189835)
> +++ gcc/predict.c       (working copy)
> @@ -1404,7 +1404,7 @@
>
>           /* Loop branch heuristics - predict an edge back to a
>              loop's head as taken.  */
> -         if (bb == loop->latch)
> +         if (bb == loop->latch && bb->loop_father == loop)
>             {
>               e = find_edge (loop->latch, loop->header);
>               if (e)

I think this heuristic should instead move out of the loop iterating over loop
nodes and be done before like

  if (loop->latch)
    {
       e = find_edge (loop->latch, loop->header);
       ...
    }

which also makes header_found initialized before we visit loop blocks.

Instead the code

          /* Loop exit heuristics - predict an edge exiting the loop if the
             conditional has no loop header successors as not taken.  */
          if (!header_found
              /* If we already used more reliable loop exit predictors, do not
                 bother with PRED_LOOP_EXIT.  */
...
              FOR_EACH_EDGE (e, ei, bb->succs)
                if (e->dest->index < NUM_FIXED_BLOCKS
                    || !flow_bb_inside_loop_p (loop, e->dest))
                  predict_edge (e, PRED_LOOP_EXIT, probability);

looks wrong for bb's that are parts of an inner loop of loop - assuming we
only want to predicate exits from loop and not exits from an inner loop
that also happen to exit loop (we will do that when predicating the inner loop).

Is that what you experienced?

Thanks,
Richard.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix the LOOP_BRANCH prediction
  2012-07-31  9:20 ` Richard Guenther
@ 2012-07-31 10:24   ` Dehao Chen
  2012-07-31 10:34     ` Jan Hubicka
  2012-07-31 10:38     ` Richard Guenther
  0 siblings, 2 replies; 13+ messages in thread
From: Dehao Chen @ 2012-07-31 10:24 UTC (permalink / raw)
  To: Richard Guenther; +Cc: gcc-patches, Jan Hubicka, David Li

Are you suggesting a patch like this:

Index: gcc/predict.c
===================================================================
--- gcc/predict.c	(revision 189835)
+++ gcc/predict.c	(working copy)
@@ -1319,6 +1319,7 @@
       tree loop_bound_var = NULL;
       tree loop_iv_base = NULL;
       gimple stmt = NULL;
+      int header_found = 0;

       exits = get_loop_exit_edges (loop);
       n_exits = VEC_length (edge, exits);
@@ -1387,9 +1388,20 @@

       bbs = get_loop_body (loop);

+      /* Loop branch heuristics - predict an edge back to a
+	 loop's head as taken.  */
+      if (loop->latch && loop->latch->loop_father == loop)
+	{
+	  edge e = find_edge (loop->latch, loop->header);
+	  if (e)
+	    {
+	      header_found = 1;
+	      predict_edge_def (e, PRED_LOOP_BRANCH, TAKEN);
+	    }
+	}
+
       for (j = 0; j < loop->num_nodes; j++)
 	{
-	  int header_found = 0;
 	  edge e;
 	  edge_iterator ei;

@@ -1402,21 +1414,9 @@
 	  if (predicted_by_p (bb, PRED_CONTINUE))
 	    continue;

-	  /* Loop branch heuristics - predict an edge back to a
-	     loop's head as taken.  */
-	  if (bb == loop->latch)
-	    {
-	      e = find_edge (loop->latch, loop->header);
-	      if (e)
-		{
-		  header_found = 1;
-		  predict_edge_def (e, PRED_LOOP_BRANCH, TAKEN);
-		}
-	    }
-
 	  /* Loop exit heuristics - predict an edge exiting the loop if the
 	     conditional has no loop header successors as not taken.  */
-	  if (!header_found
+	  if (!(header_found && bb == loop->latch)
 	      /* If we already used more reliable loop exit predictors, do not
 		 bother with PRED_LOOP_EXIT.  */
 	      && !predicted_by_p (bb, PRED_LOOP_ITERATIONS_GUESSED)

On Tue, Jul 31, 2012 at 5:18 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Tue, Jul 31, 2012 at 5:17 AM, Dehao Chen <dehao@google.com> wrote:
>> Hi,
>>
>> This patch fixed the problem when a LOOP_EXIT edge for the inner loop
>> happened to target at the LOOP_LATCH of the outer loop. As the outer
>> loop is processed first, the LOOP_BRANCH heuristic is honored
>> (first_match), thus the inner loop's trip count is 0. (The attached
>> unittest demonstrates this).
>>
>> Bootstrapped and passed gcc regression test.
>>
>> Is it ok for trunk?
>>
>> Thanks,
>> Dehao
>>
>> gcc/ChangeLog
>>
>> 2012-07-30  Dehao Chen  <dehao@google.com>
>>
>>         * predict.c (predict_loops): Fix the prediction of LOOP_BRANCH.
>>
>> gcc/testsuite/ChangeLog
>>
>> 2012-07-31  Dehao Chen  <dehao@google.com>
>>
>>         * gcc.dg/predict-7.c: New test.
>>
>> Index: gcc/testsuite/gcc.dg/predict-7.c
>> ===================================================================
>> --- gcc/testsuite/gcc.dg/predict-7.c    (revision 0)
>> +++ gcc/testsuite/gcc.dg/predict-7.c    (revision 0)
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
>> +
>> +extern int global;
>> +
>> +int bar (int);
>> +
>> +void foo (int base)
>> +{
>> +  int i;
>> +  while (global < 10)
>> +    for (i = base; i < 10; i++)
>> +      bar (i);
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times "loop branch heuristics" 0
>> "profile_estimate"} } */
>> +/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
>> Index: gcc/predict.c
>> ===================================================================
>> --- gcc/predict.c       (revision 189835)
>> +++ gcc/predict.c       (working copy)
>> @@ -1404,7 +1404,7 @@
>>
>>           /* Loop branch heuristics - predict an edge back to a
>>              loop's head as taken.  */
>> -         if (bb == loop->latch)
>> +         if (bb == loop->latch && bb->loop_father == loop)
>>             {
>>               e = find_edge (loop->latch, loop->header);
>>               if (e)
>
> I think this heuristic should instead move out of the loop iterating over loop
> nodes and be done before like
>
>   if (loop->latch)
>     {
>        e = find_edge (loop->latch, loop->header);
>        ...
>     }
>
> which also makes header_found initialized before we visit loop blocks.
>
> Instead the code
>
>           /* Loop exit heuristics - predict an edge exiting the loop if the
>              conditional has no loop header successors as not taken.  */
>           if (!header_found
>               /* If we already used more reliable loop exit predictors, do not
>                  bother with PRED_LOOP_EXIT.  */
> ...
>               FOR_EACH_EDGE (e, ei, bb->succs)
>                 if (e->dest->index < NUM_FIXED_BLOCKS
>                     || !flow_bb_inside_loop_p (loop, e->dest))
>                   predict_edge (e, PRED_LOOP_EXIT, probability);
>
> looks wrong for bb's that are parts of an inner loop of loop - assuming we
> only want to predicate exits from loop and not exits from an inner loop
> that also happen to exit loop (we will do that when predicating the inner loop).

You are right. And if we want to change this, we'd also need to modify
get_loop_exit_edges to only count edges whose src is in the same loop
level. However, this is relatively minor issue because it only
predicts inaccurate bias ratio, while in the testcase I gave,
LOOP_BRANCH is predicting in the wrong direction.

Thanks,
Dehao


>
> Is that what you experienced?
>
> Thanks,
> Richard.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix the LOOP_BRANCH prediction
  2012-07-31 10:24   ` Dehao Chen
@ 2012-07-31 10:34     ` Jan Hubicka
  2012-07-31 10:38     ` Richard Guenther
  1 sibling, 0 replies; 13+ messages in thread
From: Jan Hubicka @ 2012-07-31 10:34 UTC (permalink / raw)
  To: Dehao Chen; +Cc: Richard Guenther, gcc-patches, Jan Hubicka, David Li

> Are you suggesting a patch like this:
> 
> Index: gcc/predict.c
> ===================================================================
> --- gcc/predict.c	(revision 189835)
> +++ gcc/predict.c	(working copy)
> @@ -1319,6 +1319,7 @@
>        tree loop_bound_var = NULL;
>        tree loop_iv_base = NULL;
>        gimple stmt = NULL;
> +      int header_found = 0;

We should use bool these days.

> 
>        exits = get_loop_exit_edges (loop);
>        n_exits = VEC_length (edge, exits);
> @@ -1387,9 +1388,20 @@
> 
>        bbs = get_loop_body (loop);
> 
> +      /* Loop branch heuristics - predict an edge back to a
> +	 loop's head as taken.  */
> +      if (loop->latch && loop->latch->loop_father == loop)
> +	{
> +	  edge e = find_edge (loop->latch, loop->header);
> +	  if (e)
> +	    {
> +	      header_found = 1;
> +	      predict_edge_def (e, PRED_LOOP_BRANCH, TAKEN);
> +	    }
> +	}
> +
>        for (j = 0; j < loop->num_nodes; j++)
>  	{
> -	  int header_found = 0;
>  	  edge e;
>  	  edge_iterator ei;
> 
> @@ -1402,21 +1414,9 @@
>  	  if (predicted_by_p (bb, PRED_CONTINUE))
>  	    continue;
> 
> -	  /* Loop branch heuristics - predict an edge back to a
> -	     loop's head as taken.  */
> -	  if (bb == loop->latch)
> -	    {
> -	      e = find_edge (loop->latch, loop->header);
> -	      if (e)
> -		{
> -		  header_found = 1;
> -		  predict_edge_def (e, PRED_LOOP_BRANCH, TAKEN);
> -		}
> -	    }
> -
>  	  /* Loop exit heuristics - predict an edge exiting the loop if the
>  	     conditional has no loop header successors as not taken.  */
> -	  if (!header_found
> +	  if (!(header_found && bb == loop->latch)

Yes, this seems resonable to me.

> > which also makes header_found initialized before we visit loop blocks.
> >
> > Instead the code
> >
> >           /* Loop exit heuristics - predict an edge exiting the loop if the
> >              conditional has no loop header successors as not taken.  */
> >           if (!header_found
> >               /* If we already used more reliable loop exit predictors, do not
> >                  bother with PRED_LOOP_EXIT.  */
> > ...
> >               FOR_EACH_EDGE (e, ei, bb->succs)
> >                 if (e->dest->index < NUM_FIXED_BLOCKS
> >                     || !flow_bb_inside_loop_p (loop, e->dest))
> >                   predict_edge (e, PRED_LOOP_EXIT, probability);
> >
> > looks wrong for bb's that are parts of an inner loop of loop - assuming we
> > only want to predicate exits from loop and not exits from an inner loop
> > that also happen to exit loop (we will do that when predicating the inner loop).
> 
> You are right. And if we want to change this, we'd also need to modify
> get_loop_exit_edges to only count edges whose src is in the same loop
> level. However, this is relatively minor issue because it only
> predicts inaccurate bias ratio, while in the testcase I gave,
> LOOP_BRANCH is predicting in the wrong direction.

Indeed, it is not the most important thing around. 

Patch seems OK.  We should get the statistic about branch prediction effectivity working again.
It is broken since we moved tree-profile early. I will try to look into it.

Honza

> 
> Thanks,
> Dehao
> 
> 
> >
> > Is that what you experienced?
> >
> > Thanks,
> > Richard.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix the LOOP_BRANCH prediction
  2012-07-31 10:24   ` Dehao Chen
  2012-07-31 10:34     ` Jan Hubicka
@ 2012-07-31 10:38     ` Richard Guenther
  2012-07-31 10:43       ` Jan Hubicka
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Guenther @ 2012-07-31 10:38 UTC (permalink / raw)
  To: Dehao Chen; +Cc: gcc-patches, Jan Hubicka, David Li

On Tue, Jul 31, 2012 at 12:20 PM, Dehao Chen <dehao@google.com> wrote:
> Are you suggesting a patch like this:
>
> Index: gcc/predict.c
> ===================================================================
> --- gcc/predict.c       (revision 189835)
> +++ gcc/predict.c       (working copy)
> @@ -1319,6 +1319,7 @@
>        tree loop_bound_var = NULL;
>        tree loop_iv_base = NULL;
>        gimple stmt = NULL;
> +      int header_found = 0;
>
>        exits = get_loop_exit_edges (loop);
>        n_exits = VEC_length (edge, exits);
> @@ -1387,9 +1388,20 @@
>
>        bbs = get_loop_body (loop);
>
> +      /* Loop branch heuristics - predict an edge back to a
> +        loop's head as taken.  */
> +      if (loop->latch && loop->latch->loop_father == loop)

Hmm, so the issue is that loop->latch does not belong to loop?  That looks
like a bogus loop structure.  Indeed we have the loop header of the inner
loop as latch of the outer loop.  It still looks ok to predict this as unlikely
as the edge is not only the latch edge of the outer loop but also an exit
of the inner loop.

Easier for profile would be to force canonicalization via

  loop_optimizer_init (LOOPS_NORMAL);

instead of

  loop_optimizer_init (0);
  if (dump_file && (dump_flags & TDF_DETAILS))
    flow_loops_dump (dump_file, NULL, 0);

  mark_irreducible_loops ();


> +       {
> +         edge e = find_edge (loop->latch, loop->header);
> +         if (e)
> +           {
> +             header_found = 1;
> +             predict_edge_def (e, PRED_LOOP_BRANCH, TAKEN);
> +           }
> +       }
> +
>        for (j = 0; j < loop->num_nodes; j++)
>         {
> -         int header_found = 0;
>           edge e;
>           edge_iterator ei;
>
> @@ -1402,21 +1414,9 @@
>           if (predicted_by_p (bb, PRED_CONTINUE))
>             continue;
>
> -         /* Loop branch heuristics - predict an edge back to a
> -            loop's head as taken.  */
> -         if (bb == loop->latch)
> -           {
> -             e = find_edge (loop->latch, loop->header);
> -             if (e)
> -               {
> -                 header_found = 1;
> -                 predict_edge_def (e, PRED_LOOP_BRANCH, TAKEN);
> -               }
> -           }
> -

Yes until here,

>           /* Loop exit heuristics - predict an edge exiting the loop if the
>              conditional has no loop header successors as not taken.  */
> -         if (!header_found
> +         if (!(header_found && bb == loop->latch)

here instead

                  !header_found
                  && bb->loop_father == loop

>               /* If we already used more reliable loop exit predictors, do not
>                  bother with PRED_LOOP_EXIT.  */
>               && !predicted_by_p (bb, PRED_LOOP_ITERATIONS_GUESSED)
>
> On Tue, Jul 31, 2012 at 5:18 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Tue, Jul 31, 2012 at 5:17 AM, Dehao Chen <dehao@google.com> wrote:
>>> Hi,
>>>
>>> This patch fixed the problem when a LOOP_EXIT edge for the inner loop
>>> happened to target at the LOOP_LATCH of the outer loop. As the outer
>>> loop is processed first, the LOOP_BRANCH heuristic is honored
>>> (first_match), thus the inner loop's trip count is 0. (The attached
>>> unittest demonstrates this).
>>>
>>> Bootstrapped and passed gcc regression test.
>>>
>>> Is it ok for trunk?
>>>
>>> Thanks,
>>> Dehao
>>>
>>> gcc/ChangeLog
>>>
>>> 2012-07-30  Dehao Chen  <dehao@google.com>
>>>
>>>         * predict.c (predict_loops): Fix the prediction of LOOP_BRANCH.
>>>
>>> gcc/testsuite/ChangeLog
>>>
>>> 2012-07-31  Dehao Chen  <dehao@google.com>
>>>
>>>         * gcc.dg/predict-7.c: New test.
>>>
>>> Index: gcc/testsuite/gcc.dg/predict-7.c
>>> ===================================================================
>>> --- gcc/testsuite/gcc.dg/predict-7.c    (revision 0)
>>> +++ gcc/testsuite/gcc.dg/predict-7.c    (revision 0)
>>> @@ -0,0 +1,17 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
>>> +
>>> +extern int global;
>>> +
>>> +int bar (int);
>>> +
>>> +void foo (int base)
>>> +{
>>> +  int i;
>>> +  while (global < 10)
>>> +    for (i = base; i < 10; i++)
>>> +      bar (i);
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "loop branch heuristics" 0
>>> "profile_estimate"} } */
>>> +/* { dg-final { cleanup-tree-dump "profile_estimate" } } */
>>> Index: gcc/predict.c
>>> ===================================================================
>>> --- gcc/predict.c       (revision 189835)
>>> +++ gcc/predict.c       (working copy)
>>> @@ -1404,7 +1404,7 @@
>>>
>>>           /* Loop branch heuristics - predict an edge back to a
>>>              loop's head as taken.  */
>>> -         if (bb == loop->latch)
>>> +         if (bb == loop->latch && bb->loop_father == loop)
>>>             {
>>>               e = find_edge (loop->latch, loop->header);
>>>               if (e)
>>
>> I think this heuristic should instead move out of the loop iterating over loop
>> nodes and be done before like
>>
>>   if (loop->latch)
>>     {
>>        e = find_edge (loop->latch, loop->header);
>>        ...
>>     }
>>
>> which also makes header_found initialized before we visit loop blocks.
>>
>> Instead the code
>>
>>           /* Loop exit heuristics - predict an edge exiting the loop if the
>>              conditional has no loop header successors as not taken.  */
>>           if (!header_found
>>               /* If we already used more reliable loop exit predictors, do not
>>                  bother with PRED_LOOP_EXIT.  */
>> ...
>>               FOR_EACH_EDGE (e, ei, bb->succs)
>>                 if (e->dest->index < NUM_FIXED_BLOCKS
>>                     || !flow_bb_inside_loop_p (loop, e->dest))
>>                   predict_edge (e, PRED_LOOP_EXIT, probability);
>>
>> looks wrong for bb's that are parts of an inner loop of loop - assuming we
>> only want to predicate exits from loop and not exits from an inner loop
>> that also happen to exit loop (we will do that when predicating the inner loop).
>
> You are right. And if we want to change this, we'd also need to modify
> get_loop_exit_edges to only count edges whose src is in the same loop
> level. However, this is relatively minor issue because it only
> predicts inaccurate bias ratio, while in the testcase I gave,
> LOOP_BRANCH is predicting in the wrong direction.
>
> Thanks,
> Dehao
>
>
>>
>> Is that what you experienced?
>>
>> Thanks,
>> Richard.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix the LOOP_BRANCH prediction
  2012-07-31 10:38     ` Richard Guenther
@ 2012-07-31 10:43       ` Jan Hubicka
  2012-07-31 10:46         ` Richard Guenther
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Hubicka @ 2012-07-31 10:43 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Dehao Chen, gcc-patches, Jan Hubicka, David Li

> On Tue, Jul 31, 2012 at 12:20 PM, Dehao Chen <dehao@google.com> wrote:
> > Are you suggesting a patch like this:
> >
> > Index: gcc/predict.c
> > ===================================================================
> > --- gcc/predict.c       (revision 189835)
> > +++ gcc/predict.c       (working copy)
> > @@ -1319,6 +1319,7 @@
> >        tree loop_bound_var = NULL;
> >        tree loop_iv_base = NULL;
> >        gimple stmt = NULL;
> > +      int header_found = 0;
> >
> >        exits = get_loop_exit_edges (loop);
> >        n_exits = VEC_length (edge, exits);
> > @@ -1387,9 +1388,20 @@
> >
> >        bbs = get_loop_body (loop);
> >
> > +      /* Loop branch heuristics - predict an edge back to a
> > +        loop's head as taken.  */
> > +      if (loop->latch && loop->latch->loop_father == loop)
> 
> Hmm, so the issue is that loop->latch does not belong to loop?  That looks
> like a bogus loop structure.  Indeed we have the loop header of the inner
> loop as latch of the outer loop.  It still looks ok to predict this as unlikely
> as the edge is not only the latch edge of the outer loop but also an exit
> of the inner loop.
> 
> Easier for profile would be to force canonicalization via
> 
>   loop_optimizer_init (LOOPS_NORMAL);
> 
> instead of
> 
>   loop_optimizer_init (0);
>   if (dump_file && (dump_flags & TDF_DETAILS))
>     flow_loops_dump (dump_file, NULL, 0);
> 
>   mark_irreducible_loops ();

Yeah, this may also work.  The reason it is not done is that
 1) it seemed expensive to force CFG changes just to compute profile decade ago
 2) cfgcleanup afterwards will anyway remove the headers again.
    So I originally hoped to do the right thing without normalization.

Honza

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix the LOOP_BRANCH prediction
  2012-07-31 10:43       ` Jan Hubicka
@ 2012-07-31 10:46         ` Richard Guenther
  2012-07-31 10:56           ` Richard Guenther
  2012-07-31 11:00           ` Jan Hubicka
  0 siblings, 2 replies; 13+ messages in thread
From: Richard Guenther @ 2012-07-31 10:46 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Dehao Chen, gcc-patches, David Li

On Tue, Jul 31, 2012 at 12:38 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> On Tue, Jul 31, 2012 at 12:20 PM, Dehao Chen <dehao@google.com> wrote:
>> > Are you suggesting a patch like this:
>> >
>> > Index: gcc/predict.c
>> > ===================================================================
>> > --- gcc/predict.c       (revision 189835)
>> > +++ gcc/predict.c       (working copy)
>> > @@ -1319,6 +1319,7 @@
>> >        tree loop_bound_var = NULL;
>> >        tree loop_iv_base = NULL;
>> >        gimple stmt = NULL;
>> > +      int header_found = 0;
>> >
>> >        exits = get_loop_exit_edges (loop);
>> >        n_exits = VEC_length (edge, exits);
>> > @@ -1387,9 +1388,20 @@
>> >
>> >        bbs = get_loop_body (loop);
>> >
>> > +      /* Loop branch heuristics - predict an edge back to a
>> > +        loop's head as taken.  */
>> > +      if (loop->latch && loop->latch->loop_father == loop)
>>
>> Hmm, so the issue is that loop->latch does not belong to loop?  That looks
>> like a bogus loop structure.  Indeed we have the loop header of the inner
>> loop as latch of the outer loop.  It still looks ok to predict this as unlikely
>> as the edge is not only the latch edge of the outer loop but also an exit
>> of the inner loop.
>>
>> Easier for profile would be to force canonicalization via
>>
>>   loop_optimizer_init (LOOPS_NORMAL);
>>
>> instead of
>>
>>   loop_optimizer_init (0);
>>   if (dump_file && (dump_flags & TDF_DETAILS))
>>     flow_loops_dump (dump_file, NULL, 0);
>>
>>   mark_irreducible_loops ();
>
> Yeah, this may also work.  The reason it is not done is that
>  1) it seemed expensive to force CFG changes just to compute profile decade ago
>  2) cfgcleanup afterwards will anyway remove the headers again.
>     So I originally hoped to do the right thing without normalization.

Ok ... then you should pass AVOID_CFG_MODIFICATIONS instead.  And be
prepared for odd situations like this ;)

> Honza

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix the LOOP_BRANCH prediction
  2012-07-31 10:46         ` Richard Guenther
@ 2012-07-31 10:56           ` Richard Guenther
  2012-07-31 11:00           ` Jan Hubicka
  1 sibling, 0 replies; 13+ messages in thread
From: Richard Guenther @ 2012-07-31 10:56 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Dehao Chen, gcc-patches, David Li

On Tue, Jul 31, 2012 at 12:43 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Tue, Jul 31, 2012 at 12:38 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>> On Tue, Jul 31, 2012 at 12:20 PM, Dehao Chen <dehao@google.com> wrote:
>>> > Are you suggesting a patch like this:
>>> >
>>> > Index: gcc/predict.c
>>> > ===================================================================
>>> > --- gcc/predict.c       (revision 189835)
>>> > +++ gcc/predict.c       (working copy)
>>> > @@ -1319,6 +1319,7 @@
>>> >        tree loop_bound_var = NULL;
>>> >        tree loop_iv_base = NULL;
>>> >        gimple stmt = NULL;
>>> > +      int header_found = 0;
>>> >
>>> >        exits = get_loop_exit_edges (loop);
>>> >        n_exits = VEC_length (edge, exits);
>>> > @@ -1387,9 +1388,20 @@
>>> >
>>> >        bbs = get_loop_body (loop);
>>> >
>>> > +      /* Loop branch heuristics - predict an edge back to a
>>> > +        loop's head as taken.  */
>>> > +      if (loop->latch && loop->latch->loop_father == loop)
>>>
>>> Hmm, so the issue is that loop->latch does not belong to loop?  That looks
>>> like a bogus loop structure.  Indeed we have the loop header of the inner
>>> loop as latch of the outer loop.  It still looks ok to predict this as unlikely
>>> as the edge is not only the latch edge of the outer loop but also an exit
>>> of the inner loop.
>>>
>>> Easier for profile would be to force canonicalization via
>>>
>>>   loop_optimizer_init (LOOPS_NORMAL);
>>>
>>> instead of
>>>
>>>   loop_optimizer_init (0);
>>>   if (dump_file && (dump_flags & TDF_DETAILS))
>>>     flow_loops_dump (dump_file, NULL, 0);
>>>
>>>   mark_irreducible_loops ();
>>
>> Yeah, this may also work.  The reason it is not done is that
>>  1) it seemed expensive to force CFG changes just to compute profile decade ago
>>  2) cfgcleanup afterwards will anyway remove the headers again.
>>     So I originally hoped to do the right thing without normalization.
>
> Ok ... then you should pass AVOID_CFG_MODIFICATIONS instead.  And be
> prepared for odd situations like this ;)

In which case the bug looks like that we predict the inner loop exit as unlikely
but not the outer loop exit which should compensate things and not end up
predicting zero iterations?  That is, all patches seem to paper over a
real issue
elsewhere.

Richard.

>> Honza

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix the LOOP_BRANCH prediction
  2012-07-31 10:46         ` Richard Guenther
  2012-07-31 10:56           ` Richard Guenther
@ 2012-07-31 11:00           ` Jan Hubicka
  2012-07-31 12:39             ` Dehao Chen
  1 sibling, 1 reply; 13+ messages in thread
From: Jan Hubicka @ 2012-07-31 11:00 UTC (permalink / raw)
  To: Richard Guenther; +Cc: Jan Hubicka, Dehao Chen, gcc-patches, David Li

> >
> > Yeah, this may also work.  The reason it is not done is that
> >  1) it seemed expensive to force CFG changes just to compute profile decade ago
> >  2) cfgcleanup afterwards will anyway remove the headers again.
> >     So I originally hoped to do the right thing without normalization.
> 
> Ok ... then you should pass AVOID_CFG_MODIFICATIONS instead.  And be
> prepared for odd situations like this ;)

Well, I guess we could do the extra work to avoid strange side cases like this.
Does normalization fix the testcase, too?

Honza
> 
> > Honza

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix the LOOP_BRANCH prediction
  2012-07-31 11:00           ` Jan Hubicka
@ 2012-07-31 12:39             ` Dehao Chen
  2012-07-31 12:47               ` Jan Hubicka
  0 siblings, 1 reply; 13+ messages in thread
From: Dehao Chen @ 2012-07-31 12:39 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Guenther, gcc-patches, David Li

On Tue, Jul 31, 2012 at 6:56 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >
>> > Yeah, this may also work.  The reason it is not done is that
>> >  1) it seemed expensive to force CFG changes just to compute profile decade ago
>> >  2) cfgcleanup afterwards will anyway remove the headers again.
>> >     So I originally hoped to do the right thing without normalization.
>>
>> Ok ... then you should pass AVOID_CFG_MODIFICATIONS instead.  And be
>> prepared for odd situations like this ;)
>
> Well, I guess we could do the extra work to avoid strange side cases like this.
> Does normalization fix the testcase, too?

Normalization indeed fixed this issue too. So what shall we do about
this patch? Shall we simply change to use normalization instead?

Thanks,
Dehao

>
> Honza
>>
>> > Honza

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix the LOOP_BRANCH prediction
  2012-07-31 12:39             ` Dehao Chen
@ 2012-07-31 12:47               ` Jan Hubicka
  2012-07-31 14:32                 ` Dehao Chen
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Hubicka @ 2012-07-31 12:47 UTC (permalink / raw)
  To: Dehao Chen; +Cc: Jan Hubicka, Richard Guenther, gcc-patches, David Li

> On Tue, Jul 31, 2012 at 6:56 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> >
> >> > Yeah, this may also work.  The reason it is not done is that
> >> >  1) it seemed expensive to force CFG changes just to compute profile decade ago
> >> >  2) cfgcleanup afterwards will anyway remove the headers again.
> >> >     So I originally hoped to do the right thing without normalization.
> >>
> >> Ok ... then you should pass AVOID_CFG_MODIFICATIONS instead.  And be
> >> prepared for odd situations like this ;)
> >
> > Well, I guess we could do the extra work to avoid strange side cases like this.
> > Does normalization fix the testcase, too?
> 
> Normalization indeed fixed this issue too. So what shall we do about
> this patch? Shall we simply change to use normalization instead?

Yes, I think it is not _that_ expensive and we do some relatively tricky CFG
analysis there these days that should also get bit better.
(the code was written at a time CFG was new citizen to GCC and changes in CFG
was hard and considered harmful :)

Honza
> 
> Thanks,
> Dehao
> 
> >
> > Honza
> >>
> >> > Honza

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix the LOOP_BRANCH prediction
  2012-07-31 12:47               ` Jan Hubicka
@ 2012-07-31 14:32                 ` Dehao Chen
  2012-07-31 14:49                   ` Jan Hubicka
  0 siblings, 1 reply; 13+ messages in thread
From: Dehao Chen @ 2012-07-31 14:32 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Guenther, gcc-patches, David Li

Thanks, Honza,

Then shall I check in the following patch to trunk (after testing)?

Dehao

Index: gcc/testsuite/gcc.dg/predict-7.c
===================================================================
--- gcc/testsuite/gcc.dg/predict-7.c	(revision 0)
+++ gcc/testsuite/gcc.dg/predict-7.c	(revision 0)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
+
+extern int global;
+
+int bar (int);
+
+void foo (int base)
+{
+  int i;
+  while (global < 10)
+    for (i = base; i < 10; i++)
+      bar (i);
+}
+
+/* { dg-final { scan-tree-dump-times "loop branch heuristics" 0
"profile_estimate"} } */
+/* { dg-final { cleanup-tree-dump "profile_estimate" } } */

Index: gcc/predict.c
===================================================================
--- gcc/predict.c	(revision 189835)
+++ gcc/predict.c	(working copy)
@@ -2177,7 +2177,7 @@
 {
   unsigned nb_loops;

-  loop_optimizer_init (0);
+  loop_optimizer_init (LOOPS_NORMAL);
   if (dump_file && (dump_flags & TDF_DETAILS))
     flow_loops_dump (dump_file, NULL, 0);


On Tue, Jul 31, 2012 at 8:43 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> On Tue, Jul 31, 2012 at 6:56 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> >
>> >> > Yeah, this may also work.  The reason it is not done is that
>> >> >  1) it seemed expensive to force CFG changes just to compute profile decade ago
>> >> >  2) cfgcleanup afterwards will anyway remove the headers again.
>> >> >     So I originally hoped to do the right thing without normalization.
>> >>
>> >> Ok ... then you should pass AVOID_CFG_MODIFICATIONS instead.  And be
>> >> prepared for odd situations like this ;)
>> >
>> > Well, I guess we could do the extra work to avoid strange side cases like this.
>> > Does normalization fix the testcase, too?
>>
>> Normalization indeed fixed this issue too. So what shall we do about
>> this patch? Shall we simply change to use normalization instead?
>
> Yes, I think it is not _that_ expensive and we do some relatively tricky CFG
> analysis there these days that should also get bit better.
> (the code was written at a time CFG was new citizen to GCC and changes in CFG
> was hard and considered harmful :)
>
> Honza
>>
>> Thanks,
>> Dehao
>>
>> >
>> > Honza
>> >>
>> >> > Honza

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix the LOOP_BRANCH prediction
  2012-07-31 14:32                 ` Dehao Chen
@ 2012-07-31 14:49                   ` Jan Hubicka
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Hubicka @ 2012-07-31 14:49 UTC (permalink / raw)
  To: Dehao Chen; +Cc: Jan Hubicka, Richard Guenther, gcc-patches, David Li

> Thanks, Honza,
> 
> Then shall I check in the following patch to trunk (after testing)?

Yes, this is OK (with a changelog).
Thanks!
Honza

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2012-07-31 14:36 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-31  3:29 [PATCH] Fix the LOOP_BRANCH prediction Dehao Chen
2012-07-31  9:20 ` Richard Guenther
2012-07-31 10:24   ` Dehao Chen
2012-07-31 10:34     ` Jan Hubicka
2012-07-31 10:38     ` Richard Guenther
2012-07-31 10:43       ` Jan Hubicka
2012-07-31 10:46         ` Richard Guenther
2012-07-31 10:56           ` Richard Guenther
2012-07-31 11:00           ` Jan Hubicka
2012-07-31 12:39             ` Dehao Chen
2012-07-31 12:47               ` Jan Hubicka
2012-07-31 14:32                 ` Dehao Chen
2012-07-31 14:49                   ` Jan Hubicka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).