[PATCH PR81832]Skip copying loop header if inner loop is distributed

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH PR81832]Skip copying loop header if inner loop is distributed
@ 2017-08-15 11:26 Bin Cheng
  2017-08-15 11:35 ` Richard Biener
  0 siblings, 1 reply; 7+ messages in thread
From: Bin Cheng @ 2017-08-15 11:26 UTC (permalink / raw)
  To: gcc-patches; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 1067 bytes --]

Hi,
This patch fixes PR81832.  Root cause for the ICE is:
  1) Loop has distributed inner loop.
  2) The guarding function call IFN_LOOP_DIST_CALL happens to be in loop's header.
  3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect thus
     not eliminated.

Given pass_ch_vect copies loop header to enable more vectorization, we should
skip loop in this case because distributed inner loop means this loop can not
be vectorized anyway.  One point to mention is name inner_loop_distributed_p
is a little misleading.  The name indicates that each basic block is checked,
but the patch only checks loop's header for simplicity/efficiency's purpose.
Any comment?
Bootstrap and test on x86_64.

Thanks,
bin
2017-08-15  Bin Cheng  <bin.cheng@arm.com>

	PR tree-optimization/81832
	* tree-ssa-loop-ch.c (inner_loop_distributed_p): New function.
	(pass_ch_vect::process_loop_p): Call above function.

gcc/testsuite
2017-08-15  Bin Cheng  <bin.cheng@arm.com>

	PR tree-optimization/81832
	* gcc.dg/tree-ssa/pr81832.c: New test.

[-- Attachment #2: pr81832.txt --]
[-- Type: text/plain, Size: 1888 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c
new file mode 100644
index 0000000..893124e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int a, b, *c;
+void d(void)
+{
+    int **e;
+    for(;;)
+        for(int f = 1; f <= 6; f++)
+        {
+            b = 0;
+            if(a)
+g:
+                while(a++);
+            if (**e);
+            else
+            {
+                *c = a;
+                goto g;
+            }
+        }
+}
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index 14cc6d8d..3c217d4 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -143,6 +143,27 @@ should_duplicate_loop_header_p (basic_block header, struct loop *loop,
   return true;
 }
 
+/* Return TRUE if LOOP's inner loop is versioned by loop distribution and
+   the guarding internal function call happens to be in LOOP's header.
+   Given loop distribution is placed between pass_ch and pass_ch_vect,
+   this function only returns true in pass_ch_vect.  When it returns TRUE,
+   it's known that copying LOOP's header is meaningless.  */
+
+static bool
+inner_loop_distributed_p (struct loop *loop)
+{
+  gimple *stmt = last_stmt (loop->header);
+  if (stmt == NULL || gimple_code (stmt) != GIMPLE_COND)
+    return false;
+
+  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+  gsi_prev (&gsi);
+  if (gsi_end_p (gsi))
+    return false;
+
+  return (gimple_call_internal_p (gsi_stmt (gsi), IFN_LOOP_DIST_ALIAS));
+}
+
 /* Checks whether LOOP is a do-while style loop.  */
 
 static bool
@@ -442,6 +463,9 @@ pass_ch_vect::process_loop_p (struct loop *loop)
   if (loop->dont_vectorize)
     return false;
 
+  if (inner_loop_distributed_p (loop))
+    return false;
+
   if (!do_while_loop_p (loop))
     return true;
 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH PR81832]Skip copying loop header if inner loop is distributed
  2017-08-15 11:26 [PATCH PR81832]Skip copying loop header if inner loop is distributed Bin Cheng
@ 2017-08-15 11:35 ` Richard Biener
  2017-08-15 17:51   ` Richard Sandiford
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2017-08-15 11:35 UTC (permalink / raw)
  To: Bin Cheng; +Cc: gcc-patches, nd

On Tue, Aug 15, 2017 at 11:28 AM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> This patch fixes PR81832.  Root cause for the ICE is:
>   1) Loop has distributed inner loop.
>   2) The guarding function call IFN_LOOP_DIST_CALL happens to be in loop's header.
>   3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect thus
>      not eliminated.
>
> Given pass_ch_vect copies loop header to enable more vectorization, we should
> skip loop in this case because distributed inner loop means this loop can not
> be vectorized anyway.  One point to mention is name inner_loop_distributed_p
> is a little misleading.  The name indicates that each basic block is checked,
> but the patch only checks loop's header for simplicity/efficiency's purpose.
> Any comment?

My comment would be to question pass_ch_vect placement -- what was the
reason to place it so late?

I also see GRAPHITE runs inbetween loop distribution and vectorization --
what prevents GRAPHITE from messing up things here?  Or autopar?

The patch itself shows should_duplicate_loop_header_p should
handle this IFN specially (somehow all IFNs are considered "inexpensive").

So can you please adjust should_duplicate_loop_header_p instead and/or
gimple_inexpensive_call_p?  Since we have IFNs for stuff like EXP10 I'm not
sure if that default is so good.

Thanks,
Richard.

> Bootstrap and test on x86_64.
>
> Thanks,
> bin
> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>
>         PR tree-optimization/81832
>         * tree-ssa-loop-ch.c (inner_loop_distributed_p): New function.
>         (pass_ch_vect::process_loop_p): Call above function.
>
> gcc/testsuite
> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>
>         PR tree-optimization/81832
>         * gcc.dg/tree-ssa/pr81832.c: New test.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH PR81832]Skip copying loop header if inner loop is distributed
  2017-08-15 11:35 ` Richard Biener
@ 2017-08-15 17:51   ` Richard Sandiford
  2017-08-16 10:43     ` Bin.Cheng
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Sandiford @ 2017-08-15 17:51 UTC (permalink / raw)
  To: Richard Biener; +Cc: Bin Cheng, gcc-patches, nd

Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Aug 15, 2017 at 11:28 AM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>> Hi,
>> This patch fixes PR81832.  Root cause for the ICE is:
>>   1) Loop has distributed inner loop.
>>   2) The guarding function call IFN_LOOP_DIST_CALL happens to be in loop's header.
>>   3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect thus
>>      not eliminated.
>>
>> Given pass_ch_vect copies loop header to enable more vectorization, we should
>> skip loop in this case because distributed inner loop means this loop can not
>> be vectorized anyway.  One point to mention is name inner_loop_distributed_p
>> is a little misleading.  The name indicates that each basic block is checked,
>> but the patch only checks loop's header for simplicity/efficiency's purpose.
>> Any comment?
>
> My comment would be to question pass_ch_vect placement -- what was the
> reason to place it so late?
>
> I also see GRAPHITE runs inbetween loop distribution and vectorization --
> what prevents GRAPHITE from messing up things here?  Or autopar?
>
> The patch itself shows should_duplicate_loop_header_p should
> handle this IFN specially (somehow all IFNs are considered "inexpensive").
>
> So can you please adjust should_duplicate_loop_header_p instead and/or
> gimple_inexpensive_call_p?  Since we have IFNs for stuff like EXP10 I'm not
> sure if that default is so good.

I think the default itself is OK: we only use IFNs for libm functions
like exp10 if an underlying optab exists (unlike __builtin_exp10, which
is useful as the non-errno setting version of exp10), so the target must
have something that it thinks is worth open-coding.  Also, we currently
treat all MD built-ins as "simple" and thus "inexpensive", so the IFN
handling is consistent with that.

Maybe there are some IFNs that are worth special-casing as expensive,
but IMO doing that to solve this problem would be a bit hacky.  It seems
like "inexpensive" should be more of a cost thing than a correctness thing.

Thanks,
Richard

> Thanks,
> Richard.
>
>> Bootstrap and test on x86_64.
>>
>> Thanks,
>> bin
>> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>>
>>         PR tree-optimization/81832
>>         * tree-ssa-loop-ch.c (inner_loop_distributed_p): New function.
>>         (pass_ch_vect::process_loop_p): Call above function.
>>
>> gcc/testsuite
>> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>>
>>         PR tree-optimization/81832
>>         * gcc.dg/tree-ssa/pr81832.c: New test.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH PR81832]Skip copying loop header if inner loop is distributed
  2017-08-15 17:51   ` Richard Sandiford
@ 2017-08-16 10:43     ` Bin.Cheng
  2017-08-16 10:47       ` Richard Sandiford
  2017-08-16 10:56       ` Richard Biener
  0 siblings, 2 replies; 7+ messages in thread
From: Bin.Cheng @ 2017-08-16 10:43 UTC (permalink / raw)
  To: Richard Biener, gcc-patches, Richard Sandiford

[-- Attachment #1: Type: text/plain, Size: 3620 bytes --]

On Tue, Aug 15, 2017 at 6:33 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Tue, Aug 15, 2017 at 11:28 AM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>> Hi,
>>> This patch fixes PR81832.  Root cause for the ICE is:
>>>   1) Loop has distributed inner loop.
>>>   2) The guarding function call IFN_LOOP_DIST_CALL happens to be in loop's header.
>>>   3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect thus
>>>      not eliminated.
>>>
>>> Given pass_ch_vect copies loop header to enable more vectorization, we should
>>> skip loop in this case because distributed inner loop means this loop can not
>>> be vectorized anyway.  One point to mention is name inner_loop_distributed_p
>>> is a little misleading.  The name indicates that each basic block is checked,
>>> but the patch only checks loop's header for simplicity/efficiency's purpose.
>>> Any comment?
>>
>> My comment would be to question pass_ch_vect placement -- what was the
>> reason to place it so late?
>>
>> I also see GRAPHITE runs inbetween loop distribution and vectorization --
>> what prevents GRAPHITE from messing up things here?  Or autopar?
>>
>> The patch itself shows should_duplicate_loop_header_p should
>> handle this IFN specially (somehow all IFNs are considered "inexpensive").
>>
>> So can you please adjust should_duplicate_loop_header_p instead and/or
>> gimple_inexpensive_call_p?  Since we have IFNs for stuff like EXP10 I'm not
>> sure if that default is so good.
>
> I think the default itself is OK: we only use IFNs for libm functions
> like exp10 if an underlying optab exists (unlike __builtin_exp10, which
> is useful as the non-errno setting version of exp10), so the target must
> have something that it thinks is worth open-coding.  Also, we currently
> treat all MD built-ins as "simple" and thus "inexpensive", so the IFN
> handling is consistent with that.
>
> Maybe there are some IFNs that are worth special-casing as expensive,
> but IMO doing that to solve this problem would be a bit hacky.  It seems
> like "inexpensive" should be more of a cost thing than a correctness thing.
Hi all,
This is updated patch.  It only adds a single check on IFN_LOOP_DIST_ALIAS
in function should_duplicate_loop_header_p.  As for gimple_inexpensive_call_p,
I think it's natural to consider functions like IFN_LOOP_VECTORIZED and
IFN_LOOP_DIST_ALIAS as expensive because they are only meant to indicate
temporary arrangement of optimizations and are never used in code generation.
I will send a standalone patch for that.  Another thing is this patch
doesn't check
IFN_LOOP_VECTORIZED because it's impossible to have it with current order
of passes.
Bootstrap and test ongoing.  Further comments?

Thanks,
bin
2017-08-15  Bin Cheng  <bin.cheng@arm.com>

    PR tree-optimization/81832
    * tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Don't
    copy loop header which has IFN_LOOP_DIST_ALIAS call.

gcc/testsuite
2017-08-15  Bin Cheng  <bin.cheng@arm.com>

    PR tree-optimization/81832
    * gcc.dg/tree-ssa/pr81832.c: New test.

>
> Thanks,
> Richard
>
>> Thanks,
>> Richard.
>>
>>> Bootstrap and test on x86_64.
>>>
>>> Thanks,
>>> bin
>>> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>         PR tree-optimization/81832
>>>         * tree-ssa-loop-ch.c (inner_loop_distributed_p): New function.
>>>         (pass_ch_vect::process_loop_p): Call above function.
>>>
>>> gcc/testsuite
>>> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>         PR tree-optimization/81832
>>>         * gcc.dg/tree-ssa/pr81832.c: New test.

[-- Attachment #2: pr81832-20170816.txt --]
[-- Type: text/plain, Size: 1275 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c
new file mode 100644
index 0000000..893124e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+int a, b, *c;
+void d(void)
+{
+    int **e;
+    for(;;)
+        for(int f = 1; f <= 6; f++)
+        {
+            b = 0;
+            if(a)
+g:
+                while(a++);
+            if (**e);
+            else
+            {
+                *c = a;
+                goto g;
+            }
+        }
+}
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index 14cc6d8d..6bb0220 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -119,7 +119,10 @@ should_duplicate_loop_header_p (basic_block header, struct loop *loop,
 	continue;
 
       if (gimple_code (last) == GIMPLE_CALL
-	  && !gimple_inexpensive_call_p (as_a <gcall *> (last)))
+	  && (!gimple_inexpensive_call_p (as_a <gcall *> (last))
+	      /* IFN_LOOP_DIST_ALIAS means that inner loop is distributed
+		 at current loop's header.  Don't copy in this case.  */
+	      || gimple_call_internal_p (last, IFN_LOOP_DIST_ALIAS)))
 	{
 	  if (dump_file && (dump_flags & TDF_DETAILS))
 	    fprintf (dump_file,

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH PR81832]Skip copying loop header if inner loop is distributed
  2017-08-16 10:43     ` Bin.Cheng
@ 2017-08-16 10:47       ` Richard Sandiford
  2017-08-16 10:49         ` Bin.Cheng
  2017-08-16 10:56       ` Richard Biener
  1 sibling, 1 reply; 7+ messages in thread
From: Richard Sandiford @ 2017-08-16 10:47 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Richard Biener, gcc-patches

"Bin.Cheng" <amker.cheng@gmail.com> writes:
> On Tue, Aug 15, 2017 at 6:33 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Tue, Aug 15, 2017 at 11:28 AM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>>> Hi,
>>>> This patch fixes PR81832.  Root cause for the ICE is:
>>>>   1) Loop has distributed inner loop.
>>>>   2) The guarding function call IFN_LOOP_DIST_CALL happens to be in loop's header.
>>>>   3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect thus
>>>>      not eliminated.
>>>>
>>>> Given pass_ch_vect copies loop header to enable more vectorization, we should
>>>> skip loop in this case because distributed inner loop means this loop can not
>>>> be vectorized anyway.  One point to mention is name inner_loop_distributed_p
>>>> is a little misleading.  The name indicates that each basic block is checked,
>>>> but the patch only checks loop's header for simplicity/efficiency's purpose.
>>>> Any comment?
>>>
>>> My comment would be to question pass_ch_vect placement -- what was the
>>> reason to place it so late?
>>>
>>> I also see GRAPHITE runs inbetween loop distribution and vectorization --
>>> what prevents GRAPHITE from messing up things here?  Or autopar?
>>>
>>> The patch itself shows should_duplicate_loop_header_p should
>>> handle this IFN specially (somehow all IFNs are considered "inexpensive").
>>>
>>> So can you please adjust should_duplicate_loop_header_p instead and/or
>>> gimple_inexpensive_call_p?  Since we have IFNs for stuff like EXP10 I'm not
>>> sure if that default is so good.
>>
>> I think the default itself is OK: we only use IFNs for libm functions
>> like exp10 if an underlying optab exists (unlike __builtin_exp10, which
>> is useful as the non-errno setting version of exp10), so the target must
>> have something that it thinks is worth open-coding.  Also, we currently
>> treat all MD built-ins as "simple" and thus "inexpensive", so the IFN
>> handling is consistent with that.
>>
>> Maybe there are some IFNs that are worth special-casing as expensive,
>> but IMO doing that to solve this problem would be a bit hacky.  It seems
>> like "inexpensive" should be more of a cost thing than a correctness thing.
> Hi all,

> This is updated patch.  It only adds a single check on IFN_LOOP_DIST_ALIAS
> in function should_duplicate_loop_header_p.

Thanks.

> As for gimple_inexpensive_call_p, I think it's natural to consider
> functions like IFN_LOOP_VECTORIZED and IFN_LOOP_DIST_ALIAS as
> expensive because they are only meant to indicate temporary
> arrangement of optimizations and are never used in code generation.
> I will send a standalone patch for that.

Is that enough to consider them expensive though?  To me, "expensive"
should mean that they cost a lot in terms of size or speed (whichever
is most important in context).  Both functions are really cheap in
that sense, since they eventually expand to constants.

Thanks,
Richard

> Another thing is this patch doesn't check IFN_LOOP_VECTORIZED because
> it's impossible to have it with current order of passes.  Bootstrap
> and test ongoing.  Further comments?
>
> Thanks,
> bin
> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>
>     PR tree-optimization/81832
>     * tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Don't
>     copy loop header which has IFN_LOOP_DIST_ALIAS call.
>
> gcc/testsuite
> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>
>     PR tree-optimization/81832
>     * gcc.dg/tree-ssa/pr81832.c: New test.
>
>>
>> Thanks,
>> Richard
>>
>>> Thanks,
>>> Richard.
>>>
>>>> Bootstrap and test on x86_64.
>>>>
>>>> Thanks,
>>>> bin
>>>> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         PR tree-optimization/81832
>>>>         * tree-ssa-loop-ch.c (inner_loop_distributed_p): New function.
>>>>         (pass_ch_vect::process_loop_p): Call above function.
>>>>
>>>> gcc/testsuite
>>>> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         PR tree-optimization/81832
>>>>         * gcc.dg/tree-ssa/pr81832.c: New test.
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c
> new file mode 100644
> index 0000000..893124e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c
> @@ -0,0 +1,22 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +int a, b, *c;
> +void d(void)
> +{
> +    int **e;
> +    for(;;)
> +        for(int f = 1; f <= 6; f++)
> +        {
> +            b = 0;
> +            if(a)
> +g:
> +                while(a++);
> +            if (**e);
> +            else
> +            {
> +                *c = a;
> +                goto g;
> +            }
> +        }
> +}
> diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
> index 14cc6d8d..6bb0220 100644
> --- a/gcc/tree-ssa-loop-ch.c
> +++ b/gcc/tree-ssa-loop-ch.c
> @@ -119,7 +119,10 @@ should_duplicate_loop_header_p (basic_block header, struct loop *loop,
>  	continue;
>  
>        if (gimple_code (last) == GIMPLE_CALL
> -	  && !gimple_inexpensive_call_p (as_a <gcall *> (last)))
> +	  && (!gimple_inexpensive_call_p (as_a <gcall *> (last))
> +	      /* IFN_LOOP_DIST_ALIAS means that inner loop is distributed
> +		 at current loop's header.  Don't copy in this case.  */
> +	      || gimple_call_internal_p (last, IFN_LOOP_DIST_ALIAS)))
>  	{
>  	  if (dump_file && (dump_flags & TDF_DETAILS))
>  	    fprintf (dump_file,

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH PR81832]Skip copying loop header if inner loop is distributed
  2017-08-16 10:47       ` Richard Sandiford
@ 2017-08-16 10:49         ` Bin.Cheng
  0 siblings, 0 replies; 7+ messages in thread
From: Bin.Cheng @ 2017-08-16 10:49 UTC (permalink / raw)
  To: Bin.Cheng, Richard Biener, gcc-patches, Richard Sandiford

On Wed, Aug 16, 2017 at 10:31 AM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> "Bin.Cheng" <amker.cheng@gmail.com> writes:
>> On Tue, Aug 15, 2017 at 6:33 PM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>> On Tue, Aug 15, 2017 at 11:28 AM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>>>> Hi,
>>>>> This patch fixes PR81832.  Root cause for the ICE is:
>>>>>   1) Loop has distributed inner loop.
>>>>>   2) The guarding function call IFN_LOOP_DIST_CALL happens to be in loop's header.
>>>>>   3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect thus
>>>>>      not eliminated.
>>>>>
>>>>> Given pass_ch_vect copies loop header to enable more vectorization, we should
>>>>> skip loop in this case because distributed inner loop means this loop can not
>>>>> be vectorized anyway.  One point to mention is name inner_loop_distributed_p
>>>>> is a little misleading.  The name indicates that each basic block is checked,
>>>>> but the patch only checks loop's header for simplicity/efficiency's purpose.
>>>>> Any comment?
>>>>
>>>> My comment would be to question pass_ch_vect placement -- what was the
>>>> reason to place it so late?
>>>>
>>>> I also see GRAPHITE runs inbetween loop distribution and vectorization --
>>>> what prevents GRAPHITE from messing up things here?  Or autopar?
>>>>
>>>> The patch itself shows should_duplicate_loop_header_p should
>>>> handle this IFN specially (somehow all IFNs are considered "inexpensive").
>>>>
>>>> So can you please adjust should_duplicate_loop_header_p instead and/or
>>>> gimple_inexpensive_call_p?  Since we have IFNs for stuff like EXP10 I'm not
>>>> sure if that default is so good.
>>>
>>> I think the default itself is OK: we only use IFNs for libm functions
>>> like exp10 if an underlying optab exists (unlike __builtin_exp10, which
>>> is useful as the non-errno setting version of exp10), so the target must
>>> have something that it thinks is worth open-coding.  Also, we currently
>>> treat all MD built-ins as "simple" and thus "inexpensive", so the IFN
>>> handling is consistent with that.
>>>
>>> Maybe there are some IFNs that are worth special-casing as expensive,
>>> but IMO doing that to solve this problem would be a bit hacky.  It seems
>>> like "inexpensive" should be more of a cost thing than a correctness thing.
>> Hi all,
>
>> This is updated patch.  It only adds a single check on IFN_LOOP_DIST_ALIAS
>> in function should_duplicate_loop_header_p.
>
> Thanks.
>
>> As for gimple_inexpensive_call_p, I think it's natural to consider
>> functions like IFN_LOOP_VECTORIZED and IFN_LOOP_DIST_ALIAS as
>> expensive because they are only meant to indicate temporary
>> arrangement of optimizations and are never used in code generation.
>> I will send a standalone patch for that.
>
> Is that enough to consider them expensive though?  To me, "expensive"
Not sure.  Or the other hand,  "Expensive" is a measurement of the cost of
generated code.  For internal function calls in discussion, maybe we should
not ask the question in the first.  Even these function calls are expanded to
constant, IMHO, we can't simply consider it's cheap either because there is
high level side-effects along with expanding, i.e, undoing loop versioning.
such high level transformation is not (and should not be) covered by
gimple_inexpensive_call_p.

Thanks,
bin
> should mean that they cost a lot in terms of size or speed (whichever
> is most important in context).  Both functions are really cheap in
> that sense, since they eventually expand to constants.
>
> Thanks,
> Richard
>
>> Another thing is this patch doesn't check IFN_LOOP_VECTORIZED because
>> it's impossible to have it with current order of passes.  Bootstrap
>> and test ongoing.  Further comments?
>>
>> Thanks,
>> bin
>> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>>
>>     PR tree-optimization/81832
>>     * tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Don't
>>     copy loop header which has IFN_LOOP_DIST_ALIAS call.
>>
>> gcc/testsuite
>> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>>
>>     PR tree-optimization/81832
>>     * gcc.dg/tree-ssa/pr81832.c: New test.
>>
>>>
>>> Thanks,
>>> Richard
>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>> Bootstrap and test on x86_64.
>>>>>
>>>>> Thanks,
>>>>> bin
>>>>> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>>>>>
>>>>>         PR tree-optimization/81832
>>>>>         * tree-ssa-loop-ch.c (inner_loop_distributed_p): New function.
>>>>>         (pass_ch_vect::process_loop_p): Call above function.
>>>>>
>>>>> gcc/testsuite
>>>>> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>>>>>
>>>>>         PR tree-optimization/81832
>>>>>         * gcc.dg/tree-ssa/pr81832.c: New test.
>>
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c
>> new file mode 100644
>> index 0000000..893124e
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81832.c
>> @@ -0,0 +1,22 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O3" } */
>> +
>> +int a, b, *c;
>> +void d(void)
>> +{
>> +    int **e;
>> +    for(;;)
>> +        for(int f = 1; f <= 6; f++)
>> +        {
>> +            b = 0;
>> +            if(a)
>> +g:
>> +                while(a++);
>> +            if (**e);
>> +            else
>> +            {
>> +                *c = a;
>> +                goto g;
>> +            }
>> +        }
>> +}
>> diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
>> index 14cc6d8d..6bb0220 100644
>> --- a/gcc/tree-ssa-loop-ch.c
>> +++ b/gcc/tree-ssa-loop-ch.c
>> @@ -119,7 +119,10 @@ should_duplicate_loop_header_p (basic_block header, struct loop *loop,
>>       continue;
>>
>>        if (gimple_code (last) == GIMPLE_CALL
>> -       && !gimple_inexpensive_call_p (as_a <gcall *> (last)))
>> +       && (!gimple_inexpensive_call_p (as_a <gcall *> (last))
>> +           /* IFN_LOOP_DIST_ALIAS means that inner loop is distributed
>> +              at current loop's header.  Don't copy in this case.  */
>> +           || gimple_call_internal_p (last, IFN_LOOP_DIST_ALIAS)))
>>       {
>>         if (dump_file && (dump_flags & TDF_DETAILS))
>>           fprintf (dump_file,

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH PR81832]Skip copying loop header if inner loop is distributed
  2017-08-16 10:43     ` Bin.Cheng
  2017-08-16 10:47       ` Richard Sandiford
@ 2017-08-16 10:56       ` Richard Biener
  1 sibling, 0 replies; 7+ messages in thread
From: Richard Biener @ 2017-08-16 10:56 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: gcc-patches, Richard Sandiford

On Wed, Aug 16, 2017 at 11:22 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, Aug 15, 2017 at 6:33 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Tue, Aug 15, 2017 at 11:28 AM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>>> Hi,
>>>> This patch fixes PR81832.  Root cause for the ICE is:
>>>>   1) Loop has distributed inner loop.
>>>>   2) The guarding function call IFN_LOOP_DIST_CALL happens to be in loop's header.
>>>>   3) IFN_LOOP_DIST_CALL (int loop's header) is duplicated by pass_ch_vect thus
>>>>      not eliminated.
>>>>
>>>> Given pass_ch_vect copies loop header to enable more vectorization, we should
>>>> skip loop in this case because distributed inner loop means this loop can not
>>>> be vectorized anyway.  One point to mention is name inner_loop_distributed_p
>>>> is a little misleading.  The name indicates that each basic block is checked,
>>>> but the patch only checks loop's header for simplicity/efficiency's purpose.
>>>> Any comment?
>>>
>>> My comment would be to question pass_ch_vect placement -- what was the
>>> reason to place it so late?
>>>
>>> I also see GRAPHITE runs inbetween loop distribution and vectorization --
>>> what prevents GRAPHITE from messing up things here?  Or autopar?
>>>
>>> The patch itself shows should_duplicate_loop_header_p should
>>> handle this IFN specially (somehow all IFNs are considered "inexpensive").
>>>
>>> So can you please adjust should_duplicate_loop_header_p instead and/or
>>> gimple_inexpensive_call_p?  Since we have IFNs for stuff like EXP10 I'm not
>>> sure if that default is so good.
>>
>> I think the default itself is OK: we only use IFNs for libm functions
>> like exp10 if an underlying optab exists (unlike __builtin_exp10, which
>> is useful as the non-errno setting version of exp10), so the target must
>> have something that it thinks is worth open-coding.  Also, we currently
>> treat all MD built-ins as "simple" and thus "inexpensive", so the IFN
>> handling is consistent with that.
>>
>> Maybe there are some IFNs that are worth special-casing as expensive,
>> but IMO doing that to solve this problem would be a bit hacky.  It seems
>> like "inexpensive" should be more of a cost thing than a correctness thing.
> Hi all,
> This is updated patch.  It only adds a single check on IFN_LOOP_DIST_ALIAS
> in function should_duplicate_loop_header_p.  As for gimple_inexpensive_call_p,
> I think it's natural to consider functions like IFN_LOOP_VECTORIZED and
> IFN_LOOP_DIST_ALIAS as expensive because they are only meant to indicate
> temporary arrangement of optimizations and are never used in code generation.
> I will send a standalone patch for that.  Another thing is this patch
> doesn't check
> IFN_LOOP_VECTORIZED because it's impossible to have it with current order
> of passes.
> Bootstrap and test ongoing.  Further comments?

Ok.

Thanks,
Richard.

> Thanks,
> bin
> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>
>     PR tree-optimization/81832
>     * tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Don't
>     copy loop header which has IFN_LOOP_DIST_ALIAS call.
>
> gcc/testsuite
> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>
>     PR tree-optimization/81832
>     * gcc.dg/tree-ssa/pr81832.c: New test.
>
>>
>> Thanks,
>> Richard
>>
>>> Thanks,
>>> Richard.
>>>
>>>> Bootstrap and test on x86_64.
>>>>
>>>> Thanks,
>>>> bin
>>>> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         PR tree-optimization/81832
>>>>         * tree-ssa-loop-ch.c (inner_loop_distributed_p): New function.
>>>>         (pass_ch_vect::process_loop_p): Call above function.
>>>>
>>>> gcc/testsuite
>>>> 2017-08-15  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         PR tree-optimization/81832
>>>>         * gcc.dg/tree-ssa/pr81832.c: New test.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-08-16 10:47 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-15 11:26 [PATCH PR81832]Skip copying loop header if inner loop is distributed Bin Cheng
2017-08-15 11:35 ` Richard Biener
2017-08-15 17:51   ` Richard Sandiford
2017-08-16 10:43     ` Bin.Cheng
2017-08-16 10:47       ` Richard Sandiford
2017-08-16 10:49         ` Bin.Cheng
2017-08-16 10:56       ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).