public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* loop-split improvements, part 1
@ 2023-07-28  6:37 Jan Hubicka
  2023-07-28  6:42 ` Richard Biener
  2023-07-28  6:52 ` Andrew Pinski
  0 siblings, 2 replies; 3+ messages in thread
From: Jan Hubicka @ 2023-07-28  6:37 UTC (permalink / raw)
  To: gcc-patches

Hi,
while looking on profile misupdate on hmmer I noticed that loop splitting pass is not
able to handle the loop it has as an example it should apply on:

   One transformation of loops like:

   for (i = 0; i < 100; i++)
     {
       if (i < 50)
         A;
       else
         B;
     }

   into:

   for (i = 0; i < 50; i++)
     {
       A;
     }
   for (; i < 100; i++)
     {
       B;
     }

The problem is that ivcanon turns the test into i != 100 and the pass
explicitly gives up on any loops ending with != test.  It needs to know
the directoin of the induction variable in order to derive right conditions,
but that can be done also from step.

It turns out that there are no testcases for basic loop splitting.  I will add
some with the profile update fix.

There are other issues, like VRP will turn i < 99 into i == 99 based on
value range which also makes the pass to give up.

Bootstrapped/regtested x86_64-linux, OK?

gcc/ChangeLog:

	* tree-ssa-loop-split.cc (split_loop): Also support NE driven
	loops when IV test is not overflowing.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/ifc-12.c: Disable loop splitting.
	* gcc.target/i386/avx2-gather-6.c: Likewise.
	* gcc.target/i386/avx2-vect-aggressive.c: Likewise.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
index 9468c070489..bedf29c7dbc 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Ofast -fdump-tree-ifcvt-stats-blocks-details" } */
+/* { dg-options "-Ofast -fdump-tree-ifcvt-stats-blocks-details -fno-split-loops" } */
 /* { dg-require-visibility "" } */
 
 struct st
diff --git a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
index b9119581ae2..47a95dbe989 100644
--- a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
+++ b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -mavx2 -fno-common -fdump-tree-vect-details -mtune=skylake" } */
+/* { dg-options "-O3 -mavx2 -fno-common -fdump-tree-vect-details -mtune=skylake -fno-split-loops" } */
 
 #include "avx2-gather-5.c"
 
diff --git a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
index 57192791857..fa336e70e84 100644
--- a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
+++ b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
 /* { dg-require-effective-target avx2 } */
-/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1" } */
+/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1 -fno-split-loops" } */
 
 #include "avx2-check.h"
 #define N 64
diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc
index b41b5e614c2..27780370d85 100644
--- a/gcc/tree-ssa-loop-split.cc
+++ b/gcc/tree-ssa-loop-split.cc
@@ -540,10 +545,17 @@ split_loop (class loop *loop1)
       || !empty_block_p (loop1->latch)
       || !easy_exit_values (loop1)
       || !number_of_iterations_exit (loop1, exit1, &niter, false, true)
-      || niter.cmp == ERROR_MARK
-      /* We can't yet handle loops controlled by a != predicate.  */
-      || niter.cmp == NE_EXPR)
+      || niter.cmp == ERROR_MARK)
     return false;
+  if (niter.cmp == NE_EXPR)
+    {
+      if (!niter.control.no_overflow)
+	return false;
+      if (tree_int_cst_sign_bit (niter.control.step) > 0)
+	niter.cmp = GT_EXPR;
+      else
+	niter.cmp = LT_EXPR;
+    }
 
   bbs = get_loop_body (loop1);
 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: loop-split improvements, part 1
  2023-07-28  6:37 loop-split improvements, part 1 Jan Hubicka
@ 2023-07-28  6:42 ` Richard Biener
  2023-07-28  6:52 ` Andrew Pinski
  1 sibling, 0 replies; 3+ messages in thread
From: Richard Biener @ 2023-07-28  6:42 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

On Fri, Jul 28, 2023 at 8:38 AM Jan Hubicka via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
> while looking on profile misupdate on hmmer I noticed that loop splitting pass is not
> able to handle the loop it has as an example it should apply on:
>
>    One transformation of loops like:
>
>    for (i = 0; i < 100; i++)
>      {
>        if (i < 50)
>          A;
>        else
>          B;
>      }
>
>    into:
>
>    for (i = 0; i < 50; i++)
>      {
>        A;
>      }
>    for (; i < 100; i++)
>      {
>        B;
>      }
>
> The problem is that ivcanon turns the test into i != 100 and the pass
> explicitly gives up on any loops ending with != test.  It needs to know
> the directoin of the induction variable in order to derive right conditions,
> but that can be done also from step.
>
> It turns out that there are no testcases for basic loop splitting.  I will add
> some with the profile update fix.
>
> There are other issues, like VRP will turn i < 99 into i == 99 based on
> value range which also makes the pass to give up.
>
> Bootstrapped/regtested x86_64-linux, OK?

OK.

> gcc/ChangeLog:
>
>         * tree-ssa-loop-split.cc (split_loop): Also support NE driven
>         loops when IV test is not overflowing.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/ifc-12.c: Disable loop splitting.
>         * gcc.target/i386/avx2-gather-6.c: Likewise.
>         * gcc.target/i386/avx2-vect-aggressive.c: Likewise.
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
> index 9468c070489..bedf29c7dbc 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-Ofast -fdump-tree-ifcvt-stats-blocks-details" } */
> +/* { dg-options "-Ofast -fdump-tree-ifcvt-stats-blocks-details -fno-split-loops" } */
>  /* { dg-require-visibility "" } */
>
>  struct st
> diff --git a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
> index b9119581ae2..47a95dbe989 100644
> --- a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
> +++ b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O3 -mavx2 -fno-common -fdump-tree-vect-details -mtune=skylake" } */
> +/* { dg-options "-O3 -mavx2 -fno-common -fdump-tree-vect-details -mtune=skylake -fno-split-loops" } */
>
>  #include "avx2-gather-5.c"
>
> diff --git a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
> index 57192791857..fa336e70e84 100644
> --- a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
> +++ b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
> @@ -1,6 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-require-effective-target avx2 } */
> -/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1" } */
> +/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1 -fno-split-loops" } */
>
>  #include "avx2-check.h"
>  #define N 64
> diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc
> index b41b5e614c2..27780370d85 100644
> --- a/gcc/tree-ssa-loop-split.cc
> +++ b/gcc/tree-ssa-loop-split.cc
> @@ -540,10 +545,17 @@ split_loop (class loop *loop1)
>        || !empty_block_p (loop1->latch)
>        || !easy_exit_values (loop1)
>        || !number_of_iterations_exit (loop1, exit1, &niter, false, true)
> -      || niter.cmp == ERROR_MARK
> -      /* We can't yet handle loops controlled by a != predicate.  */
> -      || niter.cmp == NE_EXPR)
> +      || niter.cmp == ERROR_MARK)
>      return false;
> +  if (niter.cmp == NE_EXPR)
> +    {
> +      if (!niter.control.no_overflow)
> +       return false;
> +      if (tree_int_cst_sign_bit (niter.control.step) > 0)
> +       niter.cmp = GT_EXPR;
> +      else
> +       niter.cmp = LT_EXPR;
> +    }
>
>    bbs = get_loop_body (loop1);
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: loop-split improvements, part 1
  2023-07-28  6:37 loop-split improvements, part 1 Jan Hubicka
  2023-07-28  6:42 ` Richard Biener
@ 2023-07-28  6:52 ` Andrew Pinski
  1 sibling, 0 replies; 3+ messages in thread
From: Andrew Pinski @ 2023-07-28  6:52 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

On Thu, Jul 27, 2023 at 11:38 PM Jan Hubicka via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
> while looking on profile misupdate on hmmer I noticed that loop splitting pass is not
> able to handle the loop it has as an example it should apply on:
>
>    One transformation of loops like:
>
>    for (i = 0; i < 100; i++)
>      {
>        if (i < 50)
>          A;
>        else
>          B;
>      }
>
>    into:
>
>    for (i = 0; i < 50; i++)
>      {
>        A;
>      }
>    for (; i < 100; i++)
>      {
>        B;
>      }
>
> The problem is that ivcanon turns the test into i != 100 and the pass
> explicitly gives up on any loops ending with != test.  It needs to know
> the directoin of the induction variable in order to derive right conditions,
> but that can be done also from step.
>
> It turns out that there are no testcases for basic loop splitting.  I will add
> some with the profile update fix.

Thanks for doing this.
Here are bug reports you should look at after all of your loop
splitting patches are done:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37239
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77689

Thanks,
Andrew

>
> There are other issues, like VRP will turn i < 99 into i == 99 based on
> value range which also makes the pass to give up.
>
> Bootstrapped/regtested x86_64-linux, OK?
>
> gcc/ChangeLog:
>
>         * tree-ssa-loop-split.cc (split_loop): Also support NE driven
>         loops when IV test is not overflowing.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/ifc-12.c: Disable loop splitting.
>         * gcc.target/i386/avx2-gather-6.c: Likewise.
>         * gcc.target/i386/avx2-vect-aggressive.c: Likewise.
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
> index 9468c070489..bedf29c7dbc 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-12.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-Ofast -fdump-tree-ifcvt-stats-blocks-details" } */
> +/* { dg-options "-Ofast -fdump-tree-ifcvt-stats-blocks-details -fno-split-loops" } */
>  /* { dg-require-visibility "" } */
>
>  struct st
> diff --git a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
> index b9119581ae2..47a95dbe989 100644
> --- a/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
> +++ b/gcc/testsuite/gcc.target/i386/avx2-gather-6.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O3 -mavx2 -fno-common -fdump-tree-vect-details -mtune=skylake" } */
> +/* { dg-options "-O3 -mavx2 -fno-common -fdump-tree-vect-details -mtune=skylake -fno-split-loops" } */
>
>  #include "avx2-gather-5.c"
>
> diff --git a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
> index 57192791857..fa336e70e84 100644
> --- a/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
> +++ b/gcc/testsuite/gcc.target/i386/avx2-vect-aggressive.c
> @@ -1,6 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-require-effective-target avx2 } */
> -/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1" } */
> +/* { dg-options "-mavx2 -O3 -fopenmp-simd -fdump-tree-vect-details -fdisable-tree-thread1 -fno-split-loops" } */
>
>  #include "avx2-check.h"
>  #define N 64
> diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc
> index b41b5e614c2..27780370d85 100644
> --- a/gcc/tree-ssa-loop-split.cc
> +++ b/gcc/tree-ssa-loop-split.cc
> @@ -540,10 +545,17 @@ split_loop (class loop *loop1)
>        || !empty_block_p (loop1->latch)
>        || !easy_exit_values (loop1)
>        || !number_of_iterations_exit (loop1, exit1, &niter, false, true)
> -      || niter.cmp == ERROR_MARK
> -      /* We can't yet handle loops controlled by a != predicate.  */
> -      || niter.cmp == NE_EXPR)
> +      || niter.cmp == ERROR_MARK)
>      return false;
> +  if (niter.cmp == NE_EXPR)
> +    {
> +      if (!niter.control.no_overflow)
> +       return false;
> +      if (tree_int_cst_sign_bit (niter.control.step) > 0)
> +       niter.cmp = GT_EXPR;
> +      else
> +       niter.cmp = LT_EXPR;
> +    }
>
>    bbs = get_loop_body (loop1);
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-07-28  6:52 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-28  6:37 loop-split improvements, part 1 Jan Hubicka
2023-07-28  6:42 ` Richard Biener
2023-07-28  6:52 ` Andrew Pinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).