public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Add extra copy of the ifcombine pass after pre [PR102793]
@ 2024-04-05 12:26 Manolis Tsamis
  2024-04-05 12:43 ` Richard Biener
  2024-04-05 18:13 ` Andrew Pinski
  0 siblings, 2 replies; 7+ messages in thread
From: Manolis Tsamis @ 2024-04-05 12:26 UTC (permalink / raw)
  To: gcc-patches
  Cc: Andrew Pinski, Philipp Tomsich, Tamar Christina, Jiangning Liu,
	Manolis Tsamis

If we consider code like:

    if (bar1 == x)
      return foo();
    if (bar2 != y)
      return foo();
    return 0;

We would like the ifcombine pass to convert this to:

    if (bar1 == x || bar2 != y)
      return foo();
    return 0;

The ifcombine pass can handle this transformation but it is ran very early and
it misses the opportunity because there are two seperate blocks for foo().
The pre pass is good at removing duplicate code and blocks and due to that
running ifcombine again after it can increase the number of successful
conversions.

        PR 102793

gcc/ChangeLog:

	* common.opt: -ftree-ifcombine option, enabled by default.
	* doc/invoke.texi: Document.
	* passes.def: Re-run ssa-ifcombine after pre.
	* tree-ssa-ifcombine.cc: Make ifcombine cloneable. Add gate function.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/20030922-2.c: Change flag to -fno-tree-ifcombine.
	* gcc.dg/uninit-pred-6_c.c: Remove inconsistent check.
	* gcc.target/aarch64/pr102793.c: New test.

Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
---

 gcc/common.opt                              |  4 +++
 gcc/doc/invoke.texi                         |  5 ++++
 gcc/passes.def                              |  1 +
 gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c  |  2 +-
 gcc/testsuite/gcc.dg/uninit-pred-6_c.c      |  4 ---
 gcc/testsuite/gcc.target/aarch64/pr102793.c | 30 +++++++++++++++++++++
 gcc/tree-ssa-ifcombine.cc                   |  5 ++++
 7 files changed, 46 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr102793.c

diff --git a/gcc/common.opt b/gcc/common.opt
index ad348844775..e943202bcf1 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3163,6 +3163,10 @@ ftree-phiprop
 Common Var(flag_tree_phiprop) Init(1) Optimization
 Enable hoisting loads from conditional pointers.
 
+ftree-ifcombine
+Common Var(flag_tree_ifcombine) Init(1) Optimization
+Merge some conditional branches to simplify control flow.
+
 ftree-pre
 Common Var(flag_tree_pre) Optimization
 Enable SSA-PRE optimization on trees.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e2edf7a6c13..8d2ff6b4512 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13454,6 +13454,11 @@ This flag is enabled by default at @option{-O1} and higher.
 Perform hoisting of loads from conditional pointers on trees.  This
 pass is enabled by default at @option{-O1} and higher.
 
+@opindex ftree-ifcombine
+@item -ftree-ifcombine
+Merge some conditional branches to simplify control flow.  This pass
+is enabled by default at @option{-O1} and higher.
+
 @opindex fhoist-adjacent-loads
 @item -fhoist-adjacent-loads
 Speculatively hoist loads from both branches of an if-then-else if the
diff --git a/gcc/passes.def b/gcc/passes.def
index 1cbbd413097..1765b476131 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_lim);
       NEXT_PASS (pass_walloca, false);
       NEXT_PASS (pass_pre);
+      NEXT_PASS (pass_tree_ifcombine);
       NEXT_PASS (pass_sink_code, false /* unsplit edges */);
       NEXT_PASS (pass_sancov);
       NEXT_PASS (pass_asan);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
index 16c79da9521..66c9f481a2f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O1 -fdump-tree-dom2 -fdisable-tree-ifcombine" } */
+/* { dg-options "-O1 -fdump-tree-dom2 -fno-tree-ifcombine" } */
 
 struct rtx_def;
 typedef struct rtx_def *rtx;
diff --git a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
index f60868dad23..2d8e6501a45 100644
--- a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
+++ b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
@@ -20,10 +20,6 @@ int foo (int n, int l, int m, int r)
   if ( (n > 10) && l)
       blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
 
-  if (l)
-    if (n > 12)
-      blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
-
   return 0;
 }
 
diff --git a/gcc/testsuite/gcc.target/aarch64/pr102793.c b/gcc/testsuite/gcc.target/aarch64/pr102793.c
new file mode 100644
index 00000000000..78d48e01637
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr102793.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+typedef unsigned long uint64_t;
+
+int ccmp1(uint64_t* s1, uint64_t* s2, int(*foo)(void))
+{
+  uint64_t d1, d2, bar;
+  d1 = *s1++;
+  d2 = *s2++;
+  bar = (d1 + d2) & 0xabcd;
+  if (bar == 0 || d1 != d2)
+    return foo();
+  return 0;
+}
+
+int ccmp2(uint64_t* s1, uint64_t* s2, int(*foo)(void))
+{
+  uint64_t d1, d2, bar;
+  d1 = *s1++;
+  d2 = *s2++;
+  bar = (d1 + d2) & 0xabcd;
+  if (bar == 0)
+    return foo();
+  if (d1 != d2)
+    return foo();
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "ccmp\t" 2 } } */
\ No newline at end of file
diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
index 6a3bc99190d..0bf9fe8b692 100644
--- a/gcc/tree-ssa-ifcombine.cc
+++ b/gcc/tree-ssa-ifcombine.cc
@@ -838,6 +838,11 @@ public:
   {}
 
   /* opt_pass methods: */
+  opt_pass * clone () final override
+  {
+    return new pass_tree_ifcombine (m_ctxt);
+  }
+  bool gate (function *) final override { return flag_tree_ifcombine; }
   unsigned int execute (function *) final override;
 
 }; // class pass_tree_ifcombine
-- 
2.44.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Add extra copy of the ifcombine pass after pre [PR102793]
  2024-04-05 12:26 [PATCH] Add extra copy of the ifcombine pass after pre [PR102793] Manolis Tsamis
@ 2024-04-05 12:43 ` Richard Biener
  2024-04-05 13:33   ` Manolis Tsamis
  2024-04-05 18:13 ` Andrew Pinski
  1 sibling, 1 reply; 7+ messages in thread
From: Richard Biener @ 2024-04-05 12:43 UTC (permalink / raw)
  To: Manolis Tsamis
  Cc: gcc-patches, Andrew Pinski, Philipp Tomsich, Tamar Christina,
	Jiangning Liu

On Fri, Apr 5, 2024 at 2:28 PM Manolis Tsamis <manolis.tsamis@vrull.eu> wrote:
>
> If we consider code like:
>
>     if (bar1 == x)
>       return foo();
>     if (bar2 != y)
>       return foo();
>     return 0;
>
> We would like the ifcombine pass to convert this to:
>
>     if (bar1 == x || bar2 != y)
>       return foo();
>     return 0;
>
> The ifcombine pass can handle this transformation but it is ran very early and
> it misses the opportunity because there are two seperate blocks for foo().
> The pre pass is good at removing duplicate code and blocks and due to that
> running ifcombine again after it can increase the number of successful
> conversions.
>
>         PR 102793
>
> gcc/ChangeLog:
>
>         * common.opt: -ftree-ifcombine option, enabled by default.
>         * doc/invoke.texi: Document.
>         * passes.def: Re-run ssa-ifcombine after pre.
>         * tree-ssa-ifcombine.cc: Make ifcombine cloneable. Add gate function.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/20030922-2.c: Change flag to -fno-tree-ifcombine.
>         * gcc.dg/uninit-pred-6_c.c: Remove inconsistent check.
>         * gcc.target/aarch64/pr102793.c: New test.
>
> Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
> ---
>
>  gcc/common.opt                              |  4 +++
>  gcc/doc/invoke.texi                         |  5 ++++
>  gcc/passes.def                              |  1 +
>  gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c  |  2 +-
>  gcc/testsuite/gcc.dg/uninit-pred-6_c.c      |  4 ---
>  gcc/testsuite/gcc.target/aarch64/pr102793.c | 30 +++++++++++++++++++++
>  gcc/tree-ssa-ifcombine.cc                   |  5 ++++
>  7 files changed, 46 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr102793.c
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index ad348844775..e943202bcf1 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3163,6 +3163,10 @@ ftree-phiprop
>  Common Var(flag_tree_phiprop) Init(1) Optimization
>  Enable hoisting loads from conditional pointers.
>
> +ftree-ifcombine

Please don't add further -ftree-X flags, 'tree' means nothing
to users.  -fif-combine would be better.

> +Common Var(flag_tree_ifcombine) Init(1) Optimization
> +Merge some conditional branches to simplify control flow.
> +
>  ftree-pre
>  Common Var(flag_tree_pre) Optimization
>  Enable SSA-PRE optimization on trees.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e2edf7a6c13..8d2ff6b4512 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13454,6 +13454,11 @@ This flag is enabled by default at @option{-O1} and higher.
>  Perform hoisting of loads from conditional pointers on trees.  This
>  pass is enabled by default at @option{-O1} and higher.
>
> +@opindex ftree-ifcombine
> +@item -ftree-ifcombine
> +Merge some conditional branches to simplify control flow.  This pass
> +is enabled by default at @option{-O1} and higher.
> +
>  @opindex fhoist-adjacent-loads
>  @item -fhoist-adjacent-loads
>  Speculatively hoist loads from both branches of an if-then-else if the
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 1cbbd413097..1765b476131 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_lim);
>        NEXT_PASS (pass_walloca, false);
>        NEXT_PASS (pass_pre);
> +      NEXT_PASS (pass_tree_ifcombine);
>        NEXT_PASS (pass_sink_code, false /* unsplit edges */);

Please move it here, after sinking.

>        NEXT_PASS (pass_sancov);
>        NEXT_PASS (pass_asan);
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> index 16c79da9521..66c9f481a2f 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O1 -fdump-tree-dom2 -fdisable-tree-ifcombine" } */
> +/* { dg-options "-O1 -fdump-tree-dom2 -fno-tree-ifcombine" } */
>
>  struct rtx_def;
>  typedef struct rtx_def *rtx;
> diff --git a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> index f60868dad23..2d8e6501a45 100644
> --- a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> +++ b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> @@ -20,10 +20,6 @@ int foo (int n, int l, int m, int r)
>    if ( (n > 10) && l)
>        blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
>
> -  if (l)
> -    if (n > 12)
> -      blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
> -

What's "inconsistent" about this check?  I suppose we now diagnose this?
The appropriate way would be to XFAIL this but I'd like you to explain
why we now diagnose this (I don't see obvious if-combining opportunities).

On a general note you rely on the tail-merging pass which is part of PRE
and which hasn't seen any love and which isn't very powerful either.  I'm not
sure it's worth doing if-combining on the whole IL again because of it.
It might be possible to locally try if-combining from the immediate dominator
of a merged tail from inside tail-merging itself?

Richard.

>    return 0;
>  }
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr102793.c b/gcc/testsuite/gcc.target/aarch64/pr102793.c
> new file mode 100644
> index 00000000000..78d48e01637
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr102793.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +typedef unsigned long uint64_t;
> +
> +int ccmp1(uint64_t* s1, uint64_t* s2, int(*foo)(void))
> +{
> +  uint64_t d1, d2, bar;
> +  d1 = *s1++;
> +  d2 = *s2++;
> +  bar = (d1 + d2) & 0xabcd;
> +  if (bar == 0 || d1 != d2)
> +    return foo();
> +  return 0;
> +}
> +
> +int ccmp2(uint64_t* s1, uint64_t* s2, int(*foo)(void))
> +{
> +  uint64_t d1, d2, bar;
> +  d1 = *s1++;
> +  d2 = *s2++;
> +  bar = (d1 + d2) & 0xabcd;
> +  if (bar == 0)
> +    return foo();
> +  if (d1 != d2)
> +    return foo();
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times "ccmp\t" 2 } } */
> \ No newline at end of file
> diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
> index 6a3bc99190d..0bf9fe8b692 100644
> --- a/gcc/tree-ssa-ifcombine.cc
> +++ b/gcc/tree-ssa-ifcombine.cc
> @@ -838,6 +838,11 @@ public:
>    {}
>
>    /* opt_pass methods: */
> +  opt_pass * clone () final override
> +  {
> +    return new pass_tree_ifcombine (m_ctxt);
> +  }
> +  bool gate (function *) final override { return flag_tree_ifcombine; }
>    unsigned int execute (function *) final override;
>
>  }; // class pass_tree_ifcombine
> --
> 2.44.0
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Add extra copy of the ifcombine pass after pre [PR102793]
  2024-04-05 12:43 ` Richard Biener
@ 2024-04-05 13:33   ` Manolis Tsamis
  0 siblings, 0 replies; 7+ messages in thread
From: Manolis Tsamis @ 2024-04-05 13:33 UTC (permalink / raw)
  To: Richard Biener
  Cc: gcc-patches, Andrew Pinski, Philipp Tomsich, Tamar Christina,
	Jiangning Liu

On Fri, Apr 5, 2024 at 3:43 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Fri, Apr 5, 2024 at 2:28 PM Manolis Tsamis <manolis.tsamis@vrull.eu> wrote:
> >
> > If we consider code like:
> >
> >     if (bar1 == x)
> >       return foo();
> >     if (bar2 != y)
> >       return foo();
> >     return 0;
> >
> > We would like the ifcombine pass to convert this to:
> >
> >     if (bar1 == x || bar2 != y)
> >       return foo();
> >     return 0;
> >
> > The ifcombine pass can handle this transformation but it is ran very early and
> > it misses the opportunity because there are two seperate blocks for foo().
> > The pre pass is good at removing duplicate code and blocks and due to that
> > running ifcombine again after it can increase the number of successful
> > conversions.
> >
> >         PR 102793
> >
> > gcc/ChangeLog:
> >
> >         * common.opt: -ftree-ifcombine option, enabled by default.
> >         * doc/invoke.texi: Document.
> >         * passes.def: Re-run ssa-ifcombine after pre.
> >         * tree-ssa-ifcombine.cc: Make ifcombine cloneable. Add gate function.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.dg/tree-ssa/20030922-2.c: Change flag to -fno-tree-ifcombine.
> >         * gcc.dg/uninit-pred-6_c.c: Remove inconsistent check.
> >         * gcc.target/aarch64/pr102793.c: New test.
> >
> > Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
> > ---
> >
> >  gcc/common.opt                              |  4 +++
> >  gcc/doc/invoke.texi                         |  5 ++++
> >  gcc/passes.def                              |  1 +
> >  gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c  |  2 +-
> >  gcc/testsuite/gcc.dg/uninit-pred-6_c.c      |  4 ---
> >  gcc/testsuite/gcc.target/aarch64/pr102793.c | 30 +++++++++++++++++++++
> >  gcc/tree-ssa-ifcombine.cc                   |  5 ++++
> >  7 files changed, 46 insertions(+), 5 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr102793.c
> >
> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index ad348844775..e943202bcf1 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -3163,6 +3163,10 @@ ftree-phiprop
> >  Common Var(flag_tree_phiprop) Init(1) Optimization
> >  Enable hoisting loads from conditional pointers.
> >
> > +ftree-ifcombine
>
> Please don't add further -ftree-X flags, 'tree' means nothing
> to users.  -fif-combine would be better.
>
Ok, thanks for pointing out, will do.

> > +Common Var(flag_tree_ifcombine) Init(1) Optimization
> > +Merge some conditional branches to simplify control flow.
> > +
> >  ftree-pre
> >  Common Var(flag_tree_pre) Optimization
> >  Enable SSA-PRE optimization on trees.
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index e2edf7a6c13..8d2ff6b4512 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -13454,6 +13454,11 @@ This flag is enabled by default at @option{-O1} and higher.
> >  Perform hoisting of loads from conditional pointers on trees.  This
> >  pass is enabled by default at @option{-O1} and higher.
> >
> > +@opindex ftree-ifcombine
> > +@item -ftree-ifcombine
> > +Merge some conditional branches to simplify control flow.  This pass
> > +is enabled by default at @option{-O1} and higher.
> > +
> >  @opindex fhoist-adjacent-loads
> >  @item -fhoist-adjacent-loads
> >  Speculatively hoist loads from both branches of an if-then-else if the
> > diff --git a/gcc/passes.def b/gcc/passes.def
> > index 1cbbd413097..1765b476131 100644
> > --- a/gcc/passes.def
> > +++ b/gcc/passes.def
> > @@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
> >        NEXT_PASS (pass_lim);
> >        NEXT_PASS (pass_walloca, false);
> >        NEXT_PASS (pass_pre);
> > +      NEXT_PASS (pass_tree_ifcombine);
> >        NEXT_PASS (pass_sink_code, false /* unsplit edges */);
>
> Please move it here, after sinking.
>
My initial placement was there, but I saw that in some cases
(including the testcase), sinking would introduce some blocks that
prohibited the ifcombining.

> >        NEXT_PASS (pass_sancov);
> >        NEXT_PASS (pass_asan);
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> > index 16c79da9521..66c9f481a2f 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> > @@ -1,5 +1,5 @@
> >  /* { dg-do compile } */
> > -/* { dg-options "-O1 -fdump-tree-dom2 -fdisable-tree-ifcombine" } */
> > +/* { dg-options "-O1 -fdump-tree-dom2 -fno-tree-ifcombine" } */
> >
> >  struct rtx_def;
> >  typedef struct rtx_def *rtx;
> > diff --git a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> > index f60868dad23..2d8e6501a45 100644
> > --- a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> > +++ b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> > @@ -20,10 +20,6 @@ int foo (int n, int l, int m, int r)
> >    if ( (n > 10) && l)
> >        blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
> >
> > -  if (l)
> > -    if (n > 12)
> > -      blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
> > -
>
> What's "inconsistent" about this check?  I suppose we now diagnose this?
> The appropriate way would be to XFAIL this but I'd like you to explain
> why we now diagnose this (I don't see obvious if-combining opportunities).
>

Maybe inconsistent wasn't the correct word, but what I meant was:
If the condition in the test is merged into if (l && (n > 12)) then
the "bogus warning" appears on master too and the test fails. By
inconsistent I meant that the analysis relies on a particular
arrangement of conditionals that is fragile.

After this change the branch is optimized by ifcombine and the "bogus
warning" appears. We also get better codegen, at least for AArch64.

It felt wrong to just remove this, it didn't occur to me that this can
be made XFAIL instead. I can change it to that if you think it's
justified.

> On a general note you rely on the tail-merging pass which is part of PRE
> and which hasn't seen any love and which isn't very powerful either.  I'm not
> sure it's worth doing if-combining on the whole IL again because of it.
> It might be possible to locally try if-combining from the immediate dominator
> of a merged tail from inside tail-merging itself?
>

I had a feeling that the ifcombine pass is not very expensive and
that's why I opted to just duplicate it (instead of e.g. moving it).
I'm not really familiar with PRE so I made the placement just by
observing the block deduplication.
I don't know the answer to the last part, but if you believe it's
worthwhile I can try it out.

Thanks,
Manolis

> Richard.
>
> >    return 0;
> >  }
> >
> > diff --git a/gcc/testsuite/gcc.target/aarch64/pr102793.c b/gcc/testsuite/gcc.target/aarch64/pr102793.c
> > new file mode 100644
> > index 00000000000..78d48e01637
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/pr102793.c
> > @@ -0,0 +1,30 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2" } */
> > +
> > +typedef unsigned long uint64_t;
> > +
> > +int ccmp1(uint64_t* s1, uint64_t* s2, int(*foo)(void))
> > +{
> > +  uint64_t d1, d2, bar;
> > +  d1 = *s1++;
> > +  d2 = *s2++;
> > +  bar = (d1 + d2) & 0xabcd;
> > +  if (bar == 0 || d1 != d2)
> > +    return foo();
> > +  return 0;
> > +}
> > +
> > +int ccmp2(uint64_t* s1, uint64_t* s2, int(*foo)(void))
> > +{
> > +  uint64_t d1, d2, bar;
> > +  d1 = *s1++;
> > +  d2 = *s2++;
> > +  bar = (d1 + d2) & 0xabcd;
> > +  if (bar == 0)
> > +    return foo();
> > +  if (d1 != d2)
> > +    return foo();
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "ccmp\t" 2 } } */
> > \ No newline at end of file
> > diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
> > index 6a3bc99190d..0bf9fe8b692 100644
> > --- a/gcc/tree-ssa-ifcombine.cc
> > +++ b/gcc/tree-ssa-ifcombine.cc
> > @@ -838,6 +838,11 @@ public:
> >    {}
> >
> >    /* opt_pass methods: */
> > +  opt_pass * clone () final override
> > +  {
> > +    return new pass_tree_ifcombine (m_ctxt);
> > +  }
> > +  bool gate (function *) final override { return flag_tree_ifcombine; }
> >    unsigned int execute (function *) final override;
> >
> >  }; // class pass_tree_ifcombine
> > --
> > 2.44.0
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Add extra copy of the ifcombine pass after pre [PR102793]
  2024-04-05 12:26 [PATCH] Add extra copy of the ifcombine pass after pre [PR102793] Manolis Tsamis
  2024-04-05 12:43 ` Richard Biener
@ 2024-04-05 18:13 ` Andrew Pinski
  2024-05-16  8:35   ` Richard Biener
  1 sibling, 1 reply; 7+ messages in thread
From: Andrew Pinski @ 2024-04-05 18:13 UTC (permalink / raw)
  To: Manolis Tsamis
  Cc: gcc-patches, Andrew Pinski, Philipp Tomsich, Tamar Christina,
	Jiangning Liu

On Fri, Apr 5, 2024 at 5:28 AM Manolis Tsamis <manolis.tsamis@vrull.eu> wrote:
>
> If we consider code like:
>
>     if (bar1 == x)
>       return foo();
>     if (bar2 != y)
>       return foo();
>     return 0;
>
> We would like the ifcombine pass to convert this to:
>
>     if (bar1 == x || bar2 != y)
>       return foo();
>     return 0;
>
> The ifcombine pass can handle this transformation but it is ran very early and
> it misses the opportunity because there are two seperate blocks for foo().
> The pre pass is good at removing duplicate code and blocks and due to that
> running ifcombine again after it can increase the number of successful
> conversions.

I do think we should have something similar to re-running
ssa-ifcombine but I think it should be much later, like after the loop
optimizations are done.
Maybe just a simplified version of it (that does the combining and not
the optimizations part) included in isel or pass_optimize_widening_mul
(which itself should most likely become part of isel or renamed since
it handles more than just widening multiply these days).


Thanks,
Andrew Pinski


>
>         PR 102793
>
> gcc/ChangeLog:
>
>         * common.opt: -ftree-ifcombine option, enabled by default.
>         * doc/invoke.texi: Document.
>         * passes.def: Re-run ssa-ifcombine after pre.
>         * tree-ssa-ifcombine.cc: Make ifcombine cloneable. Add gate function.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/20030922-2.c: Change flag to -fno-tree-ifcombine.
>         * gcc.dg/uninit-pred-6_c.c: Remove inconsistent check.
>         * gcc.target/aarch64/pr102793.c: New test.
>
> Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
> ---
>
>  gcc/common.opt                              |  4 +++
>  gcc/doc/invoke.texi                         |  5 ++++
>  gcc/passes.def                              |  1 +
>  gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c  |  2 +-
>  gcc/testsuite/gcc.dg/uninit-pred-6_c.c      |  4 ---
>  gcc/testsuite/gcc.target/aarch64/pr102793.c | 30 +++++++++++++++++++++
>  gcc/tree-ssa-ifcombine.cc                   |  5 ++++
>  7 files changed, 46 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr102793.c
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index ad348844775..e943202bcf1 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3163,6 +3163,10 @@ ftree-phiprop
>  Common Var(flag_tree_phiprop) Init(1) Optimization
>  Enable hoisting loads from conditional pointers.
>
> +ftree-ifcombine
> +Common Var(flag_tree_ifcombine) Init(1) Optimization
> +Merge some conditional branches to simplify control flow.
> +
>  ftree-pre
>  Common Var(flag_tree_pre) Optimization
>  Enable SSA-PRE optimization on trees.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e2edf7a6c13..8d2ff6b4512 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13454,6 +13454,11 @@ This flag is enabled by default at @option{-O1} and higher.
>  Perform hoisting of loads from conditional pointers on trees.  This
>  pass is enabled by default at @option{-O1} and higher.
>
> +@opindex ftree-ifcombine
> +@item -ftree-ifcombine
> +Merge some conditional branches to simplify control flow.  This pass
> +is enabled by default at @option{-O1} and higher.
> +
>  @opindex fhoist-adjacent-loads
>  @item -fhoist-adjacent-loads
>  Speculatively hoist loads from both branches of an if-then-else if the
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 1cbbd413097..1765b476131 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_lim);
>        NEXT_PASS (pass_walloca, false);
>        NEXT_PASS (pass_pre);
> +      NEXT_PASS (pass_tree_ifcombine);
>        NEXT_PASS (pass_sink_code, false /* unsplit edges */);
>        NEXT_PASS (pass_sancov);
>        NEXT_PASS (pass_asan);
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> index 16c79da9521..66c9f481a2f 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O1 -fdump-tree-dom2 -fdisable-tree-ifcombine" } */
> +/* { dg-options "-O1 -fdump-tree-dom2 -fno-tree-ifcombine" } */
>
>  struct rtx_def;
>  typedef struct rtx_def *rtx;
> diff --git a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> index f60868dad23..2d8e6501a45 100644
> --- a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> +++ b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> @@ -20,10 +20,6 @@ int foo (int n, int l, int m, int r)
>    if ( (n > 10) && l)
>        blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
>
> -  if (l)
> -    if (n > 12)
> -      blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
> -
>    return 0;
>  }
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr102793.c b/gcc/testsuite/gcc.target/aarch64/pr102793.c
> new file mode 100644
> index 00000000000..78d48e01637
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr102793.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +typedef unsigned long uint64_t;
> +
> +int ccmp1(uint64_t* s1, uint64_t* s2, int(*foo)(void))
> +{
> +  uint64_t d1, d2, bar;
> +  d1 = *s1++;
> +  d2 = *s2++;
> +  bar = (d1 + d2) & 0xabcd;
> +  if (bar == 0 || d1 != d2)
> +    return foo();
> +  return 0;
> +}
> +
> +int ccmp2(uint64_t* s1, uint64_t* s2, int(*foo)(void))
> +{
> +  uint64_t d1, d2, bar;
> +  d1 = *s1++;
> +  d2 = *s2++;
> +  bar = (d1 + d2) & 0xabcd;
> +  if (bar == 0)
> +    return foo();
> +  if (d1 != d2)
> +    return foo();
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times "ccmp\t" 2 } } */
> \ No newline at end of file
> diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
> index 6a3bc99190d..0bf9fe8b692 100644
> --- a/gcc/tree-ssa-ifcombine.cc
> +++ b/gcc/tree-ssa-ifcombine.cc
> @@ -838,6 +838,11 @@ public:
>    {}
>
>    /* opt_pass methods: */
> +  opt_pass * clone () final override
> +  {
> +    return new pass_tree_ifcombine (m_ctxt);
> +  }
> +  bool gate (function *) final override { return flag_tree_ifcombine; }
>    unsigned int execute (function *) final override;
>
>  }; // class pass_tree_ifcombine
> --
> 2.44.0
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Add extra copy of the ifcombine pass after pre [PR102793]
  2024-04-05 18:13 ` Andrew Pinski
@ 2024-05-16  8:35   ` Richard Biener
  2024-05-16 10:55     ` Oleg Endo
  0 siblings, 1 reply; 7+ messages in thread
From: Richard Biener @ 2024-05-16  8:35 UTC (permalink / raw)
  To: Andrew Pinski
  Cc: Manolis Tsamis, gcc-patches, Andrew Pinski, Philipp Tomsich,
	Tamar Christina, Jiangning Liu

On Fri, Apr 5, 2024 at 8:14 PM Andrew Pinski <pinskia@gmail.com> wrote:
>
> On Fri, Apr 5, 2024 at 5:28 AM Manolis Tsamis <manolis.tsamis@vrull.eu> wrote:
> >
> > If we consider code like:
> >
> >     if (bar1 == x)
> >       return foo();
> >     if (bar2 != y)
> >       return foo();
> >     return 0;
> >
> > We would like the ifcombine pass to convert this to:
> >
> >     if (bar1 == x || bar2 != y)
> >       return foo();
> >     return 0;
> >
> > The ifcombine pass can handle this transformation but it is ran very early and
> > it misses the opportunity because there are two seperate blocks for foo().
> > The pre pass is good at removing duplicate code and blocks and due to that
> > running ifcombine again after it can increase the number of successful
> > conversions.
>
> I do think we should have something similar to re-running
> ssa-ifcombine but I think it should be much later, like after the loop
> optimizations are done.
> Maybe just a simplified version of it (that does the combining and not
> the optimizations part) included in isel or pass_optimize_widening_mul
> (which itself should most likely become part of isel or renamed since
> it handles more than just widening multiply these days).

I've long wished we had a (late?) pass that can also undo if-conversion
(basically do what RTL expansion would later do).  Maybe
gimple-predicate-analysis.cc (what's used by uninit analysis) can
represent mixed CFG + if-converted conditions so we can optimize
it and code-gen the condition in a more optimal manner much like
we have if-to-switch, switch-conversion and switch-expansion.

That said, I agree that re-running ifcombine should be later.  And there's
still the old task of splitting tail-merging from PRE (and possibly making
it more effective).

Richard.

>
> Thanks,
> Andrew Pinski
>
>
> >
> >         PR 102793
> >
> > gcc/ChangeLog:
> >
> >         * common.opt: -ftree-ifcombine option, enabled by default.
> >         * doc/invoke.texi: Document.
> >         * passes.def: Re-run ssa-ifcombine after pre.
> >         * tree-ssa-ifcombine.cc: Make ifcombine cloneable. Add gate function.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.dg/tree-ssa/20030922-2.c: Change flag to -fno-tree-ifcombine.
> >         * gcc.dg/uninit-pred-6_c.c: Remove inconsistent check.
> >         * gcc.target/aarch64/pr102793.c: New test.
> >
> > Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
> > ---
> >
> >  gcc/common.opt                              |  4 +++
> >  gcc/doc/invoke.texi                         |  5 ++++
> >  gcc/passes.def                              |  1 +
> >  gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c  |  2 +-
> >  gcc/testsuite/gcc.dg/uninit-pred-6_c.c      |  4 ---
> >  gcc/testsuite/gcc.target/aarch64/pr102793.c | 30 +++++++++++++++++++++
> >  gcc/tree-ssa-ifcombine.cc                   |  5 ++++
> >  7 files changed, 46 insertions(+), 5 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr102793.c
> >
> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index ad348844775..e943202bcf1 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -3163,6 +3163,10 @@ ftree-phiprop
> >  Common Var(flag_tree_phiprop) Init(1) Optimization
> >  Enable hoisting loads from conditional pointers.
> >
> > +ftree-ifcombine
> > +Common Var(flag_tree_ifcombine) Init(1) Optimization
> > +Merge some conditional branches to simplify control flow.
> > +
> >  ftree-pre
> >  Common Var(flag_tree_pre) Optimization
> >  Enable SSA-PRE optimization on trees.
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index e2edf7a6c13..8d2ff6b4512 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -13454,6 +13454,11 @@ This flag is enabled by default at @option{-O1} and higher.
> >  Perform hoisting of loads from conditional pointers on trees.  This
> >  pass is enabled by default at @option{-O1} and higher.
> >
> > +@opindex ftree-ifcombine
> > +@item -ftree-ifcombine
> > +Merge some conditional branches to simplify control flow.  This pass
> > +is enabled by default at @option{-O1} and higher.
> > +
> >  @opindex fhoist-adjacent-loads
> >  @item -fhoist-adjacent-loads
> >  Speculatively hoist loads from both branches of an if-then-else if the
> > diff --git a/gcc/passes.def b/gcc/passes.def
> > index 1cbbd413097..1765b476131 100644
> > --- a/gcc/passes.def
> > +++ b/gcc/passes.def
> > @@ -270,6 +270,7 @@ along with GCC; see the file COPYING3.  If not see
> >        NEXT_PASS (pass_lim);
> >        NEXT_PASS (pass_walloca, false);
> >        NEXT_PASS (pass_pre);
> > +      NEXT_PASS (pass_tree_ifcombine);
> >        NEXT_PASS (pass_sink_code, false /* unsplit edges */);
> >        NEXT_PASS (pass_sancov);
> >        NEXT_PASS (pass_asan);
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> > index 16c79da9521..66c9f481a2f 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/20030922-2.c
> > @@ -1,5 +1,5 @@
> >  /* { dg-do compile } */
> > -/* { dg-options "-O1 -fdump-tree-dom2 -fdisable-tree-ifcombine" } */
> > +/* { dg-options "-O1 -fdump-tree-dom2 -fno-tree-ifcombine" } */
> >
> >  struct rtx_def;
> >  typedef struct rtx_def *rtx;
> > diff --git a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> > index f60868dad23..2d8e6501a45 100644
> > --- a/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> > +++ b/gcc/testsuite/gcc.dg/uninit-pred-6_c.c
> > @@ -20,10 +20,6 @@ int foo (int n, int l, int m, int r)
> >    if ( (n > 10) && l)
> >        blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
> >
> > -  if (l)
> > -    if (n > 12)
> > -      blah(v); /* { dg-bogus "uninitialized" "bogus warning" } */
> > -
> >    return 0;
> >  }
> >
> > diff --git a/gcc/testsuite/gcc.target/aarch64/pr102793.c b/gcc/testsuite/gcc.target/aarch64/pr102793.c
> > new file mode 100644
> > index 00000000000..78d48e01637
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/pr102793.c
> > @@ -0,0 +1,30 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2" } */
> > +
> > +typedef unsigned long uint64_t;
> > +
> > +int ccmp1(uint64_t* s1, uint64_t* s2, int(*foo)(void))
> > +{
> > +  uint64_t d1, d2, bar;
> > +  d1 = *s1++;
> > +  d2 = *s2++;
> > +  bar = (d1 + d2) & 0xabcd;
> > +  if (bar == 0 || d1 != d2)
> > +    return foo();
> > +  return 0;
> > +}
> > +
> > +int ccmp2(uint64_t* s1, uint64_t* s2, int(*foo)(void))
> > +{
> > +  uint64_t d1, d2, bar;
> > +  d1 = *s1++;
> > +  d2 = *s2++;
> > +  bar = (d1 + d2) & 0xabcd;
> > +  if (bar == 0)
> > +    return foo();
> > +  if (d1 != d2)
> > +    return foo();
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "ccmp\t" 2 } } */
> > \ No newline at end of file
> > diff --git a/gcc/tree-ssa-ifcombine.cc b/gcc/tree-ssa-ifcombine.cc
> > index 6a3bc99190d..0bf9fe8b692 100644
> > --- a/gcc/tree-ssa-ifcombine.cc
> > +++ b/gcc/tree-ssa-ifcombine.cc
> > @@ -838,6 +838,11 @@ public:
> >    {}
> >
> >    /* opt_pass methods: */
> > +  opt_pass * clone () final override
> > +  {
> > +    return new pass_tree_ifcombine (m_ctxt);
> > +  }
> > +  bool gate (function *) final override { return flag_tree_ifcombine; }
> >    unsigned int execute (function *) final override;
> >
> >  }; // class pass_tree_ifcombine
> > --
> > 2.44.0
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Add extra copy of the ifcombine pass after pre [PR102793]
  2024-05-16  8:35   ` Richard Biener
@ 2024-05-16 10:55     ` Oleg Endo
  2024-05-16 11:03       ` Andrew Pinski
  0 siblings, 1 reply; 7+ messages in thread
From: Oleg Endo @ 2024-05-16 10:55 UTC (permalink / raw)
  To: Richard Biener, Andrew Pinski
  Cc: Manolis Tsamis, gcc-patches, Andrew Pinski, Philipp Tomsich,
	Tamar Christina, Jiangning Liu


On Thu, 2024-05-16 at 10:35 +0200, Richard Biener wrote:
> On Fri, Apr 5, 2024 at 8:14 PM Andrew Pinski <pinskia@gmail.com> wrote:
> > 
> > On Fri, Apr 5, 2024 at 5:28 AM Manolis Tsamis <manolis.tsamis@vrull.eu> wrote:
> > > 
> > > If we consider code like:
> > > 
> > >     if (bar1 == x)
> > >       return foo();
> > >     if (bar2 != y)
> > >       return foo();
> > >     return 0;
> > > 
> > > We would like the ifcombine pass to convert this to:
> > > 
> > >     if (bar1 == x || bar2 != y)
> > >       return foo();
> > >     return 0;
> > > 
> > > The ifcombine pass can handle this transformation but it is ran very early and
> > > it misses the opportunity because there are two seperate blocks for foo().
> > > The pre pass is good at removing duplicate code and blocks and due to that
> > > running ifcombine again after it can increase the number of successful
> > > conversions.
> > 
> > I do think we should have something similar to re-running
> > ssa-ifcombine but I think it should be much later, like after the loop
> > optimizations are done.
> > Maybe just a simplified version of it (that does the combining and not
> > the optimizations part) included in isel or pass_optimize_widening_mul
> > (which itself should most likely become part of isel or renamed since
> > it handles more than just widening multiply these days).
> 
> I've long wished we had a (late?) pass that can also undo if-conversion
> (basically do what RTL expansion would later do).  Maybe
> gimple-predicate-analysis.cc (what's used by uninit analysis) can
> represent mixed CFG + if-converted conditions so we can optimize
> it and code-gen the condition in a more optimal manner much like
> we have if-to-switch, switch-conversion and switch-expansion.
> 
> That said, I agree that re-running ifcombine should be later.  And there's
> still the old task of splitting tail-merging from PRE (and possibly making
> it more effective).

Sorry to butt in, but it might be little bit relevant and caught my
attention.

I've got this SH patch sitting around
https://gcc.gnu.org/bugzilla/attachment.cgi?id=55543

The idea is basically to run an additional loop pass after combine and
split1.  The main purpose is to hoist constant loads out of loops. Such
constant loads might be formed (in this particular case) during combine
transformations.

The patch adds a new file gcc/config/sh/sh_loop.cc, which has some boiler-
plate code copy pasted from other places to get the loop pass setup and
going.

Any thoughts on this way of doing it?


Best regards,
Oleg Endo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Add extra copy of the ifcombine pass after pre [PR102793]
  2024-05-16 10:55     ` Oleg Endo
@ 2024-05-16 11:03       ` Andrew Pinski
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Pinski @ 2024-05-16 11:03 UTC (permalink / raw)
  To: Oleg Endo
  Cc: Richard Biener, Manolis Tsamis, GCC Patches, Andrew Pinski,
	Philipp Tomsich, Tamar Christina, Jiangning Liu

[-- Attachment #1: Type: text/plain, Size: 3347 bytes --]

On Thu, May 16, 2024, 12:55 PM Oleg Endo <oleg.endo@t-online.de> wrote:

>
> On Thu, 2024-05-16 at 10:35 +0200, Richard Biener wrote:
> > On Fri, Apr 5, 2024 at 8:14 PM Andrew Pinski <pinskia@gmail.com> wrote:
> > >
> > > On Fri, Apr 5, 2024 at 5:28 AM Manolis Tsamis <manolis.tsamis@vrull.eu>
> wrote:
> > > >
> > > > If we consider code like:
> > > >
> > > >     if (bar1 == x)
> > > >       return foo();
> > > >     if (bar2 != y)
> > > >       return foo();
> > > >     return 0;
> > > >
> > > > We would like the ifcombine pass to convert this to:
> > > >
> > > >     if (bar1 == x || bar2 != y)
> > > >       return foo();
> > > >     return 0;
> > > >
> > > > The ifcombine pass can handle this transformation but it is ran very
> early and
> > > > it misses the opportunity because there are two seperate blocks for
> foo().
> > > > The pre pass is good at removing duplicate code and blocks and due
> to that
> > > > running ifcombine again after it can increase the number of
> successful
> > > > conversions.
> > >
> > > I do think we should have something similar to re-running
> > > ssa-ifcombine but I think it should be much later, like after the loop
> > > optimizations are done.
> > > Maybe just a simplified version of it (that does the combining and not
> > > the optimizations part) included in isel or pass_optimize_widening_mul
> > > (which itself should most likely become part of isel or renamed since
> > > it handles more than just widening multiply these days).
> >
> > I've long wished we had a (late?) pass that can also undo if-conversion
> > (basically do what RTL expansion would later do).  Maybe
> > gimple-predicate-analysis.cc (what's used by uninit analysis) can
> > represent mixed CFG + if-converted conditions so we can optimize
> > it and code-gen the condition in a more optimal manner much like
> > we have if-to-switch, switch-conversion and switch-expansion.
> >
> > That said, I agree that re-running ifcombine should be later.  And
> there's
> > still the old task of splitting tail-merging from PRE (and possibly
> making
> > it more effective).
>
> Sorry to butt in, but it might be little bit relevant and caught my
> attention.
>
> I've got this SH patch sitting around
> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55543
>
> The idea is basically to run an additional loop pass after combine and
> split1.  The main purpose is to hoist constant loads out of loops. Such
> constant loads might be formed (in this particular case) during combine
> transformations.
>
> The patch adds a new file gcc/config/sh/sh_loop.cc, which has some boiler-
> plate code copy pasted from other places to get the loop pass setup and
> going.
>
> Any thoughts on this way of doing it?
>

I have been looking at a similar issue on aarch64 for a few cases, csinc
and nand. What I decided to do for nand was not depend on combine in the
end and create a new infrastructure to expand better to rtl from gimple and
maybe even have target specific pattern matching on the gimple level. So
the constant is not part of the other instruction.

I should have a write up/first draft of an implementation by August time
frame or so. The write up will most likely be earlier.

Thanks,
Andrew



>
> Best regards,
> Oleg Endo
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-05-16 11:03 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-05 12:26 [PATCH] Add extra copy of the ifcombine pass after pre [PR102793] Manolis Tsamis
2024-04-05 12:43 ` Richard Biener
2024-04-05 13:33   ` Manolis Tsamis
2024-04-05 18:13 ` Andrew Pinski
2024-05-16  8:35   ` Richard Biener
2024-05-16 10:55     ` Oleg Endo
2024-05-16 11:03       ` Andrew Pinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).