public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Kong, Lingling" <lingling.kong@intel.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	"Liu, Hongtao" <hongtao.liu@intel.com>
Subject: RE: [PATCH] Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735]
Date: Fri, 16 Sep 2022 02:30:48 +0000	[thread overview]
Message-ID: <DM4PR11MB54879C9C4354840AEC78FB29EC489@DM4PR11MB5487.namprd11.prod.outlook.com> (raw)
In-Reply-To: <CAFiYyc3hAu2pmngR5mAdmVG2gmkdVVHvFYvbrw7q9BtB0BX4HQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 15809 bytes --]

Hi Richard,

Thanks again for your reviewing.

> Yes, use else if for the bitwise induction.  Can you also make the new case
> conditional on 'def'
> (the compute_overall_effect_of_inner_loop) being chrec_dont_know?  If that
> call produced something useful it will not be of either of the two special forms.
> Thus like
> 
>   if (def != chrec_dont_know)
>     /* Already OK.  */
>     ;
>  else if ((bitinv_def = ...)
>     ..
>  else if (tree_fits_uhwi_p (niter)
>              ... bitwise induction case...)
>     ...
>
Yes, I fixed it in new patch. Thanks.
Ok for master ?

Thanks,
Lingling

> -----Original Message-----
> From: Richard Biener <richard.guenther@gmail.com>
> Sent: Wednesday, September 14, 2022 4:16 PM
> To: Kong, Lingling <lingling.kong@intel.com>
> Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao <hongtao.liu@intel.com>
> Subject: Re: [PATCH] Enhance final_value_replacement_loop to handle bitop
> with an invariant induction.[PR105735]
> 
> On Tue, Sep 13, 2022 at 9:54 AM Kong, Lingling <lingling.kong@intel.com>
> wrote:
> >
> > Hi Richard,
> >
> > Thanks you so much for reviewing this patch.  I really appreciate it. For these
> review comments, I have made some changes.
> >
> > > That's a single-stmt match, you shouldn't use match.pd matching for this.
> > > Instead just do
> > >
> > >   if (is_gimple_assign (stmt)
> > >       && ((code = gimple_assign_rhs_code (stmt)), true)
> > >       && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code ==
> > > BIT_XOR_EXPR))
> >
> > Yes, I fixed it and dropped modification for match.pd.
> >
> > > and pick gimple_assign_rhs{1,2} (stmt) as the operands.  The :c in
> > > bit_op:c is redundant btw. - while the name suggests "with
> > > invariant" you don't actually check for that.  But again, given
> > > canonicalization rules the invariant will be rhs2 so above add
> > >
> > >         && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST
> >
> > For " with invariant", this needed op1 is invariant, and I used
> `expr_invariant_in_loop_p (loop, match_op[0])` for check.
> > And op2 just be PHI is ok. If op2 is INTEGER_CST, existing gcc can be directly
> optimized and do not need modification.
> >
> > > you probably need dg-require-effective-target longlong, but is it
> > > necessary to use long long for the testcases in the first place?
> > > The IV seems to be unused, if it should match the variables bit size
> > > use sizeof
> > > (type) * 8
> >
> > Yes, It is not necessary to use long long for the testcases. I changed type to
> unsigned int.
> >
> > > > +  inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge
> > > > + (loop));  return fold_build2 (code1, type, inv, match_op[0]); }
> > >
> > > The } goes to the next line.
> >
> > Sorry, It might be something wrong with my use of gcc send-email format.
> >
> > > > +      tree bitinv_def;
> > > > +      if ((bitinv_def
> > >
> > > please use else if here
> >
> > Sorry, If use the else if here, there is no corresponding above if. I'm not sure if
> you mean change bitwise induction expression if to else if.
> 
> Yes, use else if for the bitwise induction.  Can you also make the new case
> conditional on 'def'
> (the compute_overall_effect_of_inner_loop) being chrec_dont_know?  If that
> call produced something useful it will not be of either of the two special forms.
> Thus like
> 
>   if (def != chrec_dont_know)
>     /* Already OK.  */
>     ;
>  else if ((bitinv_def = ...)
>     ..
>  else if (tree_fits_uhwi_p (niter)
>              ... bitwise induction case...)
>     ...
> 
> ?
> 
> Otherwise looks OK now.
> 
> Thanks,
> Richard.
> 
> > Do you agree with these changes?  Thanks again for taking a look.
> >
> > Thanks,
> > Lingling
> >
> > > -----Original Message-----
> > > From: Richard Biener <richard.guenther@gmail.com>
> > > Sent: Tuesday, August 23, 2022 3:27 PM
> > > To: Kong, Lingling <lingling.kong@intel.com>
> > > Cc: Liu, Hongtao <hongtao.liu@intel.com>; gcc-patches@gcc.gnu.org
> > > Subject: Re: [PATCH] Enhance final_value_replacement_loop to handle
> > > bitop with an invariant induction.[PR105735]
> > >
> > > On Thu, Aug 18, 2022 at 8:48 AM Kong, Lingling via Gcc-patches <gcc-
> > > patches@gcc.gnu.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > This patch is for pr105735/pr101991. It will enable below optimization:
> > > > {
> > > > -  long unsigned int bit;
> > > > -
> > > > -  <bb 2> [local count: 32534376]:
> > > > -
> > > > -  <bb 3> [local count: 1041207449]:
> > > > -  # tmp_10 = PHI <tmp_7(5), tmp_4(D)(2)>
> > > > -  # bit_12 = PHI <bit_8(5), 0(2)>
> > > > -  tmp_7 = bit2_6(D) & tmp_10;
> > > > -  bit_8 = bit_12 + 1;
> > > > -  if (bit_8 != 32)
> > > > -    goto <bb 5>; [96.97%]
> > > > -  else
> > > > -    goto <bb 4>; [3.03%]
> > > > -
> > > > -  <bb 5> [local count: 1009658865]:
> > > > -  goto <bb 3>; [100.00%]
> > > > -
> > > > -  <bb 4> [local count: 32534376]:
> > > > -  # tmp_11 = PHI <tmp_7(3)>
> > > > -  return tmp_11;
> > > > +  tmp_11 = tmp_4(D) & bit2_6(D);
> > > > +  return tmp_11;
> > > >
> > > > }
> > > >
> > > > Ok for master ?
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         PR middle-end/105735
> > > >         * match.pd (bitop_with_inv_p): New match.
> > > >         * tree-scalar-evolution.cc (gimple_bitop_with_inv_p): Declare.
> > > >         (analyze_and_compute_bitop_with_inv_effect): New function.
> > > >         (final_value_replacement_loop): Enhanced to handle bitop
> > > >         with inv induction.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >         * gcc.target/i386/pr105735-1.c: New test.
> > > >         * gcc.target/i386/pr105735-2.c: New test.
> > > > ---
> > > >  gcc/match.pd                               |  4 +
> > > >  gcc/testsuite/gcc.target/i386/pr105735-1.c | 88
> > > > ++++++++++++++++++++++
> > > gcc/testsuite/gcc.target/i386/pr105735-2.c | 28 +++++++
> > > >  gcc/tree-scalar-evolution.cc               | 59 +++++++++++++++
> > > >  4 files changed, 179 insertions(+)  create mode 100644
> > > > gcc/testsuite/gcc.target/i386/pr105735-1.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-2.c
> > > >
> > > > diff --git a/gcc/match.pd b/gcc/match.pd index
> > > > 562138a8034..cfe593ebb02 100644
> > > > --- a/gcc/match.pd
> > > > +++ b/gcc/match.pd
> > > > @@ -8050,6 +8050,10 @@ and,
> > > >   (bit_not
> > > >    (nop_convert1? (bit_xor@0 (convert2? (lshift integer_onep@1
> > > > @2))
> > > > @3))))
> > > >
> > > > +(for bit_op (bit_and bit_ior bit_xor)  (match (bitop_with_inv_p
> > > > +@0
> > > > +@1)
> > > > +  (bit_op:c @0 @1)))
> > > > +
> > >
> > > That's a single-stmt match, you shouldn't use match.pd matching for this.
> > > Instead just do
> > >
> > >   if (is_gimple_assign (stmt)
> > >       && ((code = gimple_assign_rhs_code (stmt)), true)
> > >       && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code ==
> > > BIT_XOR_EXPR))
> > >    ..
> > >
> > > and pick gimple_assign_rhs{1,2} (stmt) as the operands.  The :c in
> > > bit_op:c is redundant btw. - while the name suggests "with
> > > invariant" you don't actually check for that.  But again, given
> > > canonicalization rules the invariant will be rhs2 so above add
> > >
> > >         && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST
> > >
> > > for example.
> > >
> > > >  /* n - (((n > C1) ? n : C1) & -C2) ->  n & C1 for unsigned case.
> > > >     n - (((n > C1) ? n : C1) & -C2) ->  (n <= C1) ? n : (n & C1)
> > > > for signed case.  */  (simplify diff --git
> > > > a/gcc/testsuite/gcc.target/i386/pr105735-1.c
> > > > b/gcc/testsuite/gcc.target/i386/pr105735-1.c
> > > > new file mode 100644
> > > > index 00000000000..8d2123ed351
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-1.c
> > > > @@ -0,0 +1,88 @@
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-O1 -fdump-tree-sccp-details" } */
> > > > +/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp"
> > > > +} } */
> > > > +
> > > > +unsigned long long
> > > > +__attribute__((noipa))
> > > > +foo (unsigned long long tmp, unsigned long long bit2) {
> > >
> > > you probably need dg-require-effective-target longlong, but is it
> > > necessary to use long long for the testcases in the first place?
> > > The IV seems to be unused, if it should match the variables bit size
> > > use sizeof
> > > (type) * 8
> > >
> > > > +  for (int bit = 0; bit < 64; bit++)
> > > > +    tmp &= bit2;
> > > > +  return tmp;
> > > > +}
> > > > +
> > > > +unsigned long long
> > > > +__attribute__((noipa))
> > > > +foo1 (unsigned long long tmp, unsigned long long bit2) {
> > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > +    tmp &= bit2;
> > > > +  return tmp;
> > > > +}
> > > > +
> > > > +unsigned long long
> > > > +__attribute__((noipa))
> > > > +foo2 (unsigned long long tmp, unsigned long long bit2) {
> > > > +  for (int bit = 0; bit < 64; bit++)
> > > > +    tmp |= bit2;
> > > > +  return tmp;
> > > > +}
> > > > +
> > > > +unsigned long long
> > > > +__attribute__((noipa))
> > > > +foo3 (unsigned long long tmp, unsigned long long bit2) {
> > > > +  for (int bit = 63; bit >= 0; bit -=3)
> > > > +    tmp |= bit2;
> > > > +  return tmp;
> > > > +}
> > > > +
> > > > +unsigned long long
> > > > +__attribute__((noipa))
> > > > +foo4 (unsigned long long tmp, unsigned long long bit2) {
> > > > +  for (int bit = 0; bit < 64; bit++)
> > > > +    tmp ^= bit2;
> > > > +  return tmp;
> > > > +}
> > > > +
> > > > +unsigned long long
> > > > +__attribute__((noipa))
> > > > +foo5 (unsigned long long tmp, unsigned long long bit2) {
> > > > +  for (int bit = 0; bit < 63; bit++)
> > > > +    tmp ^= bit2;
> > > > +  return tmp;
> > > > +}
> > > > +
> > > > +unsigned long long
> > > > +__attribute__((noipa))
> > > > +f (unsigned long long tmp, long long bit, unsigned long long
> > > > +bit2) {
> > > > +  unsigned long long res = tmp;
> > > > +  for (long long i = 0; i < bit; i++)
> > > > +    res &= bit2;
> > > > +  return res;
> > > > +}
> > > > +
> > > > +unsigned long long
> > > > +__attribute__((noipa))
> > > > +f1 (unsigned long long tmp, long long bit, unsigned long long
> > > > +bit2) {
> > > > +  unsigned long long res = tmp;
> > > > +  for (long long i = 0; i < bit; i++)
> > > > +    res |= bit2;
> > > > +  return res;
> > > > +}
> > > > +
> > > > +unsigned long long
> > > > +__attribute__((noipa))
> > > > +f2 (unsigned long long tmp, long long bit, unsigned long long
> > > > +bit2) {
> > > > +  unsigned long long res = tmp;
> > > > +  for (long long i = 0; i < bit; i++)
> > > > +    res ^= bit2;
> > > > +  return res;
> > > > +}
> > > > +
> > > > diff --git a/gcc/testsuite/gcc.target/i386/pr105735-2.c
> > > > b/gcc/testsuite/gcc.target/i386/pr105735-2.c
> > > > new file mode 100644
> > > > index 00000000000..79c1d300b1b
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.target/i386/pr105735-2.c
> > > > @@ -0,0 +1,28 @@
> > > > +/* { dg-do run } */
> > > > +/* { dg-options "-O1" } */
> > > > +
> > > > +#include "pr105735-1.c"
> > > > +
> > > > +int main()
> > > > +{
> > > > +  unsigned long long tmp = 0x1101101ULL;
> > > > +  unsigned long long bit2 = 0x1111111011110111ULL;
> > > > +  if (foo (tmp, bit2) != 0x1100101ULL)
> > > > +    __builtin_abort ();
> > > > +  if (foo1 (tmp, bit2) != 0x1100101ULL)
> > > > +    __builtin_abort ();
> > > > +  if (foo2 (tmp, bit2) != 0x1111111011111111ULL)
> > > > +    __builtin_abort ();
> > > > +  if (foo3 (tmp, bit2) != 0x1111111011111111ULL)
> > > > +    __builtin_abort ();
> > > > +  if (foo4 (tmp, bit2) != 0x1101101ULL)
> > > > +    __builtin_abort ();
> > > > +  if (foo5 (tmp, bit2) != 0x1111111010011010ULL)
> > > > +    __builtin_abort ();
> > > > +  if (f (tmp, 64, bit2) != 0x1100101ULL)
> > > > +    __builtin_abort ();
> > > > +  if (f1 (tmp, 64, bit2) != 0x1111111011111111ULL)
> > > > +    __builtin_abort ();
> > > > +  if (f2 (tmp, 64, bit2) != 0x1101101ULL)
> > > > +    __builtin_abort ();
> > > > +}
> > > > diff --git a/gcc/tree-scalar-evolution.cc
> > > > b/gcc/tree-scalar-evolution.cc index fc59d035b19..81220f08d2e
> > > > 100644
> > > > --- a/gcc/tree-scalar-evolution.cc
> > > > +++ b/gcc/tree-scalar-evolution.cc
> > > > @@ -3635,6 +3635,53 @@ enum bit_op_kind
> > > >    return fold_build2 (code1, type, inv, wide_int_to_tree (type,
> > > > bits));  }
> > > >
> > > > +/* Match.pd function to match bitop with invariant expression
> > > > +  .i.e.
> > > > +  tmp_7 = _0 & _1; */
> > > > +extern bool gimple_bitop_with_inv_p (tree, tree *, tree
> > > > +(*)(tree));
> > > > +
> > > > +/* Return the inductive expression of bitop with invariant if possible,
> > > > +   otherwise returns DEF.  */
> > > > +static tree
> > > > +analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree
> phidef,
> > > > +                                          tree niter) {
> > > > +  tree match_op[2],inv;
> > > > +  tree type = TREE_TYPE (phidef);
> > > > +  gphi* header_phi = NULL;
> > > > +  /* match thing like op0 (match[0]), op1(match[1]),
> > > > +phidef(PHIDEF)
> > > > +
> > > > +    op1 =  PHI <phidef, inv>
> > > > +    phidef = op0 & op1
> > > > +    if op0 is an invariant, it could change to
> > > > +    phidef = op0 & inv.  */
> > > > +  if (!gimple_bitop_with_inv_p (phidef, &match_op[0], NULL)
> > > > +      || TREE_CODE (match_op[1]) != SSA_NAME
> > > > +      || !expr_invariant_in_loop_p (loop, match_op[0])
> > > > +      || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT
> > > (match_op[1])))
> > > > +      || gimple_phi_num_args (header_phi) != 2)
> > > > +    return NULL_TREE;
> > > > +
> > > > +  if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop))
> > > > + !=
> > > phidef)
> > > > +    return NULL_TREE;
> > > > +
> > > > +  enum tree_code code1
> > > > +    = gimple_assign_rhs_code (SSA_NAME_DEF_STMT (phidef));
> > > > +
> > > > +  if (code1 == BIT_XOR_EXPR)
> > > > +    {
> > > > +       if (!tree_fits_uhwi_p (niter))
> > > > +       return NULL_TREE;
> > > > +       unsigned HOST_WIDE_INT niter_num;
> > > > +       niter_num = tree_to_uhwi (niter);
> > > > +       if (niter_num % 2 != 0)
> > > > +       match_op[0] =  build_zero_cst (type);
> > > > +    }
> > > > +
> > > > +  inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge
> > > > + (loop));  return fold_build2 (code1, type, inv, match_op[0]); }
> > >
> > > The } goes to the next line.
> > >
> > > > +
> > > >  /* Do final value replacement for LOOP, return true if we did
> > > > anything.  */
> > > >
> > > >  bool
> > > > @@ -3687,6 +3734,18 @@ final_value_replacement_loop (class loop
> *loop)
> > > >                                               &folded_casts);
> > > >        def = compute_overall_effect_of_inner_loop (ex_loop, def);
> > > >
> > > > +      /* Handle bitop with invariant induction expression.
> > > > +
> > > > +       .i.e
> > > > +       for (int i =0 ;i < 32; i++)
> > > > +         tmp &= bit2;
> > > > +       if bit2 is an invariant in loop which could simple to
> > > > +       tmp &= bit2.  */
> > > > +      tree bitinv_def;
> > > > +      if ((bitinv_def
> > >
> > > please use else if here
> > >
> > > otherwise looks OK.
> > >
> > > > +          = analyze_and_compute_bitop_with_inv_effect (loop, phidef, niter)))
> > > > +       def = bitinv_def;
> > > > +
> > > >        /* Handle bitwise induction expression.
> > > >
> > > >          .i.e.
> > > > --
> > > > 2.18.2
> > > >

[-- Attachment #2: 0001-Enhance-final_value_replacement_loop-to-handle-bitop.patch --]
[-- Type: application/octet-stream, Size: 8389 bytes --]

From ea991bd9022416ad67f51d228bbfaa9343db272d Mon Sep 17 00:00:00 2001
From: konglin1 <lingling.kong@intel.com>
Date: Mon, 1 Aug 2022 17:48:05 +0800
Subject: [PATCH] Enhance final_value_replacement_loop to handle bitop with an
 invariant induction.

This patch will enable below optimization:

{
-  long unsigned int bit;
-
-  <bb 2> [local count: 32534376]:
-
-  <bb 3> [local count: 1041207449]:
-  # tmp_10 = PHI <tmp_7(5), tmp_4(D)(2)>
-  # bit_12 = PHI <bit_8(5), 0(2)>
-  tmp_7 = bit2_6(D) & tmp_10;
-  bit_8 = bit_12 + 1;
-  if (bit_8 != 32)
-    goto <bb 5>; [96.97%]
-  else
-    goto <bb 4>; [3.03%]
-
-  <bb 5> [local count: 1009658865]:
-  goto <bb 3>; [100.00%]
-
-  <bb 4> [local count: 32534376]:
-  # tmp_11 = PHI <tmp_7(3)>
-  return tmp_11;
+  tmp_11 = tmp_4 (D) & bit2_6 (D);
+  return tmp_11;

}

gcc/ChangeLog:

	PR middle-end/105735
	* tree-scalar-evolution.cc
	(analyze_and_compute_bitop_with_inv_effect): New function.
	(final_value_replacement_loop): Enhanced to handle bitop
	with inv induction.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr105735-1.c: New test.
	* gcc.target/i386/pr105735-2.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr105735-1.c | 88 ++++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr105735-2.c | 28 +++++++
 gcc/tree-scalar-evolution.cc               | 93 +++++++++++++++++++---
 3 files changed, 199 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-2.c

diff --git a/gcc/testsuite/gcc.target/i386/pr105735-1.c b/gcc/testsuite/gcc.target/i386/pr105735-1.c
new file mode 100644
index 00000000000..69de6b2911a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr105735-1.c
@@ -0,0 +1,88 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-sccp-details" } */
+/* { dg-final { scan-tree-dump-times {final value replacement} 8 "sccp" } } */
+
+unsigned int
+__attribute__((noipa))
+foo (unsigned int tmp, unsigned int bit2)
+{
+  for (int bit = 0; bit < 64; bit++)
+    tmp &= bit2;
+  return tmp;
+}
+
+unsigned int
+__attribute__((noipa))
+foo1 (unsigned int tmp, unsigned int bit2)
+{
+  for (int bit = 63; bit >= 0; bit -=3)
+    tmp &= bit2;
+  return tmp;
+}
+
+unsigned int
+__attribute__((noipa))
+foo2 (unsigned int tmp, unsigned int bit2)
+{
+  for (int bit = 0; bit < 64; bit++)
+    tmp |= bit2;
+  return tmp;
+}
+
+unsigned int
+__attribute__((noipa))
+foo3 (unsigned int tmp, unsigned int bit2)
+{
+  for (int bit = 63; bit >= 0; bit -=3)
+    tmp |= bit2;
+  return tmp;
+}
+
+unsigned int
+__attribute__((noipa))
+foo4 (unsigned int tmp, unsigned int bit2)
+{
+  for (int bit = 0; bit < 64; bit++)
+    tmp ^= bit2;
+  return tmp;
+}
+
+unsigned int
+__attribute__((noipa))
+foo5 (unsigned int tmp, unsigned int bit2)
+{
+  for (int bit = 0; bit < 63; bit++)
+    tmp ^= bit2;
+  return tmp;
+}
+
+unsigned int
+__attribute__((noipa))
+f (unsigned int tmp, int bit, unsigned int bit2)
+{
+  unsigned int res = tmp;
+  for (int i = 0; i < bit; i++)
+    res &= bit2;
+  return res;
+}
+
+unsigned int
+__attribute__((noipa))
+f1 (unsigned int tmp, int bit, unsigned int bit2)
+{
+  unsigned int res = tmp;
+  for (int i = 0; i < bit; i++)
+    res |= bit2;
+  return res;
+}
+
+unsigned int
+__attribute__((noipa))
+f2 (unsigned int tmp, int bit, unsigned int bit2)
+{
+  unsigned int res = tmp;
+  for (int i = 0; i < bit; i++)
+    res ^= bit2;
+  return res;
+}
+
diff --git a/gcc/testsuite/gcc.target/i386/pr105735-2.c b/gcc/testsuite/gcc.target/i386/pr105735-2.c
new file mode 100644
index 00000000000..66cc5fba1e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr105735-2.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-O1" } */
+
+#include "pr105735-1.c"
+
+int main()
+{
+  unsigned int tmp = 0x1101;
+  unsigned int bit2 = 0x111101;
+  if (foo (tmp, bit2) != 0x1101)
+    __builtin_abort (); 
+  if (foo1 (tmp, bit2) != 0x1101)
+    __builtin_abort ();
+  if (foo2 (tmp, bit2) != 0x111101)
+    __builtin_abort ();
+  if (foo3 (tmp, bit2) != 0x111101)
+    __builtin_abort ();
+  if (foo4 (tmp, bit2) != 0x1101)
+    __builtin_abort ();
+  if (foo5 (tmp, bit2) != 0x110000)
+    __builtin_abort ();
+  if (f (tmp, 64, bit2) != 0x1101)
+    __builtin_abort ();
+  if (f1 (tmp, 64, bit2) != 0x111101)
+    __builtin_abort ();
+  if (f2 (tmp, 64, bit2) != 0x1101)
+    __builtin_abort ();
+}
diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
index fc59d035b19..9f30f78cb5d 100644
--- a/gcc/tree-scalar-evolution.cc
+++ b/gcc/tree-scalar-evolution.cc
@@ -3635,6 +3635,64 @@ enum bit_op_kind
   return fold_build2 (code1, type, inv, wide_int_to_tree (type, bits));
 }
 
+/* Match.pd function to match bitop with invariant expression
+  .i.e.
+  tmp_7 = _0 & _1; */
+extern bool gimple_bitop_with_inv_p (tree, tree *, tree (*)(tree));
+
+/* Return the inductive expression of bitop with invariant if possible,
+   otherwise returns DEF.  */
+static tree
+analyze_and_compute_bitop_with_inv_effect (class loop* loop, tree phidef,
+					   tree niter)
+{
+  tree match_op[2],inv;
+  tree type = TREE_TYPE (phidef);
+  gphi* header_phi = NULL;
+  enum tree_code code;
+  /* match thing like op0 (match[0]), op1 (match[1]), phidef (PHIDEF)
+
+    op1 =  PHI <phidef, inv>
+    phidef = op0 & op1
+    if op0 is an invariant, it could change to
+    phidef = op0 & inv.  */
+  gimple *def;
+  def = SSA_NAME_DEF_STMT (phidef);
+  if (!(is_gimple_assign (def)
+      && ((code = gimple_assign_rhs_code (def)), true)
+      && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR
+	  || code == BIT_XOR_EXPR)))
+    return NULL_TREE;
+
+  match_op[0] = gimple_assign_rhs1 (def);
+  match_op[1] = gimple_assign_rhs2 (def);
+
+  if (TREE_CODE (match_op[1]) != SSA_NAME
+      || !expr_invariant_in_loop_p (loop, match_op[0])
+      || !(header_phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (match_op[1])))
+      || gimple_phi_num_args (header_phi) != 2)
+    return NULL_TREE;
+
+  if (PHI_ARG_DEF_FROM_EDGE (header_phi, loop_latch_edge (loop)) != phidef)
+    return NULL_TREE;
+
+  enum tree_code code1
+    = gimple_assign_rhs_code (def);
+
+  if (code1 == BIT_XOR_EXPR)
+    {
+       if (!tree_fits_uhwi_p (niter))
+	return NULL_TREE;
+       unsigned HOST_WIDE_INT niter_num;
+       niter_num = tree_to_uhwi (niter);
+       if (niter_num % 2 != 0)
+	match_op[0] =  build_zero_cst (type);
+    }
+
+  inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge (loop));
+  return fold_build2 (code1, type, inv, match_op[0]);
+}
+
 /* Do final value replacement for LOOP, return true if we did anything.  */
 
 bool
@@ -3685,7 +3743,24 @@ final_value_replacement_loop (class loop *loop)
       bool folded_casts;
       def = analyze_scalar_evolution_in_loop (ex_loop, loop, def,
 					      &folded_casts);
-      def = compute_overall_effect_of_inner_loop (ex_loop, def);
+
+      tree bitinv_def, bit_def;
+      unsigned HOST_WIDE_INT niter_num;
+
+      if (def != chrec_dont_know)
+	def = compute_overall_effect_of_inner_loop (ex_loop, def);
+
+      /* Handle bitop with invariant induction expression.
+
+	.i.e
+	for (int i =0 ;i < 32; i++)
+	  tmp &= bit2;
+	if bit2 is an invariant in loop which could simple to
+	tmp &= bit2.  */
+      else if ((bitinv_def
+		= analyze_and_compute_bitop_with_inv_effect (loop,
+							     phidef, niter)))
+	def = bitinv_def;
 
       /* Handle bitwise induction expression.
 
@@ -3697,15 +3772,13 @@ final_value_replacement_loop (class loop *loop)
 	 expressible, but in fact final value of RES can be replaced by
 	 RES & CONSTANT where CONSTANT all ones with bit {0,3,6,9,... ,63}
 	 being cleared, similar for BIT_IOR_EXPR/BIT_XOR_EXPR.  */
-      unsigned HOST_WIDE_INT niter_num;
-      tree bit_def;
-      if (tree_fits_uhwi_p (niter)
-	  && (niter_num = tree_to_uhwi (niter)) != 0
-	  && niter_num < TYPE_PRECISION (TREE_TYPE (phidef))
-	  && (bit_def
-	      = analyze_and_compute_bitwise_induction_effect (loop,
-							      phidef,
-							      niter_num)))
+      else if (tree_fits_uhwi_p (niter)
+	       && (niter_num = tree_to_uhwi (niter)) != 0
+	       && niter_num < TYPE_PRECISION (TREE_TYPE (phidef))
+	       && (bit_def
+		   = analyze_and_compute_bitwise_induction_effect (loop,
+								   phidef,
+								   niter_num)))
 	def = bit_def;
 
       if (!tree_does_not_contain_chrecs (def)
-- 
2.18.2


  reply	other threads:[~2022-09-16  2:31 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-18  6:47 Kong, Lingling
2022-08-23  2:17 ` Kong, Lingling
2022-08-23  7:26 ` Richard Biener
2022-09-13  7:54   ` Kong, Lingling
2022-09-14  8:15     ` Richard Biener
2022-09-16  2:30       ` Kong, Lingling [this message]
2022-09-20  6:36         ` Kong, Lingling

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DM4PR11MB54879C9C4354840AEC78FB29EC489@DM4PR11MB5487.namprd11.prod.outlook.com \
    --to=lingling.kong@intel.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hongtao.liu@intel.com \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).