public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
To: Richard Biener <richard.guenther@gmail.com>
Cc: "Amker.Cheng" <amker.cheng@gmail.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [RFC][PR82479] missing popcount builtin detection
Date: Tue, 12 Jun 2018 03:10:00 -0000	[thread overview]
Message-ID: <CAELXzTOGg5zk__gJJX=6_r5TugKYEW9g6ngug3iAUk7Pdq9mrQ@mail.gmail.com> (raw)
In-Reply-To: <CAFiYyc0=Qmj9AjNRM73TNpZYizwOt4my=TO0NtN5aVeZ3oz2-g@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 18106 bytes --]

Hi Richard,

Thanks for the review.

On 7 June 2018 at 19:21, Richard Biener <richard.guenther@gmail.com> wrote:
> On Thu, Jun 7, 2018 at 2:52 AM Kugan Vivekanandarajah
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>> Hi Richard,
>>
>> Thanks for the review.
>>
>> On 5 June 2018 at 21:25, Richard Biener <richard.guenther@gmail.com> wrote:
>> > On Tue, Jun 5, 2018 at 10:59 AM Kugan Vivekanandarajah
>> > <kugan.vivekanandarajah@linaro.org> wrote:
>> >>
>> >> Hi Richard,
>> >>
>> >> Thanks for the review.
>> >>
>> >> On 1 June 2018 at 23:12, Richard Biener <richard.guenther@gmail.com> wrote:
>> >> > On Fri, Jun 1, 2018 at 12:06 PM Bin.Cheng <amker.cheng@gmail.com> wrote:
>> >> >>
>> >> >> On Fri, Jun 1, 2018 at 9:56 AM, Kugan Vivekanandarajah
>> >> >> <kugan.vivekanandarajah@linaro.org> wrote:
>> >> >> > Hi Bin,
>> >> >> >
>> >> >> > Thanks a lo for the review.
>> >> >> >
>> >> >> > On 1 June 2018 at 03:45, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> >> >> >> On Thu, May 31, 2018 at 3:51 AM, Kugan Vivekanandarajah
>> >> >> >> <kugan.vivekanandarajah@linaro.org> wrote:
>> >> >> >>> Hi Bin,
>> >> >> >>>
>> >> >> >>> Thanks for the review. Please find the revised patch based on the
>> >> >> >>> review comments.
>> >> >> >>>
>> >> >> >>> Thanks,
>> >> >> >>> Kugan
>> >> >> >>>
>> >> >> >>> On 17 May 2018 at 19:56, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> >> >> >>>> On Thu, May 17, 2018 at 2:39 AM, Kugan Vivekanandarajah
>> >> >> >>>> <kugan.vivekanandarajah@linaro.org> wrote:
>> >> >> >>>>> Hi Richard,
>> >> >> >>>>>
>> >> >> >>>>> On 6 March 2018 at 02:24, Richard Biener <richard.guenther@gmail.com> wrote:
>> >> >> >>>>>> On Thu, Feb 8, 2018 at 1:41 AM, Kugan Vivekanandarajah
>> >> >> >>>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>> >> >> >>>>>>> Hi Richard,
>> >> >> >>>>>>>
>> >> >> >>
>> >> >> >> Hi,
>> >> >> >> Thanks very much for working.
>> >> >> >>
>> >> >> >>> +/* Utility function to check if OP is defined by a stmt
>> >> >> >>> +   that is a val - 1.  If that is the case, set it to STMT.  */
>> >> >> >>> +
>> >> >> >>> +static bool
>> >> >> >>> +ssa_defined_by_and_minus_one_stmt_p (tree op, tree val, gimple **stmt)
>> >> >> >> This is checking if op is defined as val - 1, so name it as
>> >> >> >> ssa_defined_by_minus_one_stmt_p?
>> >> >> >>
>> >> >> >>> +{
>> >> >> >>> +  if (TREE_CODE (op) == SSA_NAME
>> >> >> >>> +      && (*stmt = SSA_NAME_DEF_STMT (op))
>> >> >> >>> +      && is_gimple_assign (*stmt)
>> >> >> >>> +      && (gimple_assign_rhs_code (*stmt) == PLUS_EXPR)
>> >> >> >>> +      && val == gimple_assign_rhs1 (*stmt)
>> >> >> >>> +      && integer_minus_onep (gimple_assign_rhs2 (*stmt)))
>> >> >> >>> +    return true;
>> >> >> >>> +  else
>> >> >> >>> +    return false;
>> >> >> >> You can simply return the boolean condition.
>> >> >> > Done.
>> >> >> >
>> >> >> >>
>> >> >> >>> +}
>> >> >> >>> +
>> >> >> >>> +/* See if LOOP is a popcout implementation of the form
>> >> >> >> ...
>> >> >> >>> +  rhs1 = gimple_assign_rhs1 (and_stmt);
>> >> >> >>> +  rhs2 = gimple_assign_rhs2 (and_stmt);
>> >> >> >>> +
>> >> >> >>> +  if (ssa_defined_by_and_minus_one_stmt_p (rhs1, rhs2, &and_minus_one))
>> >> >> >>> +    rhs1 = rhs2;
>> >> >> >>> +  else if (ssa_defined_by_and_minus_one_stmt_p (rhs2, rhs1, &and_minus_one))
>> >> >> >>> +    ;
>> >> >> >>> +  else
>> >> >> >>> +    return false;
>> >> >> >>> +
>> >> >> >>> +  /* Check the recurrence.  */
>> >> >> >>> +  phi = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (and_minus_one));
>> >> >> >> So gimple_assign_rhs1 (and_minus_one) == rhs1 is always true?  Please
>> >> >> >> use rhs1 directly.
>> >> >> >
>> >> >> > Done.
>> >> >> >>> +  gimple *src_phi = SSA_NAME_DEF_STMT (rhs2);
>> >> >> >> I think this is checking wrong thing and is redundant.  Either rhs2
>> >> >> >> equals to rhs1 or is defined as (rhs1 - 1).  For (rhs2 == rhs1) case,
>> >> >> >> the check duplicates checking on phi; for the latter, it's never a PHI
>> >> >> >> stmt and shouldn't be checked.
>> >> >> >>
>> >> >> >>> +  if (gimple_code (phi) != GIMPLE_PHI
>> >> >> >>> +      || gimple_code (src_phi) != GIMPLE_PHI)
>> >> >> >>> +    return false;
>> >> >> >>> +
>> >> >> >>> +  dest = gimple_assign_lhs (count_stmt);
>> >> >> >>> +  tree fn = builtin_decl_implicit (BUILT_IN_POPCOUNT);
>> >> >> >>> +  tree src = gimple_phi_arg_def (src_phi, loop_preheader_edge (loop)->dest_idx);
>> >> >> >>> +  if (adjust)
>> >> >> >>> +    iter = fold_build2 (MINUS_EXPR, TREE_TYPE (dest),
>> >> >> >>> +            build_call_expr (fn, 1, src),
>> >> >> >>> +            build_int_cst (TREE_TYPE (dest), 1));
>> >> >> >>> +  else
>> >> >> >>> +    iter = build_call_expr (fn, 1, src);
>> >> >> >> Note tree-ssa-loop-niters.c always use unsigned_type_for (IV-type) as
>> >> >> >> niters type.  Though unsigned type is unnecessary in this case, but
>> >> >> >> better to follow existing behavior?
>> >> >> >
>> >> >> > Done.
>> >> >> >>
>> >> >> >>> +  max = int_cst_value (TYPE_MAX_VALUE (TREE_TYPE (dest)));
>> >> >> >> As richi suggested, max should be the number of bits in type of IV.
>> >> >> >>
>> >> >> >>> +
>> >> >> >>> +  niter->assumptions = boolean_false_node;
>> >> >> >> Redundant.
>> >> >> >
>> >> >> > Not sure I understand. If I remove this,  I am getting ICE
>> >> >> > (segmentation fault). What is the expectation here?
>> >> >> Is it a simple typo?  Because assumptions is set to boolean_true_node
>> >> >> just 5 lines below?
>> >> >> The niters part looks good for me with this change.  You may need
>> >> >> richi's approval for other parts?
>> >> >
>> >> > diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
>> >> > index bdf9ba1..4dc156a 100644
>> >> > --- a/gcc/ipa-fnsummary.c
>> >> > +++ b/gcc/ipa-fnsummary.c
>> >> > @@ -1485,6 +1485,10 @@ will_be_nonconstant_expr_predicate (struct
>> >> > ipa_node_params *info,
>> >> >                                                nonconstant_names);
>> >> >        return p2.or_with (summary->conds, p1);
>> >> >      }
>> >> > +  else if (TREE_CODE (expr) == CALL_EXPR)
>> >> > +    {
>> >> > +      return false;
>> >> > +    }
>> >> >    else
>> >> >      {
>> >> >        debug_tree (expr);
>> >> >
>> >> > I'd return true here instead - we don't want to track popcount() in
>> >> > predicates.  Also unnecessary braces.
>> >> >
>> >> > @@ -3492,6 +3494,11 @@ expression_expensive_p (tree expr)
>> >> >        if (!integer_pow2p (TREE_OPERAND (expr, 1)))
>> >> >         return true;
>> >> >      }
>> >> > +  if (code == CALL_EXPR
>> >> > +      && (fndecl = get_callee_fndecl (expr))
>> >> > +      && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL
>> >> > +      && (BUILT_IN_POPCOUNT == DECL_FUNCTION_CODE (fndecl)))
>> >> > +    return false;
>> >> >
>> >> > please use
>> >> >
>> >> >     if (code == CALL_EXPR)
>> >> >      {
>> >> >          if (!is_inexpensive_builtin (get_callee_fndecl (expr)))
>> >> >            return true;
>> >> >          FOR_EACH_CALL_EXPR_ARG (arg, iter, expr)
>> >> >              if (expression_expensive_p (arg))
>> >> >                 return true;
>> >> >          return false;
>> >> >      }
>> >> >
>> >> > instead.
>> >> Done.
>> >>
>> >> >
>> >> > +  /* Check "b = b & (b - 1)" is calculated.  */
>> >> > +  if (!is_gimple_assign (and_stmt)
>> >> > +      || gimple_assign_rhs_code (and_stmt) != BIT_AND_EXPR)
>> >> > +    return false;
>> >> > +
>> >> > +  rhs1 = gimple_assign_rhs1 (and_stmt);
>> >> > +  rhs2 = gimple_assign_rhs2 (and_stmt);
>> >> > +
>> >> > +  if (ssa_defined_by_minus_one_stmt_p (rhs1, rhs2, &and_minus_one))
>> >> > +    std::swap (rhs1, rhs2);
>> >> > +  else if (ssa_defined_by_minus_one_stmt_p (rhs2, rhs1, &and_minus_one))
>> >> > +    ;
>> >> > +  else
>> >> > +    return false;
>> >> >
>> >> > so the powers of match-and-simplify let you add sth like the following to
>> >> > match.pd:
>> >> >
>> >> > #if GIMPLE
>> >> > (match (b_bit_and_b_minus_one @0)
>> >> >  (bit_and:c @0 (plus @0 integer_minus_onep)))
>> >> > #endif
>> >> >
>> >> > and then use it like
>> >> >
>> >> >    extern bool gimple_b_bit_and_b_minus_one (tree, tree *, tree (*)(tree));
>> >> >    if (!gimple_b_bit_and_b_minus_one (gimple_assign_lhs (and_stmt),
>> >> > &rhs1, NULL))
>> >> >       return false;
>> >> >
>> >> > note you need to explicitely declare the matcher as there's no header
>> >> > file generated
>> >> > for them.  It would be also the first user for this "feature" and at
>> >> > some point I
>> >> > considered using in-line source markup (and sth like GTY_FILES for
>> >> > what files to scan)
>> >> > to gather patterns.
>> >> >
>> >> > +  /* Check the loop closed SSA definition for just the variable c defined in
>> >> > +     loop.  */
>> >> > +  basic_block bb = exit->dest;
>> >> > +  for (gphi_iterator gpi = gsi_start_phis (bb);
>> >> > +       !gsi_end_p (gpi); gsi_next (&gpi))
>> >> > +    {
>> >> > +      phi = gpi.phi ();
>> >> > +      count++;
>> >> > +    }
>> >> >
>> >> > I don't understand why you are checking for this - isn't the number of
>> >> > iterations
>> >> I am trying to be overly conservative. I have removed it now and adjusted.
>> >>
>> >> > independent on the rest of the loop-closed PHIs?  That is, even for just
>> >> >
>> >> >  while (b) { b = b & (b-1); }
>> >> >
>> >> > the number of iterations is popcount (b).  So it's enough to check
>> >> > the exit condition?  In fact you are coming from number_of_iterations_exit
>> >> > and thus are asked for a specific exit and thus
>> >> >
>> >> > +  if (!(exit = single_exit (loop)))
>> >> > +    return false;
>> >> Ok, changed it.
>> >>
>> >>
>> >> > is premature.  That is, for
>> >> >
>> >> >   while (b) { b = b & (b - 1); if (--c == 0) break; }
>> >> >
>> >> > still should have popcount (b) iterations for the exit involving the
>> >> > while (b) test.
>> >> >
>> >> > So please simplify (and thus genericize) the code accordingly.  Do so
>> >> > without the match.pd trick for now, I think we can simplify the current
>> >> > beast down enough to not need the factored out function.
>> >> I am not using match.pd changes as you have asked. Please let me know
>> >> if you want that to be included as well.
>> >>
>> >> >
>> >> > You then also can handle replacing
>> >> >
>> >> > int c = 0;
>> >> > while (b) { b = b & (b-1); c+=3; }
>> >> >
>> >> > with 3 * popcount (b);
>> >> Made to handle this too. But, there are still cases we don't handle. I
>> >> am not sure if it is worth the complexity handling all the possible
>> >> cases.
>> >>
>> >> Is the attached patch looks better now?
>> >
>> > +  /* Check loop terminating branch is like
>> > +     if (b != 0).  */
>> > +  gimple *stmt = last_stmt (loop->header);
>> > +  if (!stmt
>> >
>> > this should now look at exit->src instead of loop->header.
>> Done.
>>
>> >
>> > +      || !zerop (gimple_cond_rhs (stmt))
>> >
>> > use integer_zerop
>> Done.
>>
>> >
>> > +  gimple *count_stmt = SSA_NAME_DEF_STMT (rhs1);
>> >
>> > you still didn't remove the whole count-stmt stuff.  It's unrelated
>> > and is not required at all.  Only the GIMPLE_COND lhs def-use
>> > cycle is interesting for niter purposes.
>> Changed it.
>>
>> >
>> > Please adjust accordingly.
>> >
>> > @@ -2441,7 +2578,11 @@ number_of_iterations_exit (struct loop *loop, edge exit,
>> >    gcond *stmt;
>> >    if (!number_of_iterations_exit_assumptions (loop, exit, niter,
>> >                                               &stmt, every_iteration))
>> > -    return false;
>> > +    {
>> > +      if (number_of_iterations_popcount (loop, exit, niter))
>> > +       return true;
>> > +      return false;
>> > +    }
>> >
>> > this should be done in number_of_iterations_exit_assumptions.  I realize
>> > you want to do it only if that fails but in that process you have to be
>> > sure to properly handle every_iteration & friends which is much easier
>> > to do in number_of_iterations_exit_assumptions.  So please put it
>> > where that fails for this kind of IVs:
>> >
>> >   tree iv0_niters = NULL_TREE;
>> >   if (!simple_iv_with_niters (loop, loop_containing_stmt (stmt),
>> >                               op0, &iv0, safe ? &iv0_niters : NULL, false))
>> > ^^^ here
>> >     return false;
>>
>> Done.
>>
>> Is this OK now. Regression tested and bootstrapped on x86-linux-gnu
>> with no new regressions.
>
> Please pass down 'code' from number_of_iterations_exit_assumptions becaue
> that correctly does

Done.

>
>   /* We want the condition for staying inside loop.  */
>   code = gimple_cond_code (stmt);
>   if (exit->flags & EDGE_TRUE_VALUE)
>     code = invert_tree_comparison (code, false);
>
> alternatively if you want to keep the function more isolated from the actual
> caller you'd have to do sth like the above in number_of_iterations_popcount.
>
> +  gphi_iterator gpi = gsi_start_phis (exit->dest);
> +  if (gsi_end_p (gpi))
> +    return false;
> +  rhs1 = gimple_phi_arg_def (gpi.phi (), exit->dest_idx);
> +  if (TREE_CODE (rhs1) != SSA_NAME)
> +    return false;
>
> drop this, it's not necessary.
Done.

>
> +  /* Depending on copy-header is performed, feeding PHI stmts might be in
> +     the loop header or loop exit, handle this.  */
>
> the loop header or loop latch
>
> +  if (gimple_code (and_stmt) == GIMPLE_PHI
> +      && gimple_bb (and_stmt) == exit->src
>
> this would be loop->header still, not exit->src, otherwise the
> gimple_phi_arg_def call on the latch-edge might ICE.

Done.

>
> +  /* Check the recurrence.  */
> +  gimple *phi = SSA_NAME_DEF_STMT (rhs1);
> +  if (gimple_code (phi) != GIMPLE_PHI)
> +    return false;
>
> I think you need to verify that the latch edge argument of this
> PHI has the proper value which should be the LHS of the
> and_stmt.
Done.


>
> What would make the code clearer is probably naming
> variables after a pattern you put in comments.  For example
>
>  /* We match
>       # b_1 = PHI <..., b_2>
>       tem_3 = b_1 + -1;
>       b_2 = b_1 & tem_3;  */
>
> and then have local vars b1 and b2 instead of rhs1/rhs2, etc.
Done.

>
> The overall function comment is now misleading since
> it still mentions 'c'.
>
> +  tree fn = builtin_decl_implicit (BUILT_IN_POPCOUNT);
> +  tree src = gimple_phi_arg_def (phi, loop_preheader_edge (loop)->dest_idx);
> +  tree call = build_call_expr (fn, 1, src);
>
> POPCOUNT expects an 'unsigned int' argument so you have to choose
> POPCOUNT, POPCOUNTL or POPCOUNTLL based on the type of SRC
> and also convert it properly (from say signed int to unsigned int).  The
> testcases all use 'long' but no actual value that would make a difference
> if you use popcount ((int)long-value).
>
> Note there's an internal functiuon as well but it is only supported if the
> target natively supports popcount which isn't what is desired here.
> So you have to deal with the type of src, making sure you zero-extent
> it if you need extension (for a loop that works on singed char or short)
> because sign-extension can change the popcount result (likely you'll
> need a GIMPLE testcase to trigger a bogus transform here).
>
> That said, for simplicity and not complicating this even further - heh - I
> suggest you do
>
>   if (TYPE_PRECISION (TREE_TYPE (src)) == TYPE_PRECISION (integer_type_node))
>     fn = builtin_decl_implicit (BUILT_IN_POPCOUNT);
>  else if (TYPE_PRECISION (TREE_TYPE (src)) == TYPE_PRECISION
> (long_integer_type_node))
>    fn = builtin_decl_implicit (BUILT_IN_POPCOUNTL);
>  else if
>     ... LL case
>
>   /* ???  Support promoting char/short to int.  */
>   if (!fn)
>     return false;
>
>   src = fold_convert (unsigned_type_for (TREE_TYPE (src)), src);
>
> +  tree utype = unsigned_type_for (TREE_TYPE (call));
> +  if (adjust)
> +    iter = fold_build2 (MINUS_EXPR, utype,
> +                       build_call_expr (fn, 1, src),
>
> you need to convert the call result to utype, thus wrap it
> in fold_convert (utype, build_call_expr (...)).
>
> +                       build_int_cst (utype, 1));
>
> so if adjust is true then you have to guard against the initial value
> being zero because then your niter is -1U.  You do this by
> setting may_be_zero to
Done. But in this case now we dont do the final value replacement for
the case when we do copy header.

When we do copy header (where we do -1), loop seems to always have the
zero check at the beginning. So is it still necessary?

>
>   fold_build2 (EQ_EXPR, boolean_type_node, src, build_zero_cst
> (TREE_TYPE (src)));
>
> +  else
> +    iter = fold_convert (utype, build_call_expr (fn, 1, src));
> +  max = int_cst_value (TYPE_MAX_VALUE (utype));
>
> Surely max is not TYPE_MAX_VALUE of utype but instead
> TYPE_PRECISION (TREE_TYPE (src)) (of the original
> unpromoted src).  In case of adjust == true it is even one less.
Done.



>
> Sorry for the many (correctness) issues again :/

Thanks again for the review.

Kugan
>
> Thanks,
> Richard.
>
> +
> +  niter->assumptions = boolean_false_node;
> +  niter->control.base = NULL_TREE;
> +  niter->control.step = NULL_TREE;
> +  niter->control.no_overflow = false;
> +  niter->niter = iter;
> +  niter->assumptions = boolean_true_node;
> +  niter->may_be_zero = boolean_false_node;
> +  niter->max = max;
> +  niter->bound = NULL_TREE;
> +  niter->cmp = ERROR_MARK;
>
>
>
>> Thanks,
>> Kugan
>> >
>> >
>> > Thanks,
>> > Richard.
>> >
>> >> Thanks,
>> >> Kugan
>> >>
>> >>
>> >> >
>> >> > Richard.
>> >> >
>> >> >> Thanks,
>> >> >> bin
>> >> >> >
>> >> >> >>> +  niter->control.base = NULL_TREE;
>> >> >> >>> +  niter->control.step = NULL_TREE;
>> >> >> >>> +  niter->control.no_overflow = false;
>> >> >> >>> +  niter->niter = iter;
>> >> >> >>> +  niter->assumptions = boolean_true_node;
>> >> >> >>> +  niter->may_be_zero = boolean_false_node;
>> >> >> >>> +  niter->max = max;
>> >> >> >>> +  niter->bound = NULL_TREE;
>> >> >> >>> +  niter->cmp = ERROR_MARK;
>> >> >> >>> +  return true;
>> >> >> >>> +}
>> >> >> >>> +
>> >> >> >>> +
>> >> >> >> Appology if these are nitpickings.
>> >> >> > Thanks for the review. I am happy to make the changes needed to get it
>> >> >> > to how it should be :)
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Kugan
>> >> >> >
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >> bin

[-- Attachment #2: 0001-popcount.patch --]
[-- Type: text/x-patch, Size: 11404 bytes --]

From 01257de5b8b19d5f0bd8fde6edfa3908b74aba90 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 10 May 2018 21:41:53 +1000
Subject: [PATCH] popcount

Change-Id: Ib211004cc42998f1da8caedfedfaa72ba3aaf5e2
---
 gcc/ipa-fnsummary.c                       |   2 +
 gcc/testsuite/gcc.dg/tree-ssa/popcount.c  |  41 +++++++
 gcc/testsuite/gcc.dg/tree-ssa/popcount2.c |  29 +++++
 gcc/testsuite/gcc.dg/tree-ssa/popcount3.c |  28 +++++
 gcc/tree-scalar-evolution.c               |  15 +++
 gcc/tree-ssa-loop-ivopts.c                |  10 ++
 gcc/tree-ssa-loop-niter.c                 | 176 +++++++++++++++++++++++++++++-
 7 files changed, 300 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/popcount.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/popcount3.c

diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index bdf9ba1..feb1c9e 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -1485,6 +1485,8 @@ will_be_nonconstant_expr_predicate (struct ipa_node_params *info,
 					       nonconstant_names);
       return p2.or_with (summary->conds, p1);
     }
+  else if (TREE_CODE (expr) == CALL_EXPR)
+    return true;
   else
     {
       debug_tree (expr);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
new file mode 100644
index 0000000..86a66cb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+extern int foo (int);
+
+int PopCount (long b) {
+    int c = 0;
+    b++;
+
+    while (b) {
+	b &= b - 1;
+	c++;
+    }
+    return c;
+}
+int PopCount2 (long b) {
+    int c = 0;
+
+    while (b) {
+	b &= b - 1;
+	c++;
+    }
+    foo (c);
+    return foo (c);
+}
+
+void PopCount3 (long b1) {
+
+    for (long i = 0; i < b1; ++i)
+      {
+	long b = i;
+	int c = 0;
+	while (b) {
+	    b &= b - 1;
+	    c++;
+	}
+	foo (c);
+      }
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_popcount" 3 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
new file mode 100644
index 0000000..362d835
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount2.c
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+__attribute__ ((noinline, noclone))
+foo (int i, long b)
+{
+    int c = i;
+
+    while (b) {
+	b &= b - 1;
+	c++;
+    }
+    return c;
+}
+
+int main()
+{
+
+  if (foo (1, 7) != 4)
+   __builtin_abort ();
+  if (foo (0, 0) != 0)
+   __builtin_abort ();
+  if (foo (8, 0xff) != 16)
+   __builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_popcount" 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
new file mode 100644
index 0000000..9096c6b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount3.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-tree-ch -fdump-tree-optimized" } */
+
+int
+__attribute__ ((noinline, noclone))
+foo (long b)
+{
+    int c = 0;
+
+    while (b) {
+	b &= b - 1;
+	c++;
+    }
+    return c;
+}
+
+int main()
+{
+  if (foo (7) != 3)
+   __builtin_abort ();
+  if (foo (0) != 0)
+   __builtin_abort ();
+  if (foo (0xff) != 8)
+   __builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_popcount" 1 "optimized" } } */
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index fefc9de..4b0ec02 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -281,6 +281,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-propagate.h"
 #include "gimple-fold.h"
 #include "tree-into-ssa.h"
+#include "builtins.h"
 
 static tree analyze_scalar_evolution_1 (struct loop *, tree);
 static tree analyze_scalar_evolution_for_address_of (struct loop *loop,
@@ -1984,6 +1985,7 @@ interpret_expr (struct loop *loop, gimple *at_stmt, tree expr)
     return expr;
 
   if (TREE_CODE (expr) == POLYNOMIAL_CHREC
+      || TREE_CODE (expr) == CALL_EXPR
       || get_gimple_rhs_class (TREE_CODE (expr)) == GIMPLE_TERNARY_RHS)
     return chrec_dont_know;
 
@@ -3493,6 +3495,19 @@ expression_expensive_p (tree expr)
 	return true;
     }
 
+  if (code == CALL_EXPR)
+    {
+      tree arg;
+      call_expr_arg_iterator iter;
+
+      if (!is_inexpensive_builtin (get_callee_fndecl (expr)))
+	return true;
+      FOR_EACH_CALL_EXPR_ARG (arg, iter, expr)
+	if (expression_expensive_p (arg))
+	  return true;
+      return false;
+    }
+
   switch (TREE_CODE_CLASS (code))
     {
     case tcc_binary:
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index b313571..519649a 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -985,6 +985,16 @@ contains_abnormal_ssa_name_p (tree expr)
   code = TREE_CODE (expr);
   codeclass = TREE_CODE_CLASS (code);
 
+  if (code == CALL_EXPR)
+    {
+      tree arg;
+      call_expr_arg_iterator iter;
+      FOR_EACH_CALL_EXPR_ARG (arg, iter, expr)
+	if (contains_abnormal_ssa_name_p (arg))
+	  return true;
+      return false;
+    }
+
   if (code == SSA_NAME)
     return SSA_NAME_OCCURS_IN_ABNORMAL_PHI (expr) != 0;
 
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 7a54c5f..8422d92 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -63,6 +63,10 @@ struct bounds
   mpz_t below, up;
 };
 
+static bool number_of_iterations_popcount (loop_p loop, edge exit,
+					   enum tree_code code,
+					   struct tree_niter_desc *niter);
+
 
 /* Splits expression EXPR to a variable part VAR and constant OFFSET.  */
 
@@ -2357,7 +2361,7 @@ number_of_iterations_exit_assumptions (struct loop *loop, edge exit,
   tree iv0_niters = NULL_TREE;
   if (!simple_iv_with_niters (loop, loop_containing_stmt (stmt),
 			      op0, &iv0, safe ? &iv0_niters : NULL, false))
-    return false;
+    return number_of_iterations_popcount (loop, exit, code, niter);
   tree iv1_niters = NULL_TREE;
   if (!simple_iv_with_niters (loop, loop_containing_stmt (stmt),
 			      op1, &iv1, safe ? &iv1_niters : NULL, false))
@@ -2430,6 +2434,176 @@ number_of_iterations_exit_assumptions (struct loop *loop, edge exit,
   return (!integer_zerop (niter->assumptions));
 }
 
+
+/* Utility function to check if OP is defined by a stmt
+   that is a val - 1.  */
+
+static bool
+ssa_defined_by_minus_one_stmt_p (tree op, tree val)
+{
+  gimple *stmt;
+  return (TREE_CODE (op) == SSA_NAME
+	  && (stmt = SSA_NAME_DEF_STMT (op))
+	  && is_gimple_assign (stmt)
+	  && (gimple_assign_rhs_code (stmt) == PLUS_EXPR)
+	  && val == gimple_assign_rhs1 (stmt)
+	  && integer_minus_onep (gimple_assign_rhs2 (stmt)));
+}
+
+
+/* See if LOOP is a popcout implementation, determine NITER for the loop
+
+   We match:
+   <bb 2>
+   goto <bb 4>
+
+   <bb 3>
+   _1 = b_11 + -1
+   b_6 = _1 & b_11
+
+   <bb 4>
+   b_11 = PHI <b_5(D)(2), b_6(3)>
+
+   exit block
+   if (b_11 != 0)
+	goto <bb 3>
+   else
+	goto <bb 5>
+
+   OR we match copy-header version:
+   if (b_5 != 0)
+	goto <bb 3>
+   else
+	goto <bb 4>
+
+   <bb 3>
+   b_11 = PHI <b_5(2), b_6(3)>
+   _1 = b_11 + -1
+   b_6 = _1 & b_11
+
+   exit block
+   if (b_6 != 0)
+	goto <bb 3>
+   else
+	goto <bb 4>
+
+   If popcount pattern, update NITER accordingly.
+   i.e., set NITER to  __builtin_popcount (b)
+   return true if we did, false otherwise.
+
+ */
+
+static bool
+number_of_iterations_popcount (loop_p loop, edge exit,
+			       enum tree_code code,
+			       struct tree_niter_desc *niter)
+{
+  bool adjust = true;
+  tree iter;
+  HOST_WIDE_INT max;
+  adjust = true;
+  tree fn = NULL_TREE;
+
+  /* Check loop terminating branch is like
+     if (b != 0).  */
+  gimple *stmt = last_stmt (exit->src);
+  if (!stmt
+      || gimple_code (stmt) != GIMPLE_COND
+      || code != NE_EXPR
+      || !integer_zerop (gimple_cond_rhs (stmt))
+      || TREE_CODE (gimple_cond_lhs (stmt)) != SSA_NAME)
+    return false;
+
+  gimple *and_stmt = SSA_NAME_DEF_STMT (gimple_cond_lhs (stmt));
+
+  /* Depending on copy-header is performed, feeding PHI stmts might be in
+     the loop header or loop latch, handle this.  */
+  if (gimple_code (and_stmt) == GIMPLE_PHI
+      && gimple_bb (and_stmt) == loop->header
+      && gimple_phi_num_args (and_stmt) == 2
+      && (TREE_CODE (gimple_phi_arg_def (and_stmt,
+					 loop_latch_edge (loop)->dest_idx))
+	  == SSA_NAME))
+    {
+      /* SSA used in exit condition is defined by PHI stmt
+	b_11 = PHI <b_5(D)(2), b_6(3)>
+	from the PHI stmt, get the and_stmt
+	b_6 = _1 & b_11.  */
+      tree t = gimple_phi_arg_def (and_stmt, loop_latch_edge (loop)->dest_idx);
+      and_stmt = SSA_NAME_DEF_STMT (t);
+      adjust = false;
+    }
+
+  /* Make sure it is indeed an and stmt (b_6 = _1 & b_11).  */
+  if (!is_gimple_assign (and_stmt)
+      || gimple_assign_rhs_code (and_stmt) != BIT_AND_EXPR)
+    return false;
+
+  tree b_11 = gimple_assign_rhs1 (and_stmt);
+  tree _1 = gimple_assign_rhs2 (and_stmt);
+
+  /* Check that _1 is defined by _b11 + -1 (_1 = b_11 + -1).
+     Also make sure that b_11 is the same in and_stmt and _1 defining stmt.
+     Also canonicalize if _1 and _b11 are revrsed.  */
+  if (ssa_defined_by_minus_one_stmt_p (b_11, _1))
+    std::swap (b_11, _1);
+  else if (ssa_defined_by_minus_one_stmt_p (_1, b_11))
+    ;
+  else
+    return false;
+  /* Check the recurrence:
+   ... = PHI <b_5(2), b_6(3)>.  */
+  gimple *phi = SSA_NAME_DEF_STMT (b_11);
+  if (gimple_code (phi) != GIMPLE_PHI
+      || (gimple_assign_lhs (and_stmt)
+	  != gimple_phi_arg_def (phi, loop_latch_edge (loop)->dest_idx)))
+    return false;
+
+  /* We found a match. Get the corresponding popcount builtin.  */
+  tree src = gimple_phi_arg_def (phi, loop_preheader_edge (loop)->dest_idx);
+  if (TYPE_PRECISION (TREE_TYPE (src)) == TYPE_PRECISION (integer_type_node))
+    fn = builtin_decl_implicit (BUILT_IN_POPCOUNT);
+  else if (TYPE_PRECISION (TREE_TYPE (src)) == TYPE_PRECISION
+	   (long_integer_type_node))
+    fn = builtin_decl_implicit (BUILT_IN_POPCOUNTL);
+  else if (TYPE_PRECISION (TREE_TYPE (src)) == TYPE_PRECISION
+	   (long_long_integer_type_node))
+    fn = builtin_decl_implicit (BUILT_IN_POPCOUNTLL);
+
+  /* ??? Support promoting char/short to int.  */
+  if (!fn)
+    return false;
+
+  /* Update NITER params accordingly  */
+  max = int_cst_value (TYPE_MAX_VALUE (TREE_TYPE (src)));
+  tree utype = unsigned_type_for (TREE_TYPE (src));
+  src = fold_convert (utype, src);
+  tree call = fold_convert (utype, build_call_expr (fn, 1, src));
+  if (adjust)
+    iter = fold_build2 (MINUS_EXPR, utype,
+			call,
+			build_int_cst (utype, 1));
+  else
+    iter = call;
+  if (adjust)
+    max = max - 1;
+
+  niter->niter = iter;
+  niter->assumptions = boolean_true_node;
+  if (adjust)
+    niter->may_be_zero = fold_build2 (EQ_EXPR, boolean_type_node, src,
+				      build_zero_cst
+				      (TREE_TYPE (src)));
+  else
+    niter->may_be_zero = boolean_false_node;
+
+  niter->max = max;
+  niter->bound = NULL_TREE;
+  niter->cmp = ERROR_MARK;
+  return true;
+}
+
+
 /* Like number_of_iterations_exit_assumptions, but return TRUE only if
    the niter information holds unconditionally.  */
 
-- 
2.7.4


  reply	other threads:[~2018-06-12  3:10 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-24 22:17 Kugan Vivekanandarajah
2018-01-25 10:24 ` Richard Biener
2018-01-31 10:41   ` Kugan Vivekanandarajah
2018-01-31 11:05     ` Richard Biener
2018-02-01  4:08       ` Kugan Vivekanandarajah
2018-02-01 12:21         ` Richard Biener
2018-02-08  0:41           ` Kugan Vivekanandarajah
2018-03-05 15:25             ` Richard Biener
2018-03-06 16:20               ` Bin.Cheng
2018-03-07  8:26                 ` Richard Biener
2018-03-07 11:25                   ` Bin.Cheng
2018-03-08 22:07                     ` Kugan Vivekanandarajah
2018-05-17  2:16               ` Kugan Vivekanandarajah
2018-05-17 10:05                 ` Bin.Cheng
2018-05-31  6:52                   ` Kugan Vivekanandarajah
2018-05-31 17:53                     ` Bin.Cheng
2018-06-01  9:57                       ` Kugan Vivekanandarajah
2018-06-01 10:06                         ` Bin.Cheng
2018-06-01 13:12                           ` Richard Biener
2018-06-05  9:02                             ` Kugan Vivekanandarajah
2018-06-05 11:25                               ` Richard Biener
2018-06-07  0:52                                 ` Kugan Vivekanandarajah
2018-06-07  9:21                                   ` Richard Biener
2018-06-12  3:10                                     ` Kugan Vivekanandarajah [this message]
2018-06-14 13:57                                       ` Richard Biener

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAELXzTOGg5zk__gJJX=6_r5TugKYEW9g6ngug3iAUk7Pdq9mrQ@mail.gmail.com' \
    --to=kugan.vivekanandarajah@linaro.org \
    --cc=amker.cheng@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).