From: "Bin.Cheng" <amker.cheng@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>,
GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [RFC][PR82479] missing popcount builtin detection
Date: Tue, 06 Mar 2018 16:20:00 -0000 [thread overview]
Message-ID: <CAHFci29sjfrPBU+mindJpuJOU1KUS0B3mFdFe8JOY978Ro00UA@mail.gmail.com> (raw)
In-Reply-To: <CAFiYyc0E5RmjBw17MfDkr9eEhJfgL_=NhBSD5DmmRw4Gr=LRiw@mail.gmail.com>
On Mon, Mar 5, 2018 at 3:24 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Thu, Feb 8, 2018 at 1:41 AM, Kugan Vivekanandarajah
> <kugan.vivekanandarajah@linaro.org> wrote:
>> Hi Richard,
>>
>> On 1 February 2018 at 23:21, Richard Biener <richard.guenther@gmail.com> wrote:
>>> On Thu, Feb 1, 2018 at 5:07 AM, Kugan Vivekanandarajah
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>> Hi Richard,
>>>>
>>>> On 31 January 2018 at 21:39, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>> On Wed, Jan 31, 2018 at 11:28 AM, Kugan Vivekanandarajah
>>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>> Hi Richard,
>>>>>>
>>>>>> Thanks for the review.
>>>>>> On 25 January 2018 at 20:04, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>>>> On Wed, Jan 24, 2018 at 10:56 PM, Kugan Vivekanandarajah
>>>>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> Here is a patch for popcount builtin detection similar to LLVM. I
>>>>>>>> would like to queue this for review for next stage 1.
>>>>>>>>
>>>>>>>> 1. This is done part of loop-distribution and effective for -O3 and above.
>>>>>>>> 2. This does not distribute loop to detect popcount (like
>>>>>>>> memcpy/memmove). I dont think that happens in practice. Please correct
>>>>>>>> me if I am wrong.
>>>>>>>
>>>>>>> But then it has no business inside loop distribution but instead is
>>>>>>> doing final value
>>>>>>> replacement, right? You are pattern-matching the whole loop after all. I think
>>>>>>> final value replacement would already do the correct thing if you
>>>>>>> teached number of
>>>>>>> iteration analysis that niter for
>>>>>>>
>>>>>>> <bb 3> [local count: 955630224]:
>>>>>>> # b_11 = PHI <b_5(5), b_8(6)>
>>>>>>> _1 = b_11 + -1;
>>>>>>> b_8 = _1 & b_11;
>>>>>>> if (b_8 != 0)
>>>>>>> goto <bb 6>; [89.00%]
>>>>>>> else
>>>>>>> goto <bb 8>; [11.00%]
>>>>>>>
>>>>>>> <bb 6> [local count: 850510900]:
>>>>>>> goto <bb 3>; [100.00%]
>>>>>>
>>>>>> I am looking into this approach. What should be the scalar evolution
>>>>>> for b_8 (i.e. b & (b -1) in a loop) should be? This is not clear to me
>>>>>> and can this be represented with the scev?
>>>>>
>>>>> No, it's not affine and thus cannot be represented. You only need the
>>>>> scalar evolution of the counting IV which is already handled and
>>>>> the number of iteration analysis needs to handle the above IV - this
>>>>> is the missing part.
>>>> Thanks for the clarification. I am now matching this loop pattern in
>>>> number_of_iterations_exit when number_of_iterations_exit_assumptions
>>>> fails. If the pattern matches, I am inserting the _builtin_popcount in
>>>> the loop preheater and setting the loop niter with this. This will be
>>>> used by the final value replacement. Is this what you wanted?
>>>
>>> No, you shouldn't insert a popcount stmt but instead the niter
>>> GENERIC tree should be a CALL_EXPR to popcount with the
>>> appropriate argument.
>>
>> Thats what I tried earlier but ran into some ICEs. I wasn't sure if
>> niter in tree_niter_desc can take such.
>>
>> Attached patch now does this. Also had to add support for CALL_EXPR in
>> few places to handle niter with CALL_EXPR. Does this look OK?
>
> Overall this looks ok - the patch includes changes in places that I don't think
> need changes such as chrec_convert_1 or extract_ops_from_tree.
> The expression_expensive_p change should be more specific than making
> all calls inexpensive as well.
>
> The verify_ssa change looks bogus, you do
>
> + dest = gimple_phi_result (count_phi);
> + tree var = make_ssa_name (TREE_TYPE (dest), NULL);
> + tree fn = builtin_decl_implicit (BUILT_IN_POPCOUNT);
> +
> + var = build_call_expr (fn, 1, src);
> + *niter = fold_build2 (MINUS_EXPR, TREE_TYPE (dest), var,
> + build_int_cst (TREE_TYPE (dest), 1));
>
> why do you allocate a new SSA name here? It seems unused
> as you overwrive 'var' with the CALL_EXPR immediately.
>
> I didn't review the pattern matching thoroughly nor the exact place you
> call it. But
>
> + if (check_popcount_pattern (loop, &count))
> + {
> + niter->assumptions = boolean_false_node;
> + niter->control.base = NULL_TREE;
> + niter->control.step = NULL_TREE;
> + niter->control.no_overflow = false;
> + niter->niter = count;
> + niter->assumptions = boolean_true_node;
> + niter->may_be_zero = boolean_false_node;
> + niter->max = -1;
> + niter->bound = NULL_TREE;
> + niter->cmp = ERROR_MARK;
> + return true;
> + }
>
> simply setting may_be_zero to false looks fishy. Try
> with -fno-tree-loop-ch. Also max should not be negative,
> it should be the number of bits in the IV type?
>
> A related testcase could be that we can completely peel
> a loop like the following which iterates at most 8 times:
>
> int a[8];
> void foo (unsigned char ctrl)
> {
> int c = 0;
> while (ctrl)
> {
> ctrl = ctrl & (ctrl - 1);
> a[c++] = ctrl;
> }
> }
>
> This is now stage1 material so please update and re-post. Maybe Bin has
> further suggestions as well.
Sorry for being late on this. Actually I thought about popcount in
distribution before. More like the first patch, but handled as
another distribution pattern rather than a special case. It's a bit
strange to compute and store the info in niters. It's also not
straight forward when/where the transformation is finally done.
I haven't looked into the details so not sure how appropriate it will
be as a distribution pattern (current ones are only about data
references). So I am okay with this version if it's more appropriate.
Thanks,
bin
>
> Thanks,
> Richard.
>
>> Thanks,
>> Kugan
>>
>>
>> gcc/ChangeLog:
>>
>> 2018-02-08 Kugan Vivekanandarajah <kuganv@linaro.org>
>>
>> * gimple-expr.c (extract_ops_from_tree): Handle CALL_EXPR.
>> * tree-chrec.c (chrec_convert_1): Likewise.
>> * tree-scalar-evolution.c (expression_expensive_p): Likewise.
>> * tree-ssa-loop-ivopts.c (contains_abnormal_ssa_name_p): Likewise.
>> * tree-ssa-loop-niter.c (check_popcount_pattern): New.
>> (number_of_iterations_exit): Record niter for popcount patern.
>> * tree-ssa.c (verify_ssa): Check stmt to be non NULL.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-02-08 Kugan Vivekanandarajah <kuganv@linaro.org>
>>
>> * gcc.dg/tree-ssa/popcount.c: New test.
>>
>>
>>>
>>>> In general, there is also a condition gating the loop like
>>>>
>>>> if (b_4 != 0)
>>>> goto loop;
>>>> else
>>>> end:
>>>>
>>>> This of course will not be removed by final value replacement. Since
>>>> popcount (0) is defined, this is redundant and could be removed but
>>>> not removed.
>>>
>>> Yeah, that's probably sth for another pass though. I suppose the
>>> end: case just uses zero in which case you'll have a PHI and you
>>> can optimize
>>>
>>> if (b != 0)
>>> return popcount (b);
>>> return 0;
>>>
>>> in phiopt.
>>>
>>> Richard.
>>>
>>>> Thanks,
>>>> Kugan
>>>>
>>>>>
>>>>> Richard.
>>>>>
>>>>>> Thanks,
>>>>>> Kugan
>>>>>>>
>>>>>>> is related to popcount (b_5).
>>>>>>>
>>>>>>> Richard.
>>>>>>>
>>>>>>>> Bootstrapped and regression tested on aarch64-linux-gnu with no new regressions.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Kugan
>>>>>>>>
>>>>>>>> gcc/ChangeLog:
>>>>>>>>
>>>>>>>> 2018-01-25 Kugan Vivekanandarajah <kuganv@linaro.org>
>>>>>>>>
>>>>>>>> PR middle-end/82479
>>>>>>>> * tree-loop-distribution.c (handle_popcount): New.
>>>>>>>> (pass_loop_distribution::execute): Use handle_popcount.
>>>>>>>>
>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>
>>>>>>>> 2018-01-25 Kugan Vivekanandarajah <kuganv@linaro.org>
>>>>>>>>
>>>>>>>> PR middle-end/82479
>>>>>>>> * gcc.dg/tree-ssa/popcount.c: New test.
next prev parent reply other threads:[~2018-03-06 16:20 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-24 22:17 Kugan Vivekanandarajah
2018-01-25 10:24 ` Richard Biener
2018-01-31 10:41 ` Kugan Vivekanandarajah
2018-01-31 11:05 ` Richard Biener
2018-02-01 4:08 ` Kugan Vivekanandarajah
2018-02-01 12:21 ` Richard Biener
2018-02-08 0:41 ` Kugan Vivekanandarajah
2018-03-05 15:25 ` Richard Biener
2018-03-06 16:20 ` Bin.Cheng [this message]
2018-03-07 8:26 ` Richard Biener
2018-03-07 11:25 ` Bin.Cheng
2018-03-08 22:07 ` Kugan Vivekanandarajah
2018-05-17 2:16 ` Kugan Vivekanandarajah
2018-05-17 10:05 ` Bin.Cheng
2018-05-31 6:52 ` Kugan Vivekanandarajah
2018-05-31 17:53 ` Bin.Cheng
2018-06-01 9:57 ` Kugan Vivekanandarajah
2018-06-01 10:06 ` Bin.Cheng
2018-06-01 13:12 ` Richard Biener
2018-06-05 9:02 ` Kugan Vivekanandarajah
2018-06-05 11:25 ` Richard Biener
2018-06-07 0:52 ` Kugan Vivekanandarajah
2018-06-07 9:21 ` Richard Biener
2018-06-12 3:10 ` Kugan Vivekanandarajah
2018-06-14 13:57 ` Richard Biener
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAHFci29sjfrPBU+mindJpuJOU1KUS0B3mFdFe8JOY978Ro00UA@mail.gmail.com \
--to=amker.cheng@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=kugan.vivekanandarajah@linaro.org \
--cc=richard.guenther@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).