public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/90693] Missing popcount simplifications
       [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
@ 2021-08-10 23:15 ` pinskia at gcc dot gnu.org
  2023-11-20  9:04 ` cvs-commit at gcc dot gnu.org
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-08-10 23:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
           Keywords|                            |missed-optimization
          Component|middle-end                  |tree-optimization

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/90693] Missing popcount simplifications
       [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
  2021-08-10 23:15 ` [Bug tree-optimization/90693] Missing popcount simplifications pinskia at gcc dot gnu.org
@ 2023-11-20  9:04 ` cvs-commit at gcc dot gnu.org
  2023-11-20  9:04 ` cvs-commit at gcc dot gnu.org
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-20  9:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:d0b6b7f8a6b8115b033441590a6304fb088d193c

commit r14-5612-gd0b6b7f8a6b8115b033441590a6304fb088d193c
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Nov 20 10:00:09 2023 +0100

    tree-ssa-math-opts: popcount (X) == 1 to (X ^ (X - 1)) > (X - 1)
optimization [PR90693]

    Per the earlier discussions on this PR, the following patch folds
    popcount (x) == 1 (and != 1) into (x ^ (x - 1)) > x - 1 (or <=)
    if the corresponding popcount optab isn't implemented (I think any
    double-word popcount or call will be necessarily slower than the
    above cheap 3 op check and even for -Os larger or same size).

    I've noticed e.g. C++ aligned new starts with std::has_single_bit
    which does popcount (x) == 1.

    As a follow-up, I'm considering changing in this routine the popcount
    call to IFN_POPCOUNT with 2 arguments and during expansion test costs.

    2023-11-20  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/90693
            * tree-ssa-math-opts.cc (match_single_bit_test): New function.
            (math_opts_dom_walker::after_dom_children): Call it for EQ_EXPR
            and NE_EXPR assignments and GIMPLE_CONDs.

            * gcc.target/i386/pr90693.c: New test.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/90693] Missing popcount simplifications
       [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
  2021-08-10 23:15 ` [Bug tree-optimization/90693] Missing popcount simplifications pinskia at gcc dot gnu.org
  2023-11-20  9:04 ` cvs-commit at gcc dot gnu.org
@ 2023-11-20  9:04 ` cvs-commit at gcc dot gnu.org
  2023-11-20 14:10 ` wilco at gcc dot gnu.org
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-20  9:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:103a3966bc7b68f91b6cd3c5beb330c4b8570c90

commit r14-5613-g103a3966bc7b68f91b6cd3c5beb330c4b8570c90
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Nov 20 10:03:20 2023 +0100

    tree-ssa-math-opts: popcount (X) == 1 to (X ^ (X - 1)) > (X - 1)
optimization for direct optab [PR90693]

    On Fri, Nov 17, 2023 at 03:01:04PM +0100, Jakub Jelinek wrote:
    > As a follow-up, I'm considering changing in this routine the popcount
    > call to IFN_POPCOUNT with 2 arguments and during expansion test costs.

    Here is the follow-up which does the rtx costs testing.

    2023-11-20  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/90693
            * tree-ssa-math-opts.cc (match_single_bit_test): Mark POPCOUNT with
            result only used in equality comparison against 1 with direct optab
            support as .POPCOUNT call with 2 arguments.
            * internal-fn.h (expand_POPCOUNT): Declare.
            * internal-fn.def (DEF_INTERNAL_INT_EXT_FN): New macro, document
it,
            undefine at the end.
            (POPCOUNT): Use it instead of DEF_INTERNAL_INT_FN.
            * internal-fn.cc (DEF_INTERNAL_INT_EXT_FN): Define to nothing
before
            inclusion to define expanders.
            (expand_POPCOUNT): New function.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/90693] Missing popcount simplifications
       [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2023-11-20  9:04 ` cvs-commit at gcc dot gnu.org
@ 2023-11-20 14:10 ` wilco at gcc dot gnu.org
  2023-11-20 17:01 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 12+ messages in thread
From: wilco at gcc dot gnu.org @ 2023-11-20 14:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693

--- Comment #6 from Wilco <wilco at gcc dot gnu.org> ---
Thanks Jakub - with the 2nd patch we get the expected sequence on AArch64:

        sub     x1, x0, #1
        eor     x0, x0, x1
        cmp     x0, x1
        cset    x0, hi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/90693] Missing popcount simplifications
       [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2023-11-20 14:10 ` wilco at gcc dot gnu.org
@ 2023-11-20 17:01 ` jakub at gcc dot gnu.org
  2024-01-01 12:41 ` piotrsiupa at gmail dot com
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-11-20 17:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I guess now the important question is if what we produce with the patch is
faster (or at least not slower) than what we used to generate on all targets;
though, if that isn't the case, best fix might be to adjust target costs.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/90693] Missing popcount simplifications
       [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2023-11-20 17:01 ` jakub at gcc dot gnu.org
@ 2024-01-01 12:41 ` piotrsiupa at gmail dot com
  2024-01-03 15:34 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 12+ messages in thread
From: piotrsiupa at gmail dot com @ 2024-01-01 12:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693

Piotr Siupa <piotrsiupa at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |piotrsiupa at gmail dot com

--- Comment #8 from Piotr Siupa <piotrsiupa at gmail dot com> ---
I did a benchmark and (x & (x-1)) == 0 and it seems to be about 2x as fast as
the currently generated code (at least on my AMD64 processor).

Maybe it should be used if x is guaranteed to not be zero, e.g. if (x == 0)
std::unreachable().

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/90693] Missing popcount simplifications
       [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2024-01-01 12:41 ` piotrsiupa at gmail dot com
@ 2024-01-03 15:34 ` jakub at gcc dot gnu.org
  2024-01-05 10:17 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-01-03 15:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 56984
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56984&action=edit
gcc14-pr90693.patch

Untested patch to do that.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/90693] Missing popcount simplifications
       [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2024-01-03 15:34 ` jakub at gcc dot gnu.org
@ 2024-01-05 10:17 ` cvs-commit at gcc dot gnu.org
  2024-01-07 12:56 ` piotrsiupa at gmail dot com
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-01-05 10:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693

--- Comment #10 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:0152637c74c9eab7658483b1cca4e3d584dd4262

commit r14-6940-g0152637c74c9eab7658483b1cca4e3d584dd4262
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Jan 5 11:16:58 2024 +0100

    Improve __builtin_popcount* (x) == 1 generation if x is known != 0
[PR90693]

    We expand __builtin_popcount* (x) == 1 as
    x ^ (x - 1) > x - 1, either unconditionally in tree-ssa-math-opts.cc
    if we don't have a direct optab support for popcount, or during
    expansion where we compare the costs of comparison of the popcount
    against one vs. the above expression.
    As mentioned in the PR, if we know from ranger that the argument is
    not zero, we can emit x & (x - 1) == 0 test which is same number of
    GIMPLE statements, but on many targets cheaper (e.g. whenever an AND
    instruction can also set flags on whether result was zero or not).

    The following patch does that.

    2024-01-05  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/90693
            * tree-ssa-math-opts.cc (match_single_bit_test): If
            tree_expr_nonzero_p (arg), remember it in the second argument to
            IFN_POPCOUNT or lower it as arg & (arg - 1) == 0 rather than
            arg ^ (arg - 1) > arg - 1.
            * internal-fn.cc (expand_POPCOUNT): If second argument to
            IFN_POPCOUNT suggests arg is non-zero, try to expand it as
            arg & (arg - 1) == 0 rather than arg ^ (arg - 1) > arg - 1.

            * gcc.target/i386/pr90693-2.c: New test.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/90693] Missing popcount simplifications
       [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2024-01-05 10:17 ` cvs-commit at gcc dot gnu.org
@ 2024-01-07 12:56 ` piotrsiupa at gmail dot com
  2024-03-04  7:42 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 12+ messages in thread
From: piotrsiupa at gmail dot com @ 2024-01-07 12:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693

--- Comment #11 from Piotr Siupa <piotrsiupa at gmail dot com> ---
Thanks! Now the generated assembly is one instruction shorter.

It works for:
bool foo(unsigned x)
{
    [[assume(x != 0)]];
    return std::has_single_bit(x);
}
and for:
bool foo(unsigned x)
{
    if (x == 0)
        std::unreachable();
    else
        return std::has_single_bit(x);
}

However, I've noticed that:
bool foo(unsigned x)
{
    if (x == 0)
        return true;
    else
        return std::has_single_bit(x);
}
still uses (X ^ (X - 1)) > (X - 1).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/90693] Missing popcount simplifications
       [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
                   ` (8 preceding siblings ...)
  2024-01-07 12:56 ` piotrsiupa at gmail dot com
@ 2024-03-04  7:42 ` pinskia at gcc dot gnu.org
  2024-03-04 16:41 ` pinskia at gcc dot gnu.org
  2024-03-04 22:55 ` pinskia at gcc dot gnu.org
  11 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-04  7:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693

--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Piotr Siupa from comment #11)
> However, I've noticed that:
> bool foo(unsigned x)
> {
>     if (x == 0)
>         return true;
>     else
>         return std::has_single_bit(x);
> }


Oh that is because expand does not use flow sensitive ranges/non-zero bits
there. There is talk about adding the ability for that but nothing has been
done yet.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/90693] Missing popcount simplifications
       [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
                   ` (9 preceding siblings ...)
  2024-03-04  7:42 ` pinskia at gcc dot gnu.org
@ 2024-03-04 16:41 ` pinskia at gcc dot gnu.org
  2024-03-04 22:55 ` pinskia at gcc dot gnu.org
  11 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-04 16:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693

--- Comment #13 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #12)
> (In reply to Piotr Siupa from comment #11)
> > However, I've noticed that:
> > bool foo(unsigned x)
> > {
> >     if (x == 0)
> >         return true;
> >     else
> >         return std::has_single_bit(x);
> > }
> 
> 
> Oh that is because expand does not use flow sensitive ranges/non-zero bits
> there. There is talk about adding the ability for that but nothing has been
> done yet.

Well that also should be transformed into `__builtin_popcount(a) <= 1` which
then gets expanded into `(v & (v - 1)) == 0`. I will be handling both of those
via PR 94787 .

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Bug tree-optimization/90693] Missing popcount simplifications
       [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
                   ` (10 preceding siblings ...)
  2024-03-04 16:41 ` pinskia at gcc dot gnu.org
@ 2024-03-04 22:55 ` pinskia at gcc dot gnu.org
  11 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-04 22:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|wilco at gcc dot gnu.org           |pinskia at gcc dot gnu.org

--- Comment #14 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> __builtin_popcount (x) == 1 into x == (x & -x)

Actually that should be `__builtin_popcount (x) <= 1`

Anyways I am going to implement the rest here due to PR 94787 .

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-03-04 22:55 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
2021-08-10 23:15 ` [Bug tree-optimization/90693] Missing popcount simplifications pinskia at gcc dot gnu.org
2023-11-20  9:04 ` cvs-commit at gcc dot gnu.org
2023-11-20  9:04 ` cvs-commit at gcc dot gnu.org
2023-11-20 14:10 ` wilco at gcc dot gnu.org
2023-11-20 17:01 ` jakub at gcc dot gnu.org
2024-01-01 12:41 ` piotrsiupa at gmail dot com
2024-01-03 15:34 ` jakub at gcc dot gnu.org
2024-01-05 10:17 ` cvs-commit at gcc dot gnu.org
2024-01-07 12:56 ` piotrsiupa at gmail dot com
2024-03-04  7:42 ` pinskia at gcc dot gnu.org
2024-03-04 16:41 ` pinskia at gcc dot gnu.org
2024-03-04 22:55 ` pinskia at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).