public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/90693] Missing popcount simplifications
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
@ 2021-08-10 23:15 ` pinskia at gcc dot gnu.org
2023-11-20 9:04 ` cvs-commit at gcc dot gnu.org
` (10 subsequent siblings)
11 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-08-10 23:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |enhancement
Keywords| |missed-optimization
Component|middle-end |tree-optimization
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug tree-optimization/90693] Missing popcount simplifications
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
2021-08-10 23:15 ` [Bug tree-optimization/90693] Missing popcount simplifications pinskia at gcc dot gnu.org
@ 2023-11-20 9:04 ` cvs-commit at gcc dot gnu.org
2023-11-20 9:04 ` cvs-commit at gcc dot gnu.org
` (9 subsequent siblings)
11 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-20 9:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:d0b6b7f8a6b8115b033441590a6304fb088d193c
commit r14-5612-gd0b6b7f8a6b8115b033441590a6304fb088d193c
Author: Jakub Jelinek <jakub@redhat.com>
Date: Mon Nov 20 10:00:09 2023 +0100
tree-ssa-math-opts: popcount (X) == 1 to (X ^ (X - 1)) > (X - 1)
optimization [PR90693]
Per the earlier discussions on this PR, the following patch folds
popcount (x) == 1 (and != 1) into (x ^ (x - 1)) > x - 1 (or <=)
if the corresponding popcount optab isn't implemented (I think any
double-word popcount or call will be necessarily slower than the
above cheap 3 op check and even for -Os larger or same size).
I've noticed e.g. C++ aligned new starts with std::has_single_bit
which does popcount (x) == 1.
As a follow-up, I'm considering changing in this routine the popcount
call to IFN_POPCOUNT with 2 arguments and during expansion test costs.
2023-11-20 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/90693
* tree-ssa-math-opts.cc (match_single_bit_test): New function.
(math_opts_dom_walker::after_dom_children): Call it for EQ_EXPR
and NE_EXPR assignments and GIMPLE_CONDs.
* gcc.target/i386/pr90693.c: New test.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug tree-optimization/90693] Missing popcount simplifications
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
2021-08-10 23:15 ` [Bug tree-optimization/90693] Missing popcount simplifications pinskia at gcc dot gnu.org
2023-11-20 9:04 ` cvs-commit at gcc dot gnu.org
@ 2023-11-20 9:04 ` cvs-commit at gcc dot gnu.org
2023-11-20 14:10 ` wilco at gcc dot gnu.org
` (8 subsequent siblings)
11 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-20 9:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:103a3966bc7b68f91b6cd3c5beb330c4b8570c90
commit r14-5613-g103a3966bc7b68f91b6cd3c5beb330c4b8570c90
Author: Jakub Jelinek <jakub@redhat.com>
Date: Mon Nov 20 10:03:20 2023 +0100
tree-ssa-math-opts: popcount (X) == 1 to (X ^ (X - 1)) > (X - 1)
optimization for direct optab [PR90693]
On Fri, Nov 17, 2023 at 03:01:04PM +0100, Jakub Jelinek wrote:
> As a follow-up, I'm considering changing in this routine the popcount
> call to IFN_POPCOUNT with 2 arguments and during expansion test costs.
Here is the follow-up which does the rtx costs testing.
2023-11-20 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/90693
* tree-ssa-math-opts.cc (match_single_bit_test): Mark POPCOUNT with
result only used in equality comparison against 1 with direct optab
support as .POPCOUNT call with 2 arguments.
* internal-fn.h (expand_POPCOUNT): Declare.
* internal-fn.def (DEF_INTERNAL_INT_EXT_FN): New macro, document
it,
undefine at the end.
(POPCOUNT): Use it instead of DEF_INTERNAL_INT_FN.
* internal-fn.cc (DEF_INTERNAL_INT_EXT_FN): Define to nothing
before
inclusion to define expanders.
(expand_POPCOUNT): New function.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug tree-optimization/90693] Missing popcount simplifications
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
` (2 preceding siblings ...)
2023-11-20 9:04 ` cvs-commit at gcc dot gnu.org
@ 2023-11-20 14:10 ` wilco at gcc dot gnu.org
2023-11-20 17:01 ` jakub at gcc dot gnu.org
` (7 subsequent siblings)
11 siblings, 0 replies; 12+ messages in thread
From: wilco at gcc dot gnu.org @ 2023-11-20 14:10 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
--- Comment #6 from Wilco <wilco at gcc dot gnu.org> ---
Thanks Jakub - with the 2nd patch we get the expected sequence on AArch64:
sub x1, x0, #1
eor x0, x0, x1
cmp x0, x1
cset x0, hi
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug tree-optimization/90693] Missing popcount simplifications
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
` (3 preceding siblings ...)
2023-11-20 14:10 ` wilco at gcc dot gnu.org
@ 2023-11-20 17:01 ` jakub at gcc dot gnu.org
2024-01-01 12:41 ` piotrsiupa at gmail dot com
` (6 subsequent siblings)
11 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-11-20 17:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I guess now the important question is if what we produce with the patch is
faster (or at least not slower) than what we used to generate on all targets;
though, if that isn't the case, best fix might be to adjust target costs.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug tree-optimization/90693] Missing popcount simplifications
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
` (4 preceding siblings ...)
2023-11-20 17:01 ` jakub at gcc dot gnu.org
@ 2024-01-01 12:41 ` piotrsiupa at gmail dot com
2024-01-03 15:34 ` jakub at gcc dot gnu.org
` (5 subsequent siblings)
11 siblings, 0 replies; 12+ messages in thread
From: piotrsiupa at gmail dot com @ 2024-01-01 12:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
Piotr Siupa <piotrsiupa at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |piotrsiupa at gmail dot com
--- Comment #8 from Piotr Siupa <piotrsiupa at gmail dot com> ---
I did a benchmark and (x & (x-1)) == 0 and it seems to be about 2x as fast as
the currently generated code (at least on my AMD64 processor).
Maybe it should be used if x is guaranteed to not be zero, e.g. if (x == 0)
std::unreachable().
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug tree-optimization/90693] Missing popcount simplifications
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
` (5 preceding siblings ...)
2024-01-01 12:41 ` piotrsiupa at gmail dot com
@ 2024-01-03 15:34 ` jakub at gcc dot gnu.org
2024-01-05 10:17 ` cvs-commit at gcc dot gnu.org
` (4 subsequent siblings)
11 siblings, 0 replies; 12+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-01-03 15:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 56984
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56984&action=edit
gcc14-pr90693.patch
Untested patch to do that.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug tree-optimization/90693] Missing popcount simplifications
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
` (6 preceding siblings ...)
2024-01-03 15:34 ` jakub at gcc dot gnu.org
@ 2024-01-05 10:17 ` cvs-commit at gcc dot gnu.org
2024-01-07 12:56 ` piotrsiupa at gmail dot com
` (3 subsequent siblings)
11 siblings, 0 replies; 12+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-01-05 10:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
--- Comment #10 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:0152637c74c9eab7658483b1cca4e3d584dd4262
commit r14-6940-g0152637c74c9eab7658483b1cca4e3d584dd4262
Author: Jakub Jelinek <jakub@redhat.com>
Date: Fri Jan 5 11:16:58 2024 +0100
Improve __builtin_popcount* (x) == 1 generation if x is known != 0
[PR90693]
We expand __builtin_popcount* (x) == 1 as
x ^ (x - 1) > x - 1, either unconditionally in tree-ssa-math-opts.cc
if we don't have a direct optab support for popcount, or during
expansion where we compare the costs of comparison of the popcount
against one vs. the above expression.
As mentioned in the PR, if we know from ranger that the argument is
not zero, we can emit x & (x - 1) == 0 test which is same number of
GIMPLE statements, but on many targets cheaper (e.g. whenever an AND
instruction can also set flags on whether result was zero or not).
The following patch does that.
2024-01-05 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/90693
* tree-ssa-math-opts.cc (match_single_bit_test): If
tree_expr_nonzero_p (arg), remember it in the second argument to
IFN_POPCOUNT or lower it as arg & (arg - 1) == 0 rather than
arg ^ (arg - 1) > arg - 1.
* internal-fn.cc (expand_POPCOUNT): If second argument to
IFN_POPCOUNT suggests arg is non-zero, try to expand it as
arg & (arg - 1) == 0 rather than arg ^ (arg - 1) > arg - 1.
* gcc.target/i386/pr90693-2.c: New test.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug tree-optimization/90693] Missing popcount simplifications
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
` (7 preceding siblings ...)
2024-01-05 10:17 ` cvs-commit at gcc dot gnu.org
@ 2024-01-07 12:56 ` piotrsiupa at gmail dot com
2024-03-04 7:42 ` pinskia at gcc dot gnu.org
` (2 subsequent siblings)
11 siblings, 0 replies; 12+ messages in thread
From: piotrsiupa at gmail dot com @ 2024-01-07 12:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
--- Comment #11 from Piotr Siupa <piotrsiupa at gmail dot com> ---
Thanks! Now the generated assembly is one instruction shorter.
It works for:
bool foo(unsigned x)
{
[[assume(x != 0)]];
return std::has_single_bit(x);
}
and for:
bool foo(unsigned x)
{
if (x == 0)
std::unreachable();
else
return std::has_single_bit(x);
}
However, I've noticed that:
bool foo(unsigned x)
{
if (x == 0)
return true;
else
return std::has_single_bit(x);
}
still uses (X ^ (X - 1)) > (X - 1).
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug tree-optimization/90693] Missing popcount simplifications
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
` (8 preceding siblings ...)
2024-01-07 12:56 ` piotrsiupa at gmail dot com
@ 2024-03-04 7:42 ` pinskia at gcc dot gnu.org
2024-03-04 16:41 ` pinskia at gcc dot gnu.org
2024-03-04 22:55 ` pinskia at gcc dot gnu.org
11 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-04 7:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Piotr Siupa from comment #11)
> However, I've noticed that:
> bool foo(unsigned x)
> {
> if (x == 0)
> return true;
> else
> return std::has_single_bit(x);
> }
Oh that is because expand does not use flow sensitive ranges/non-zero bits
there. There is talk about adding the ability for that but nothing has been
done yet.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug tree-optimization/90693] Missing popcount simplifications
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
` (9 preceding siblings ...)
2024-03-04 7:42 ` pinskia at gcc dot gnu.org
@ 2024-03-04 16:41 ` pinskia at gcc dot gnu.org
2024-03-04 22:55 ` pinskia at gcc dot gnu.org
11 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-04 16:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
--- Comment #13 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #12)
> (In reply to Piotr Siupa from comment #11)
> > However, I've noticed that:
> > bool foo(unsigned x)
> > {
> > if (x == 0)
> > return true;
> > else
> > return std::has_single_bit(x);
> > }
>
>
> Oh that is because expand does not use flow sensitive ranges/non-zero bits
> there. There is talk about adding the ability for that but nothing has been
> done yet.
Well that also should be transformed into `__builtin_popcount(a) <= 1` which
then gets expanded into `(v & (v - 1)) == 0`. I will be handling both of those
via PR 94787 .
^ permalink raw reply [flat|nested] 12+ messages in thread
* [Bug tree-optimization/90693] Missing popcount simplifications
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
` (10 preceding siblings ...)
2024-03-04 16:41 ` pinskia at gcc dot gnu.org
@ 2024-03-04 22:55 ` pinskia at gcc dot gnu.org
11 siblings, 0 replies; 12+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-04 22:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|wilco at gcc dot gnu.org |pinskia at gcc dot gnu.org
--- Comment #14 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> __builtin_popcount (x) == 1 into x == (x & -x)
Actually that should be `__builtin_popcount (x) <= 1`
Anyways I am going to implement the rest here due to PR 94787 .
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-03-04 22:55 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <bug-90693-4@http.gcc.gnu.org/bugzilla/>
2021-08-10 23:15 ` [Bug tree-optimization/90693] Missing popcount simplifications pinskia at gcc dot gnu.org
2023-11-20 9:04 ` cvs-commit at gcc dot gnu.org
2023-11-20 9:04 ` cvs-commit at gcc dot gnu.org
2023-11-20 14:10 ` wilco at gcc dot gnu.org
2023-11-20 17:01 ` jakub at gcc dot gnu.org
2024-01-01 12:41 ` piotrsiupa at gmail dot com
2024-01-03 15:34 ` jakub at gcc dot gnu.org
2024-01-05 10:17 ` cvs-commit at gcc dot gnu.org
2024-01-07 12:56 ` piotrsiupa at gmail dot com
2024-03-04 7:42 ` pinskia at gcc dot gnu.org
2024-03-04 16:41 ` pinskia at gcc dot gnu.org
2024-03-04 22:55 ` pinskia at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).