public inbox for gcc-bugs@sourceware.org
* [Bug target/104357] New: [Aarch64] Failure to use csinv instead of mvn+csel where possible
@ 2022-02-02 23:40 gabravier at gmail dot com
2022-02-02 23:51 ` [Bug tree-optimization/104357] " pinskia at gcc dot gnu.org
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: gabravier at gmail dot com @ 2022-02-02 23:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104357
Bug ID: 104357
Summary: [Aarch64] Failure to use csinv instead of mvn+csel
where possible
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
unsigned char stbi__clamp(int x)
{
    if ((unsigned)x > 255) {
        if (x < 0) return 0;
        if (x > 255) return 255;
    }
    return x;
}
With -O3, GCC outputs this (on aarch64):
stbi__clamp(int):
        mvn     w1, w0
        cmp     w0, 256
        and     w0, w0, 255
        asr     w1, w1, 31
        and     w1, w1, 255
        csel    w0, w0, w1, cc
        ret
LLVM instead outputs this:
stbi__clamp(int):
        asr     w8, w0, #31
        cmp     w0, #255
        csinv   w0, w0, w8, ls
        ret
I don't know if the `and`s are there because of ABI differences, but it seems
to me like the `mvn` can definitely be replaced by using `csinv` instead of
`csel`.
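
For reference, `csinv` selects its first source operand when the condition holds and the bitwise inverse of the second otherwise. The sketch below models the LLVM sequence above in C. It is an illustration, not compiler output: the names `csinv_model` and `clamp_like_llvm` are made up, and it assumes a 32-bit `int` and an arithmetic right shift of negative values (what GCC and Clang implement).

```c
#include <stdint.h>

/* Hypothetical model of AArch64 csinv: cond ? a : ~b. */
static inline uint32_t csinv_model(int cond, uint32_t a, uint32_t b)
{
    return cond ? a : ~b;
}

/* Mirrors the LLVM sequence:
 *   asr   w8, w0, #31      all ones if x < 0, else zero
 *   cmp   w0, #255
 *   csinv w0, w0, w8, ls   x if (unsigned)x <= 255, else ~w8
 * Assumes 32-bit int and arithmetic >> on negative values. */
unsigned char clamp_like_llvm(int x)
{
    uint32_t sign = (uint32_t)(x >> 31);
    uint32_t r = csinv_model((unsigned)x <= 255u, (uint32_t)x, sign);
    return (unsigned char)r;   /* caller-visible result is the low 8 bits */
}
```

Negative inputs give `~0xFFFFFFFF = 0`, inputs above 255 give `~0 = 0xFFFFFFFF`, which truncates to 255, and in-range inputs pass through unchanged.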
* [Bug tree-optimization/104357] [Aarch64] Failure to use csinv instead of mvn+csel where possible
From: pinskia at gcc dot gnu.org @ 2022-02-02 23:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104357
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-02-02
          Component|target                      |tree-optimization
     Ever confirmed|0                           |1
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This will get GCC closer to what clang/LLVM produces:
unsigned char stbi__clamp(int x)
{
    int t = x;
    if ((unsigned)x > 255) {
        if (x < 0) t = 0;
        else if (x > 255) t = -1;
    }
    return t;
}
---- CUT ----
The zero-extends are there because the cast is not outside of the csel, and the
RTL level is not really good at cross-basic-block optimizations.
The gimple level looks like:
  <bb 2> [local count: 1073741824]:
  x.0_1 = (unsigned int) x_3(D);
  if (x.0_1 > 255)
    goto <bb 3>; [50.00%]
  else
    goto <bb 4>; [50.00%]

  <bb 3> [local count: 536870913]:
  _7 = x_3(D) >= 0;
  _6 = (unsigned char) _7;
  _8 = -_6;
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 536870913]:
  _4 = (unsigned char) x_3(D);

  <bb 5> [local count: 1073741824]:
  # _2 = PHI <_8(3), _4(4)>
  return _2;
In theory this could be improved to what I gave above.
The gimple level has no knowledge of the RTL/target level, where doing this in
unsigned still needs a zero extend.
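
Read as C, the gimple dump above is roughly the following. This is a hand-written sketch for illustration only (the function name `clamp_like_gimple` and the local names are made up). It shows that the narrowing to `unsigned char` happens separately in each arm of the branch, before the PHI, rather than once after the select.

```c
/* Hypothetical C rendering of the gimple dump; comments name the SSA temporaries. */
unsigned char clamp_like_gimple(int x)
{
    unsigned char result;
    if ((unsigned int)x > 255) {                 /* bb 2 -> bb 3 */
        int ge = (x >= 0);                       /* _7: 1 if x > 255, 0 if x < 0 */
        unsigned char c = (unsigned char)ge;     /* _6 */
        result = (unsigned char)-c;              /* _8: 255 or 0 */
    } else {                                     /* bb 2 -> bb 4 */
        result = (unsigned char)x;               /* _4 */
    }
    return result;                               /* bb 5: PHI <_8(3), _4(4)> */
}
```

Because both arms already produce an 8-bit value, any later code that wants the result in a full register has to extend each arm on its own.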
* [Bug tree-optimization/104357] [Aarch64] Failure to use csinv instead of mvn+csel where possible
From: pinskia at gcc dot gnu.org @ 2022-02-03 0:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104357
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
           Assignee|unassigned at gcc dot gnu.org|pinskia at gcc dot gnu.org
             Status|NEW                         |ASSIGNED
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
One thing I should note:
_7 = x_3(D) >= 0;
_6 = (unsigned char) _7;
_8 = -_6;
Should be done on the gimple level as:
t = x_3(D) >> (sizeof(x_3(D))*8 - 1)
_8 = (unsigned char)t;
And then we can factor out the cast and I think it will produce the same code.
And yes it does, that is:
unsigned char stbi__clamp(int x)
{
    int t = x;
    if ((unsigned)x > 255) {
        t = x >> 31;
    }
    return t;
}
So Mine for GCC 13.
* [Bug tree-optimization/104357] Failure to use csinv instead of mvn+csel where possible
From: pinskia at gcc dot gnu.org @ 2022-02-03 0:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104357
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[Aarch64] Failure to use    |Failure to use csinv
                   |csinv instead of mvn+csel   |instead of mvn+csel where
                   |where possible              |possible
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Oh, by the way, it looks like LLVM does not do a good job on x86 either. But
with my idea, GCC will do even better than LLVM on x86.
* [Bug tree-optimization/104357] Failure to use csinv instead of mvn+csel where possible
From: pinskia at gcc dot gnu.org @ 2023-05-04 5:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104357
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> One thing I should note:
> _7 = x_3(D) >= 0;
> _6 = (unsigned char) _7;
> _8 = -_6;
>
> Should be done on the gimple level as:
> t = x_3(D) >> (sizeof(x_3(D))*8 - 1)
> _8 = (unsigned char)t;
Actually it is:
_2 = x_3(D) >> 31;
_5 = ~_2;
_8 = (unsigned char) _5;
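
At the source level, the corrected gimple corresponds to something like the sketch below. The function names `clamp_shift_not` and `arms_agree` are hypothetical, and the code assumes a 32-bit `int` and an arithmetic right shift of negative values. The `~` matters: a plain `x >> 31` would map negative inputs to 255 and inputs above 255 to 0, which is the inverse of what the clamp needs.

```c
/* Sketch matching the corrected gimple: shift, invert, then a single cast.
 * Assumes 32-bit int and arithmetic >> on negative values. */
unsigned char clamp_shift_not(int x)
{
    int t = x;
    if ((unsigned)x > 255) {
        t = ~(x >> 31);        /* x < 0: ~(-1) = 0;  x > 255: ~0 = -1 */
    }
    return (unsigned char)t;   /* -1 truncates to 255 */
}

/* Checks the identity behind the rewrite for one input:
 * -(unsigned char)(x >= 0) == (unsigned char)~(x >> 31). */
int arms_agree(int x)
{
    unsigned char a = (unsigned char)-(unsigned char)(x >= 0);
    unsigned char b = (unsigned char)~(x >> 31);
    return a == b;
}
```

With the cast applied once at the return, the select operates on full-width values, which is the shape that a `csinv`-style instruction can match.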