public inbox for gcc-bugs@sourceware.org help / color / mirror / Atom feed
* [Bug target/104357] New: [Aarch64] Failure to use csinv instead of mvn+csel where possible @ 2022-02-02 23:40 gabravier at gmail dot com 2022-02-02 23:51 ` [Bug tree-optimization/104357] " pinskia at gcc dot gnu.org ` (3 more replies) 0 siblings, 4 replies; 5+ messages in thread From: gabravier at gmail dot com @ 2022-02-02 23:40 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104357 Bug ID: 104357 Summary: [Aarch64] Failure to use csinv instead of mvn+csel where possible Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: gabravier at gmail dot com Target Milestone: --- unsigned char stbi__clamp(int x) { if ((unsigned)x > 255) { if (x < 0) return 0; if (x > 255) return 255; } return x; } With -O3, GCC outputs this (on aarch64): stbi__clamp(int): mvn w1, w0 cmp w0, 256 and w0, w0, 255 asr w1, w1, 31 and w1, w1, 255 csel w0, w0, w1, cc ret LLVM instead outputs this: stbi__clamp(int): asr w8, w0, #31 cmp w0, #255 csinv w0, w0, w8, ls ret I don't know if the `and`s are there because of ABI differences, but it seems to me like the `mvn` can definitely be replaced by using `csinv` instead of `csel`. ^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tree-optimization/104357] [Aarch64] Failure to use csinv instead of mvn+csel where possible 2022-02-02 23:40 [Bug target/104357] New: [Aarch64] Failure to use csinv instead of mvn+csel where possible gabravier at gmail dot com @ 2022-02-02 23:51 ` pinskia at gcc dot gnu.org 2022-02-03 0:02 ` pinskia at gcc dot gnu.org ` (2 subsequent siblings) 3 siblings, 0 replies; 5+ messages in thread From: pinskia at gcc dot gnu.org @ 2022-02-02 23:51 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104357 Andrew Pinski <pinskia at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|UNCONFIRMED |NEW Last reconfirmed| |2022-02-02 Component|target |tree-optimization Ever confirmed|0 |1 --- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> --- This will get GCC closer to what clang/LLVM produces: unsigned char stbi__clamp(int x) { int t = x; if ((unsigned)x > 255) { if (x < 0) t = 0; else if (x > 255) t = -1; } return t; } ---- CUT ---- The zero-extends are due to the cast not being outside of the csel and the RTL level is not really good at cross bb optimizations. The gimple level looks like: <bb 2> [local count: 1073741824]: x.0_1 = (unsigned int) x_3(D); if (x.0_1 > 255) goto <bb 3>; [50.00%] else goto <bb 4>; [50.00%] <bb 3> [local count: 536870913]: _7 = x_3(D) >= 0; _6 = (unsigned char) _7; _8 = -_6; goto <bb 5>; [100.00%] <bb 4> [local count: 536870913]: _4 = (unsigned char) x_3(D); <bb 5> [local count: 1073741824]: # _2 = PHI <_8(3), _4(4)> return _2; Which in theory could be improved to the what I gave above. The gimple level has no knowledge of the rtl/target level that to do - in unsigned, you need to a zero extend still. ^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tree-optimization/104357] [Aarch64] Failure to use csinv instead of mvn+csel where possible 2022-02-02 23:40 [Bug target/104357] New: [Aarch64] Failure to use csinv instead of mvn+csel where possible gabravier at gmail dot com 2022-02-02 23:51 ` [Bug tree-optimization/104357] " pinskia at gcc dot gnu.org @ 2022-02-03 0:02 ` pinskia at gcc dot gnu.org 2022-02-03 0:05 ` [Bug tree-optimization/104357] " pinskia at gcc dot gnu.org 2023-05-04 5:37 ` pinskia at gcc dot gnu.org 3 siblings, 0 replies; 5+ messages in thread From: pinskia at gcc dot gnu.org @ 2022-02-03 0:02 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104357 Andrew Pinski <pinskia at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Severity|normal |enhancement Assignee|unassigned at gcc dot gnu.org |pinskia at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> --- One thing I should note: _7 = x_3(D) >= 0; _6 = (unsigned char) _7; _8 = -_6; Should be done on the gimple level as: t = x_3(D) >> (sizeof(x_3(D))*8 - 1) _8 = (unsigned char)t; And then we can factor out the cast and I think it will produce the same code. And yes it does, that is: unsigned char stbi__clamp(int x) { int t = x; if ((unsigned)x > 255) { t = x >> 31; } return t; } So Mine for GCC 13. ^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tree-optimization/104357] Failure to use csinv instead of mvn+csel where possible 2022-02-02 23:40 [Bug target/104357] New: [Aarch64] Failure to use csinv instead of mvn+csel where possible gabravier at gmail dot com 2022-02-02 23:51 ` [Bug tree-optimization/104357] " pinskia at gcc dot gnu.org 2022-02-03 0:02 ` pinskia at gcc dot gnu.org @ 2022-02-03 0:05 ` pinskia at gcc dot gnu.org 2023-05-04 5:37 ` pinskia at gcc dot gnu.org 3 siblings, 0 replies; 5+ messages in thread From: pinskia at gcc dot gnu.org @ 2022-02-03 0:05 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104357 Andrew Pinski <pinskia at gcc dot gnu.org> changed: What |Removed |Added ---------------------------------------------------------------------------- Summary|[Aarch64] Failure to use |Failure to use csinv |csinv instead of mvn+csel |instead of mvn+csel where |where possible |possible --- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> --- Oh by the way it looks like LLVM does not do a good job on x86 either. But with my idea, GCC will do better than LLVM on x86 even. ^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug tree-optimization/104357] Failure to use csinv instead of mvn+csel where possible 2022-02-02 23:40 [Bug target/104357] New: [Aarch64] Failure to use csinv instead of mvn+csel where possible gabravier at gmail dot com ` (2 preceding siblings ...) 2022-02-03 0:05 ` [Bug tree-optimization/104357] " pinskia at gcc dot gnu.org @ 2023-05-04 5:37 ` pinskia at gcc dot gnu.org 3 siblings, 0 replies; 5+ messages in thread From: pinskia at gcc dot gnu.org @ 2023-05-04 5:37 UTC (permalink / raw) To: gcc-bugs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104357 --- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> --- (In reply to Andrew Pinski from comment #2) > One thing I should note: > _7 = x_3(D) >= 0; > _6 = (unsigned char) _7; > _8 = -_6; > > Should be done on the gimple level as: > t = x_3(D) >> (sizeof(x_3(D))*8 - 1) > _8 = (unsigned char)t; Actually it is: _2 = x_3(D) >> 31; _5 = ~_2; _8 = (unsigned char) _5; ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2023-05-04 5:37 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-02-02 23:40 [Bug target/104357] New: [Aarch64] Failure to use csinv instead of mvn+csel where possible gabravier at gmail dot com 2022-02-02 23:51 ` [Bug tree-optimization/104357] " pinskia at gcc dot gnu.org 2022-02-03 0:02 ` pinskia at gcc dot gnu.org 2022-02-03 0:05 ` [Bug tree-optimization/104357] " pinskia at gcc dot gnu.org 2023-05-04 5:37 ` pinskia at gcc dot gnu.org
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).