public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/101434] New: vector-by-vector left shift expansion for char/short is not optimal
From: ubizjak at gmail dot com @ 2021-07-13 10:55 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434

            Bug ID: 101434
           Summary: vector-by-vector left shift expansion for char/short
                    is not optimal
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

Following testcase:

--cut here--
short r[8], a[8], b[8];

void
f1 (void)
{
  int i;

  for (i = 0; i < 8; i++)
    r[i] = a[i] << b[i];
}
--cut here--

compiles with -O2 -ftree-vectorize -mxop to:

        vmovdqa a(%rip), %xmm0
        vmovdqa b(%rip), %xmm1
        vpmovsxwd       %xmm0, %xmm2
        vpsrldq $8, %xmm0, %xmm0
        vpmovsxwd       %xmm1, %xmm3
        vpsrldq $8, %xmm1, %xmm1
        vpshad  %xmm3, %xmm2, %xmm2
        vpmovsxwd       %xmm0, %xmm0
        vpmovsxwd       %xmm1, %xmm1
        vpshad  %xmm1, %xmm0, %xmm0
        vpperm  .LC0(%rip), %xmm0, %xmm2, %xmm2
        vmovdqa %xmm2, r(%rip)
        ret

SImode vpshad is used together with lots of other instructions, but a HImode
vpshaw should be emitted instead.

Similar testcase:

--cut here--
short r[8], a[8], b[8];

void
f2 (void)
{
  int i;

  for (i = 0; i < 8; i++)
    r[i] = a[i] >> b[i];
}
--cut here--

results in expected HImode vect-by-vect shift insn:

        vpxor   %xmm0, %xmm0, %xmm0
        vpsubw  b(%rip), %xmm0, %xmm0
        vpshaw  %xmm0, a(%rip), %xmm0
        vmovdqa %xmm0, r(%rip)
        ret

(do not bother with vpxor and vpsubw, these are just one of XOP peculiarities.)
* [Bug tree-optimization/101434] vector-by-vector left shift expansion for char/short is not optimal
From: rguenth at gcc dot gnu.org @ 2021-07-13 12:15 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-07-13
             Target|                            |x86_64-*-* i?86-*-*
             Blocks|                            |53947
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
           Keywords|                            |missed-optimization

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Probably low priority if not doable nicely w/o XOP.

Note this is mainly due to integer promotion rules (we see shifts of int by
int) and fear of introducing undefined behavior (the int by int shift has
larger valid ranges for the RHS than a truncated one).  There must be a
duplicate bugreport.

IMHO we might consider to make shifts of smaller than int types with out of
bound shift amounts well-defined.  I think there's no way to rewrite types to
avoid the undefined behavior like we can do with signed arithmetic -> unsigned
arithmetic (besides division by -1 where the sign matters).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
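To make the promotion point in comment #1 concrete, here is a small C sketch. It is not GCC code, and `promoted_shl` and `count_in_range` are hypothetical helper names. Assuming a 32-bit int, it mimics what `r[i] = a[i] << b[i]` means after integer promotion: the shift happens at int width, where counts 0..31 are in range, and only then is the result truncated to 16 bits. The shift is carried out on uint32_t so the sketch itself stays clear of signed-shift pitfalls.

```c
#include <assert.h>
#include <stdint.h>

/* Models the promoted shift: widen the 16-bit value to 32 bits (sign
   extending), shift at 32-bit width, keep the low 16 bits.  */
static uint16_t promoted_shl(int16_t a, int16_t b)
{
    assert(b >= 0 && b <= 31);              /* valid range for a 32-bit shift */
    int32_t wide = a;                       /* integer promotion */
    return (uint16_t)((uint32_t)wide << b); /* truncate back to 16 bits */
}

/* Returns 1 when `count` is a valid shift amount for an operand that is
   `bits` wide, 0 otherwise.  */
static int count_in_range(int count, int bits)
{
    return count >= 0 && count < bits;
}
```

A count such as 20 passes `count_in_range(20, 32)` but fails `count_in_range(20, 16)`: it is fine for the promoted shift the compiler sees, yet out of range for a 16-bit element, which is why the narrower instruction cannot be substituted blindly.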
* [Bug tree-optimization/101434] vector-by-vector left shift expansion for char/short is not optimal
From: rguenth at gcc dot gnu.org @ 2021-07-13 12:20 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
So technically

  (int)short-var << a

->

  short-var << (min (a, 15))

we know a is <= 31 because of the int shift (and >= 0) but we cannot simply
emit short-var << a because how the target behaves is not well-defined
(SHIFT_COUNT_TRUNCATED) but the behavior is well-defined for the int << int
shift.

Pattern recog has code to deal with this in theory but it gives up here and
does not bother to emit a min ().
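The range argument in comment #2 can be checked exhaustively for 16-bit inputs. The following is only an illustration in plain C, assuming a 32-bit int; it is not GCC's pattern-recognition code, and `wide_shl`, `narrow_shl` and `forms_agree` are hypothetical names. The narrowed form guards counts above 15 by returning zero, because that is what the low 16 bits of the int shift are for counts 16..31. A plain clamp to 15 would instead leave bit 0 of the input in the top bit for those counts, so which guard a real lowering needs for a given target instruction is left open here.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the int-by-int shift, truncated to 16 bits.  */
static uint16_t wide_shl(int16_t x, unsigned a)
{
    assert(a <= 31);                        /* guaranteed by the int shift */
    return (uint16_t)((uint32_t)(int32_t)x << a);
}

/* Hypothetical narrowed form: a 16-bit shift with an explicit guard so
   the count handed to the shift never exceeds 15.  */
static uint16_t narrow_shl(int16_t x, unsigned a)
{
    assert(a <= 31);
    if (a > 15)
        return 0;                           /* all 16 low bits shifted out */
    return (uint16_t)((uint32_t)(uint16_t)x << a);
}

/* Returns 1 if both forms agree for every 16-bit x and every count in
   0..31, 0 on the first mismatch.  */
static int forms_agree(void)
{
    for (int32_t x = INT16_MIN; x <= INT16_MAX; x++)
        for (unsigned a = 0; a <= 31; a++)
            if (wide_shl((int16_t)x, a) != narrow_shl((int16_t)x, a))
                return 0;
    return 1;
}
```

Calling `forms_agree()` walks all 65536 inputs and all 32 counts, so it serves as a quick sanity check when experimenting with other guards in `narrow_shl`.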
* [Bug tree-optimization/101434] vector-by-vector left shift expansion for char/short is not optimal
From: ubizjak at gmail dot com @ 2021-07-13 12:23 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434

--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #1)
> Probably low priority if not doable nicely w/o XOP.

-mxop can be substituted with -mavx512bw -mavx512vl for the same effect.
* [Bug tree-optimization/101434] vector-by-vector left shift expansion for char/short is not optimal
From: pinskia at gcc dot gnu.org @ 2021-08-25 3:27 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #3)
> (In reply to Richard Biener from comment #1)
> > Probably low priority if not doable nicely w/o XOP.
>
> -mxop can be substituted with -mavx512bw -mavx512vl for the same effect.

or -mavx2.