public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/101434] New: vector-by-vector left shift expansion for char/short is not optimal
@ 2021-07-13 10:55 ubizjak at gmail dot com
2021-07-13 12:15 ` [Bug tree-optimization/101434] " rguenth at gcc dot gnu.org
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: ubizjak at gmail dot com @ 2021-07-13 10:55 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434
Bug ID: 101434
Summary: vector-by-vector left shift expansion for char/short
is not optimal
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
The following testcase:
--cut here--
short r[8], a[8], b[8];
void f1 (void)
{
int i;
for (i = 0; i < 8; i++)
r[i] = a[i] << b[i];
}
--cut here--
compiles with -O2 -ftree-vectorize -mxop to:
vmovdqa a(%rip), %xmm0
vmovdqa b(%rip), %xmm1
vpmovsxwd %xmm0, %xmm2
vpsrldq $8, %xmm0, %xmm0
vpmovsxwd %xmm1, %xmm3
vpsrldq $8, %xmm1, %xmm1
vpshad %xmm3, %xmm2, %xmm2
vpmovsxwd %xmm0, %xmm0
vpmovsxwd %xmm1, %xmm1
vpshad %xmm1, %xmm0, %xmm0
vpperm .LC0(%rip), %xmm0, %xmm2, %xmm2
vmovdqa %xmm2, r(%rip)
ret
SImode vpshad is used together with lots of other instructions, but a HImode
vpshaw should be emitted instead.
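For comparison, the same element-wise shift can be written so that it stays in 16-bit lanes at the source level. This is only a sketch using the GCC/Clang vector extension; the type name v8hi and the function name f1_vec are made up for illustration, and nothing is claimed here about which instructions a given GCC version selects for it. Shift counts are assumed to be in 0..15.

```c
#include <string.h>

/* GCC/Clang vector extension: eight 16-bit lanes in a 128-bit vector.
   The name v8hi is illustrative, not taken from the report. */
typedef short v8hi __attribute__((vector_size(16)));

short r[8], a[8], b[8];

/* Hypothetical variant of f1: the shift is applied lane by lane on
   16-bit elements, so no promotion to int is involved.
   Assumes every count in b[] is in the range 0..15. */
void f1_vec(void)
{
    v8hi va, vb, vr;

    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);
    vr = va << vb;              /* element-wise vector-by-vector shift */
    memcpy(r, &vr, sizeof vr);
}
```

The memcpy calls only move the arrays in and out of vector objects without relying on pointer casts.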
Similar testcase:
--cut here--
short r[8], a[8], b[8];
void f2 (void)
{
int i;
for (i = 0; i < 8; i++)
r[i] = a[i] >> b[i];
}
--cut here--
results in the expected HImode vector-by-vector shift insn:
vpxor %xmm0, %xmm0, %xmm0
vpsubw b(%rip), %xmm0, %xmm0
vpshaw %xmm0, a(%rip), %xmm0
vmovdqa %xmm0, r(%rip)
ret
(Ignore the vpxor and vpsubw; they only negate the shift count, because XOP
vpshaw shifts right when the count is negative. This is just one of the XOP
peculiarities.)
* [Bug tree-optimization/101434] vector-by-vector left shift expansion for char/short is not optimal
From: rguenth at gcc dot gnu.org @ 2021-07-13 12:15 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-07-13
             Target|                            |x86_64-*-* i?86-*-*
             Blocks|                            |53947
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
           Keywords|                            |missed-optimization
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Probably low priority if not doable nicely w/o XOP.
Note this is mainly due to integer promotion rules (we see shifts of int by
int) and fear of introducing undefined behavior (the int by int shift has a
larger valid range for the RHS than a truncated one).
There must be a duplicate bug report.
IMHO we might consider making shifts of smaller-than-int types with
out-of-bounds shift amounts well-defined. I think there's no way to
rewrite types to avoid the undefined behavior like we can do with
signed arithmetic -> unsigned arithmetic (besides division by -1 where
the sign matters).
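To make the promotion point concrete, here is a small sketch of what the loop body means once the implicit conversions are spelled out. The function names are made up for illustration. It assumes a 32-bit int and GCC's modulo behavior when an out-of-range int is converted to short (that conversion is implementation-defined in ISO C).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Size of the type of `a << b` when both operands are short.
   Both operands undergo integer promotion, so the shift has type int.
   sizeof does not evaluate its operand here. */
size_t promoted_shift_size(void)
{
    short a = 1, b = 1;
    return sizeof(a << b);
}

/* The loop body r[i] = a[i] << b[i] with the conversions made explicit.
   Any count valid for int is accepted, which is a wider range than a
   16-bit shift would allow. ISO C leaves << on a negative left operand
   undefined, so portable callers should pass a >= 0. */
short shift_as_written(short a, short b)
{
    assert(b >= 0 && b < (int)(sizeof(int) * CHAR_BIT));
    int wide = (int)a << (int)b;   /* the shift happens in int */
    return (short)wide;            /* truncated when stored to short */
}
```

For example, shift_as_written (1, 20) is a valid int shift whose truncated result is 0 under the assumptions above, while a count of 20 is out of range for a 16-bit shift.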
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
* [Bug tree-optimization/101434] vector-by-vector left shift expansion for char/short is not optimal
From: rguenth at gcc dot gnu.org @ 2021-07-13 12:20 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
So technically
(int)short-var << a
-> short-var << (min (a, 15))
We know a is <= 31 (and >= 0) because of the int shift, but we cannot simply
emit short-var << a: how the target behaves for counts above 15 is not
well-defined (SHIFT_COUNT_TRUNCATED), whereas the int << int shift is
well-defined for them. Pattern recog has code to deal with this in theory,
but it gives up here and does not bother to emit a min ().
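A scalar model of the narrowing may help here. This is an illustration only, not GCC's pattern recognizer code; all function names are hypothetical, and unsigned types are used for the left shift so that every step is defined in ISO C. For an arithmetic right shift, clamping the count with min (count, 15) reproduces the promoted result exactly. For a left shift, a count of 16..31 moves every bit out of the low 16 bits, so this sketch returns 0 for those counts rather than clamping.

```c
#include <assert.h>
#include <stdint.h>

/* What the source computes after promotion: shift in 32 bits, then keep
   the low 16 bits. The count must be a valid 32-bit shift count. */
uint16_t shl_promoted(uint16_t x, unsigned count)
{
    assert(count < 32);
    return (uint16_t)((uint32_t)x << count);
}

/* Model of a 16-bit lane shift that never sees a count above 15.
   Counts 16..31 leave nothing in the low 16 bits, so the result is 0. */
uint16_t shl_narrow(uint16_t x, unsigned count)
{
    assert(count < 32);
    if (count > 15)
        return 0;
    return (uint16_t)((uint32_t)x << count);
}

/* Arithmetic right shift after promotion. Right-shifting a negative
   value is implementation-defined in ISO C; GCC shifts arithmetically. */
int16_t sar_promoted(int16_t x, unsigned count)
{
    assert(count < 32);
    return (int16_t)((int32_t)x >> count);
}

/* Narrowed right shift: here min (count, 15) is sufficient. */
int16_t sar_narrow(int16_t x, unsigned count)
{
    assert(count < 32);
    unsigned clamped = count < 15 ? count : 15;
    return (int16_t)(x >> clamped);
}

/* Exhaustively compare both pairs over all 16-bit values and all counts
   0..31; returns the number of mismatches found. */
unsigned long count_mismatches(void)
{
    unsigned long bad = 0;
    for (unsigned count = 0; count < 32; count++)
        for (int32_t v = -32768; v <= 32767; v++) {
            bad += shl_promoted((uint16_t)v, count) != shl_narrow((uint16_t)v, count);
            bad += sar_promoted((int16_t)v, count) != sar_narrow((int16_t)v, count);
        }
    return bad;
}
```

Calling count_mismatches () checks every 16-bit input against every count that is valid for the int shift, which is the range the narrowed form has to cover.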
* [Bug tree-optimization/101434] vector-by-vector left shift expansion for char/short is not optimal
From: ubizjak at gmail dot com @ 2021-07-13 12:23 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434
--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #1)
> Probably low priority if not doable nicely w/o XOP.
-mxop can be substituted with -mavx512bw -mavx512vl for the same effect.
* [Bug tree-optimization/101434] vector-by-vector left shift expansion for char/short is not optimal
From: pinskia at gcc dot gnu.org @ 2021-08-25 3:27 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101434
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #3)
> (In reply to Richard Biener from comment #1)
> > Probably low priority if not doable nicely w/o XOP.
>
> -mxop can be substituted with -mavx512bw -mavx512vl for the same effect.
or -mavx2.