public inbox for gcc-bugs@sourceware.org
* [Bug target/95762] New: Failure to optimize __builtin_convertvector from vector of 16 chars to vector of 16 shorts in a single instruction on AVX2
From: gabravier at gmail dot com @ 2020-06-19 9:35 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95762
Bug ID: 95762
Summary: Failure to optimize __builtin_convertvector from
vector of 16 chars to vector of 16 shorts in a single
instruction on AVX2
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
#include <cstdint>

typedef int8_t v16i8 __attribute__((vector_size(16)));
typedef int16_t v16i16 __attribute__((vector_size(32)));

// Sign-extend each of the 16 int8_t lanes to int16_t.
auto f(v16i8 a)
{
    return __builtin_convertvector(a, v16i16);
}
With -O3 -mavx2, LLVM outputs this:

f(signed char __vector(16)):
        vpmovsxbw ymm0, xmm0
        ret

GCC outputs this:

f(signed char __vector(16)):
        vpmovsxbw xmm1, xmm0
        vpsrldq xmm0, xmm0, 8
        vpmovsxbw xmm0, xmm0
        vinserti128 ymm0, ymm1, xmm0, 0x1
        ret
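
For readers who want to check what the conversion is expected to compute, here is a minimal sketch (not part of the original report) that compares __builtin_convertvector against a plain scalar loop. The function names widen_vector, widen_scalar and widen_matches_scalar are made up for illustration, and the vectors are moved through memcpy only so that the example builds without -mavx2 and without ABI warnings. It says nothing about which instructions a given compiler emits.

```cpp
#include <cstdint>
#include <cstring>

typedef int8_t  v16i8  __attribute__((vector_size(16)));
typedef int16_t v16i16 __attribute__((vector_size(32)));

// Widen 16 signed bytes with the builtin from the report.
// (Hypothetical helper; arrays are used at the interface to avoid
// passing 32-byte vectors by value on targets without AVX.)
void widen_vector(const int8_t (&in)[16], int16_t (&out)[16])
{
    v16i8 a;
    std::memcpy(&a, in, sizeof a);
    v16i16 r = __builtin_convertvector(a, v16i16);
    std::memcpy(out, &r, sizeof r);
}

// Scalar reference: each lane is converted like an ordinary C++ cast,
// i.e. sign-extended from 8 to 16 bits.
void widen_scalar(const int8_t (&in)[16], int16_t (&out)[16])
{
    for (int i = 0; i < 16; ++i)
        out[i] = static_cast<int16_t>(in[i]);
}

// Returns true if both versions agree on inputs spanning -128..127.
bool widen_matches_scalar()
{
    int8_t in[16];
    for (int i = 0; i < 16; ++i)
        in[i] = static_cast<int8_t>(i * 17 - 128);

    int16_t got[16], want[16];
    widen_vector(in, got);
    widen_scalar(in, want);
    return std::memcmp(got, want, sizeof got) == 0;
}
```

Compiling widen_vector with -O3 -mavx2 and inspecting the assembly is one way to see whether a particular compiler version uses the single ymm form of vpmovsxbw.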
* [Bug target/95762] Failure to optimize __builtin_convertvector from vector of 16 chars to vector of 16 shorts in a single instruction on AVX2
From: rguenth at gcc dot gnu.org @ 2020-06-19 9:45 UTC
Richard Biener <rguenth at gcc dot gnu.org> changed:
              What|Removed                     |Added
----------------------------------------------------------------------------
  Last reconfirmed|                            |2020-06-19
            Status|UNCONFIRMED                 |NEW
                CC|                            |jakub at gcc dot gnu.org,
                  |                            |rguenth at gcc dot gnu.org
    Ever confirmed|0                           |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
We're currently representing this as a .VEC_CONVERT IFN lowered at veclower
time to

  _4 = [vec_unpack_lo_expr] a_1(D);
  _5 = [vec_unpack_hi_expr] a_1(D);
  _2 = {_4, _5};

rather than using a NOP_EXPR as would be possible now. I suppose we should
remove .VEC_CONVERT again for vector integer conversions and directly
use NOP_EXPRs plus make sure to lower those when not supported. Not
sure if __builtin_convertvector also supports integer<->float conversions.
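
As a rough source-level picture of the lowered form above (a sketch, not GCC's actual implementation), the 16-lane conversion can be written as two 8-lane conversions whose results are placed side by side. The types v8i8 and v8i16 and the function names are assumptions made for this example. Which input half vec_unpack_lo_expr and vec_unpack_hi_expr select can depend on the target's endianness, so the lo/hi naming here is only illustrative.

```cpp
#include <cstdint>
#include <cstring>

typedef int8_t  v8i8  __attribute__((vector_size(8)));
typedef int16_t v8i16 __attribute__((vector_size(16)));

// Convert 16 bytes as two independent 8-lane halves, then store the
// widened halves back to back. This mirrors the shape of
//   _4 = unpack_lo(a); _5 = unpack_hi(a); _2 = {_4, _5};
// (Hypothetical helper, written with arrays at the interface.)
void widen_by_halves(const int8_t (&in)[16], int16_t (&out)[16])
{
    v8i8 lo, hi;
    std::memcpy(&lo, in, sizeof lo);        // lanes 0..7
    std::memcpy(&hi, in + 8, sizeof hi);    // lanes 8..15

    v8i16 wide_lo = __builtin_convertvector(lo, v8i16);
    v8i16 wide_hi = __builtin_convertvector(hi, v8i16);

    std::memcpy(out, &wide_lo, sizeof wide_lo);      // result lanes 0..7
    std::memcpy(out + 8, &wide_hi, sizeof wide_hi);  // result lanes 8..15
}

// Returns true if the two-half version matches a scalar sign extension.
bool halves_match_scalar()
{
    int8_t in[16];
    for (int i = 0; i < 16; ++i)
        in[i] = static_cast<int8_t>(i * 17 - 128);

    int16_t got[16];
    widen_by_halves(in, got);
    for (int i = 0; i < 16; ++i)
        if (got[i] != static_cast<int16_t>(in[i]))
            return false;
    return true;
}
```

The result is the same as a single widening conversion; the difference the report is about is only in how many instructions are needed to get there.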
* [Bug target/95762] Failure to optimize __builtin_convertvector from vector of 16 chars to vector of 16 shorts in a single instruction on AVX2
From: jakub at gcc dot gnu.org @ 2020-06-19 10:09 UTC
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)
> We're currently representing this as a .VEC_CONVERT IFN lowered at veclower
> time to
>
> _4 = [vec_unpack_lo_expr] a_1(D);
> _5 = [vec_unpack_hi_expr] a_1(D);
> _2 = {_4, _5};
>
> rather than using a NOP_EXPR as would be possible now. I suppose we should
> remove .VEC_CONVERT again for vector integer conversions and directly
> use NOP_EXPRs plus make sure to lower those when not supported. Not
> sure if __builtin_convertvector also supports integer<->float conversions.
__builtin_convertvector does support integer<->float conversions too.
I'd say we should just fold .VEC_CONVERT to something more appropriate if the
conditions are right (e.g. if an optab says it is possible to do it in a
different way that will also survive veclower) and otherwise keep it as is.
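
To illustrate the integer<->float case mentioned above, here is a small sketch with assumed type names v4si and v4sf and made-up helper names. Each lane is converted as an ordinary C++ cast would convert it, so float to integer truncates toward zero.

```cpp
#include <cstdint>
#include <cstring>

typedef int32_t v4si __attribute__((vector_size(16)));
typedef float   v4sf __attribute__((vector_size(16)));

// Integer lanes to float lanes. (Hypothetical helper.)
void ints_to_floats(const int32_t (&in)[4], float (&out)[4])
{
    v4si a;
    std::memcpy(&a, in, sizeof a);
    v4sf r = __builtin_convertvector(a, v4sf);
    std::memcpy(out, &r, sizeof r);
}

// Float lanes to integer lanes; fractional parts are discarded,
// matching static_cast<int32_t>(float) per lane. (Hypothetical helper.)
void floats_to_ints(const float (&in)[4], int32_t (&out)[4])
{
    v4sf a;
    std::memcpy(&a, in, sizeof a);
    v4si r = __builtin_convertvector(a, v4si);
    std::memcpy(out, &r, sizeof r);
}
```

Both directions go through the same builtin, so any folding of .VEC_CONVERT has to cope with conversions that are not plain integer extensions.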
* [Bug target/95762] Failure to optimize __builtin_convertvector from vector of 16 chars to vector of 16 shorts in a single instruction on AVX2
From: jakub at gcc dot gnu.org @ 2020-06-19 10:22 UTC
--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
And I should note, because of offloading, it would be better to do that kind of
folding only after_inlining.
* [Bug target/95762] Failure to optimize __builtin_convertvector from vector of 16 chars to vector of 16 shorts in a single instruction on AVX2
From: rguenther at suse dot de @ 2020-06-19 10:43 UTC
--- Comment #4 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 19 Jun 2020, jakub at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95762
>
> --- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> And I should note, because of offloading, it would be better to do that kind of
> folding only after_inlining.
Hmm, does that also hold for all the vector permute/ctor "optimizations"
forwprop does?
* [Bug target/95762] Failure to optimize __builtin_convertvector from vector of 16 chars to vector of 16 shorts in a single instruction on AVX2
From: jakub at gcc dot gnu.org @ 2020-06-19 11:05 UTC
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I'd say anything that depends on optabs if possible.