public inbox for gcc-bugs@sourceware.org
* [Bug target/95905] New: Failure to optimize _mm_unpacklo_epi8 with 0 as right operand to _mm_cvtepu8_epi16
From: gabravier at gmail dot com @ 2020-06-26 1:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95905
Bug ID: 95905
Summary: Failure to optimize _mm_unpacklo_epi8 with 0 as right
operand to _mm_cvtepu8_epi16
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
#include <immintrin.h>

__m128i f(__m128i a)
{
    return _mm_unpacklo_epi8(a, _mm_setzero_si128());
}
This can be optimized to `return _mm_cvtepu8_epi16(a);` (with `-msse4`). LLVM
does this transformation, but GCC does not.
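
To illustrate why the two forms are interchangeable, here is a minimal scalar
sketch in portable C. It does not use the real intrinsics; the helper names
(model_unpacklo_epi8, model_cvtepu8_epi16, models_agree) are hypothetical and
only model the documented lane behaviour, assuming the little-endian lane
layout of x86 vectors.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of _mm_unpacklo_epi8(a, b): interleave the low 8 bytes
   of a and b, starting with a. */
static void model_unpacklo_epi8(const uint8_t a[16], const uint8_t b[16],
                                uint8_t out[16])
{
    for (int i = 0; i < 8; i++) {
        out[2 * i]     = a[i];
        out[2 * i + 1] = b[i];
    }
}

/* Scalar model of _mm_cvtepu8_epi16(a): zero-extend the low 8 bytes
   of a into eight 16-bit lanes. */
static void model_cvtepu8_epi16(const uint8_t a[16], uint16_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = a[i];
}

/* Returns 1 when both models produce the same 128 bits for input a. */
static int models_agree(const uint8_t a[16])
{
    static const uint8_t zero[16] = {0};
    uint8_t interleaved[16];
    uint16_t widened[8];
    uint8_t widened_bytes[16];

    model_unpacklo_epi8(a, zero, interleaved);
    model_cvtepu8_epi16(a, widened);

    /* Serialize each 16-bit lane low byte first, as x86 stores it,
       so the comparison does not depend on the host's endianness. */
    for (int i = 0; i < 8; i++) {
        widened_bytes[2 * i]     = (uint8_t)(widened[i] & 0xff);
        widened_bytes[2 * i + 1] = (uint8_t)(widened[i] >> 8);
    }
    return memcmp(interleaved, widened_bytes, 16) == 0;
}
```

Each zero byte taken from the second operand lands in the high half of a
16-bit lane, which is exactly what zero extension of the low byte produces.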
* [Bug target/95905] Failure to optimize _mm_unpacklo_epi8 with 0 as right operand to _mm_cvtepu8_epi16
From: gabravier at gmail dot com @ 2020-06-26 1:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95905
--- Comment #1 from Gabriel Ravier <gabravier at gmail dot com> ---
The same pattern with _mm_unpacklo_epi16/32 and the corresponding SSE4
intrinsics can also be optimized in the same way.
* [Bug target/95905] Failure to optimize _mm_unpacklo_epi8 with 0 as right operand to _mm_cvtepu8_epi16
From: cvs-commit at gcc dot gnu.org @ 2021-01-13 7:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95905
--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:b668a06e37f72fd96bacd6769990ec97dac4ac6d
commit r11-6628-gb668a06e37f72fd96bacd6769990ec97dac4ac6d
Author: Jakub Jelinek <jakub@redhat.com>
Date: Wed Jan 13 08:02:54 2021 +0100
i386: Optimize _mm_unpacklo_epi8 of 0 vector as second argument or similar
VEC_PERM_EXPRs into pmovzx [PR95905]
The following patch adds patterns (so far 128-bit only) for permutations
like { 0 16 1 17 2 18 3 19 4 20 5 21 6 22 7 23 } where the second
operand is CONST0_RTX CONST_VECTOR to be emitted as pmovzx.
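
As a rough illustration of the permutation described above, the following
scalar sketch models a two-operand byte permutation and checks whether a
selector has the interleave-low shape { 0 n 1 n+1 ... }. It is not GCC's
pmovzx_parallel predicate; model_vec_perm and is_interleave_low_selector are
hypothetical helpers, and the modulo wrapping of indices is an assumption of
this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two-operand permutation over n byte elements: selector values below n
   pick from v0, values from n to 2n-1 pick from v1. Out-of-range values
   are wrapped modulo 2n in this sketch. */
static void model_vec_perm(const uint8_t *v0, const uint8_t *v1,
                           const uint8_t *sel, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        size_t idx = sel[i] % (2 * n);
        out[i] = idx < n ? v0[idx] : v1[idx - n];
    }
}

/* Returns 1 if sel is { 0, n, 1, n+1, 2, n+2, ... }, i.e. it interleaves
   the low halves of the two operands. */
static int is_interleave_low_selector(const uint8_t *sel, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        size_t expected = (i % 2 == 0) ? i / 2 : n + i / 2;
        if (sel[i] != expected)
            return 0;
    }
    return 1;
}
```

With n = 16 the expected selector is { 0 16 1 17 2 18 3 19 4 20 5 21 6 22 7 23 }.
When the second operand is all zeros, every odd output byte is zero, so the
result equals the low 8 bytes of the first operand zero-extended to 16 bits.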
2021-01-13 Jakub Jelinek <jakub@redhat.com>
PR target/95905
* config/i386/predicates.md (pmovzx_parallel): New predicate.
* config/i386/sse.md (*sse4_1_zero_extendv8qiv8hi2_3): New
define_insn_and_split pattern.
(*sse4_1_zero_extendv4hiv4si2_3): Likewise.
(*sse4_1_zero_extendv2siv2di2_3): Likewise.
* gcc.target/i386/pr95905-1.c: New test.
* gcc.target/i386/pr95905-2.c: New test.
* [Bug target/95905] Failure to optimize _mm_unpacklo_epi8 with 0 as right operand to _mm_cvtepu8_epi16
From: cvs-commit at gcc dot gnu.org @ 2021-01-13 10:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95905
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:b1d1e2b54c6b9cf13f021176ba37d24cc4dc2fe1
commit r11-6636-gb1d1e2b54c6b9cf13f021176ba37d24cc4dc2fe1
Author: Jakub Jelinek <jakub@redhat.com>
Date: Wed Jan 13 11:28:48 2021 +0100
i386, expand: Optimize also 256-bit and 512-bit permutations as vpmovzx
if possible [PR95905]
The following patch implements what I've talked about, i.e. to no longer
force operands of vec_perm_const into registers in the generic code, but let
each of the (currently 8) targets force them into registers individually,
giving the targets better control over whether and when that happens and
allowing them to do something special with some particular operands.
It then defines define_insn_and_split patterns for the 256-bit and 512-bit
permutations into vpmovzx* (only the bw, wd and dq cases; in theory we could
add define_insn_and_split patterns also for the bd, bq and wq).
2021-01-13 Jakub Jelinek <jakub@redhat.com>
PR target/95905
* optabs.c (expand_vec_perm_const): Don't force v0 and v1 into
registers before calling targetm.vectorize.vec_perm_const, only
after that.
* config/i386/i386-expand.c (ix86_vectorize_vec_perm_const): Handle
two argument permutation when one operand is zero vector and only
after that force operands into registers.
* config/i386/sse.md (*avx2_zero_extendv16qiv16hi2_1): New
define_insn_and_split pattern.
(*avx512bw_zero_extendv32qiv32hi2_1): Likewise.
(*avx512f_zero_extendv16hiv16si2_1): Likewise.
(*avx2_zero_extendv8hiv8si2_1): Likewise.
(*avx512f_zero_extendv8siv8di2_1): Likewise.
(*avx2_zero_extendv4siv4di2_1): Likewise.
* config/mips/mips.c (mips_vectorize_vec_perm_const): Force
operands into registers.
* config/arm/arm.c (arm_vectorize_vec_perm_const): Likewise.
* config/sparc/sparc.c (sparc_vectorize_vec_perm_const): Likewise.
* config/ia64/ia64.c (ia64_vectorize_vec_perm_const): Likewise.
* config/aarch64/aarch64.c (aarch64_vectorize_vec_perm_const):
Likewise.
* config/rs6000/rs6000.c (rs6000_vectorize_vec_perm_const):
Likewise.
* config/gcn/gcn.c (gcn_vectorize_vec_perm_const): Likewise. Use
std::swap.
* gcc.target/i386/pr95905-2.c: Use scan-assembler-times instead of
scan-assembler. Add tests with zero vector as first
__builtin_shuffle operand.
* gcc.target/i386/pr95905-3.c: New test.
* gcc.target/i386/pr95905-4.c: New test.
* [Bug target/95905] Failure to optimize _mm_unpacklo_epi8 with 0 as right operand to _mm_cvtepu8_epi16
From: jakub at gcc dot gnu.org @ 2021-01-13 10:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95905
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What       |Removed     |Added
----------------------------------------------------------------------------
Resolution |---         |FIXED
CC         |            |jakub at gcc dot gnu.org
Status     |UNCONFIRMED |RESOLVED
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.
* [Bug target/95905] Failure to optimize _mm_unpacklo_epi8 with 0 as right operand to _mm_cvtepu8_epi16
From: pinskia at gcc dot gnu.org @ 2021-08-21 18:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95905
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What             |Removed |Added
----------------------------------------------------------------------------
Target Milestone |---     |11.0
* [Bug target/95905] Failure to optimize _mm_unpacklo_epi8 with 0 as right operand to _mm_cvtepu8_epi16
From: pinskia at gcc dot gnu.org @ 2021-08-21 18:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95905
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |linux at carewolf dot com
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 78563 has been marked as a duplicate of this bug. ***