public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
@ 2020-03-26 15:24 kretz at kde dot org
2020-03-26 15:39 ` [Bug target/94343] " marxin at gcc dot gnu.org
` (18 more replies)
0 siblings, 19 replies; 20+ messages in thread
From: kretz at kde dot org @ 2020-03-26 15:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
Bug ID: 94343
Summary: [10 Regression] invalid AVX512VL vpternlogd
instruction emitted for -march=knl
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Keywords: missed-optimization, wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: kretz at kde dot org
Target Milestone: ---
Target: i386,x86-64
Test case (`-O1 -march=knl`, cf. https://godbolt.org/z/qQc3Sf):
using W [[gnu::vector_size(16)]] = long long;
using V [[gnu::vector_size(16)]] = int;
auto f(V a) {
return __builtin_ia32_pandn128(reinterpret_cast<W>(~V() ^ a), ~W());
}
This emits a XMM variant of `vpternlogd` which requires AVX512VL. But it was
supposed to compile for KNL.
Besides the bug, there's also a missed optimization here: `~V() ^ a` flips all
bits and pandn flips all bits again. Thus it should compile to a single `ret`
instruction. Note that the variation:
auto f(V a) {
return ~reinterpret_cast<W>(~V() ^ a) & ~W();
}
compiles to
vpternlogd xmm0, xmm0, xmm0, 0x55
vpternlogq xmm0, xmm0, xmm0, 0x55
for KNL.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
@ 2020-03-26 15:39 ` marxin at gcc dot gnu.org
2020-03-26 15:58 ` jakub at gcc dot gnu.org
` (17 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: marxin at gcc dot gnu.org @ 2020-03-26 15:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
Martin Liška <marxin at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jbeulich at suse dot com,
| |marxin at gcc dot gnu.org
Last reconfirmed| |2020-03-26
Ever confirmed|0 |1
Target Milestone|--- |10.0
Status|UNCONFIRMED |NEW
Priority|P3 |P1
Known to work| |9.3.0
Known to fail| |10.0
--- Comment #1 from Martin Liška <marxin at gcc dot gnu.org> ---
You are right, AVX512VL is not supported with knl.
It started with r10-2016-gff8f129bc2f57fdf:
vpternlogd $0x55, %xmm0, %xmm0, %xmm0
vpcmpeqd %xmm1, %xmm1, %xmm1
vpandn %xmm1, %xmm0, %xmm0
ret
before the revision we emitted:
vpcmpeqd %xmm1, %xmm1, %xmm1
vpxor %xmm1, %xmm0, %xmm0
vpcmpeqd %xmm1, %xmm1, %xmm1
vpandn %xmm1, %xmm0, %xmm0
ret
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
2020-03-26 15:39 ` [Bug target/94343] " marxin at gcc dot gnu.org
@ 2020-03-26 15:58 ` jakub at gcc dot gnu.org
2020-03-26 16:34 ` hjl.tools at gmail dot com
` (16 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-26 15:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
Status|NEW |ASSIGNED
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Looking.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
2020-03-26 15:39 ` [Bug target/94343] " marxin at gcc dot gnu.org
2020-03-26 15:58 ` jakub at gcc dot gnu.org
@ 2020-03-26 16:34 ` hjl.tools at gmail dot com
2020-03-26 16:41 ` jakub at gcc dot gnu.org
` (15 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: hjl.tools at gmail dot com @ 2020-03-26 16:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 48126
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48126&action=edit
A patch
Jakub, this is what I have. Feel free to ignore it.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (2 preceding siblings ...)
2020-03-26 16:34 ` hjl.tools at gmail dot com
@ 2020-03-26 16:41 ` jakub at gcc dot gnu.org
2020-03-26 17:08 ` jakub at gcc dot gnu.org
` (14 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-26 16:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I was thinking about
--- gcc/config/i386/sse.md.jj 2020-03-06 11:35:46.284074858 +0100
+++ gcc/config/i386/sse.md 2020-03-26 17:35:23.690515228 +0100
@@ -12800,10 +12800,18 @@
(xor:VI (match_operand:VI 1 "nonimmediate_operand" "vm")
(match_operand:VI 2 "vector_all_ones_operand" "BC")))]
"TARGET_AVX512F"
- "vpternlog<ternlogsuffix>\t{$0x55, %1, %0,
%0<mask_operand3>|%0<mask_operand3>, %0, %1, 0x55}"
+{
+ if (TARGET_AVX512VL)
+ return "vpternlog<ternlogsuffix>\t{$0x55, %1, %0,
%0<mask_operand3>|%0<mask_operand3>, %0, %1, 0x55}";
+ else
+ return "vpternlog<ternlogsuffix>\t{$0x55, %g1, %g0,
%g0<mask_operand3>|%g0<mask_operand3>, %g0, %g1, 0x55}";
+}
[(set_attr "type" "sselog")
(set_attr "prefix" "evex")
- (set_attr "mode" "<sseinsnmode>")])
+ (set (attr "mode")
+ (if_then_else (match_test "TARGET_AVX512VL")
+ (const_string "<sseinsnmode>")
+ (const_string "XI")))])
(define_expand "<sse2_avx2>_andnot<mode>3"
[(set (match_operand:VI_AVX2 0 "register_operand")
instead. I'm aware that from performance POV we are trying to avoid 512-bit
vectors, but don't all such affected CPUs support AVX512VL already? Does KNL
care if it will do a 512-bit operation instead of 128-bit or 256-bit?
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (3 preceding siblings ...)
2020-03-26 16:41 ` jakub at gcc dot gnu.org
@ 2020-03-26 17:08 ` jakub at gcc dot gnu.org
2020-03-26 17:14 ` hjl.tools at gmail dot com
` (13 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-26 17:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 48127
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48127&action=edit
gcc10-pr94343.patch
That of course doesn't work if the input operand is memory. This should.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (4 preceding siblings ...)
2020-03-26 17:08 ` jakub at gcc dot gnu.org
@ 2020-03-26 17:14 ` hjl.tools at gmail dot com
2020-03-26 17:52 ` jakub at gcc dot gnu.org
` (12 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: hjl.tools at gmail dot com @ 2020-03-26 17:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #6 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Jakub Jelinek from comment #5)
> Created attachment 48127 [details]
> gcc10-pr94343.patch
>
> That of course doesn't work if the input operand is memory. This should.
LGTM. Thanks.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (5 preceding siblings ...)
2020-03-26 17:14 ` hjl.tools at gmail dot com
@ 2020-03-26 17:52 ` jakub at gcc dot gnu.org
2020-03-26 17:59 ` jakub at gcc dot gnu.org
` (11 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-26 17:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Though, there are other issues. There is only vpternlog{d,q}, so for
V*[QH]Imode we shouldn't pretend we have masking support.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (6 preceding siblings ...)
2020-03-26 17:52 ` jakub at gcc dot gnu.org
@ 2020-03-26 17:59 ` jakub at gcc dot gnu.org
2020-03-26 19:30 ` kretz at kde dot org
` (10 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-26 17:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #48127|0 |1
is obsolete| |
--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 48128
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48128&action=edit
gcc10-pr94343.patch
Updated patch. A variant to that would be 4 separate patterns I guess, not
sure if that would be shorter.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (7 preceding siblings ...)
2020-03-26 17:59 ` jakub at gcc dot gnu.org
@ 2020-03-26 19:30 ` kretz at kde dot org
2020-03-26 19:44 ` jakub at gcc dot gnu.org
` (9 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: kretz at kde dot org @ 2020-03-26 19:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #9 from Matthias Kretz (Vir) <kretz at kde dot org> ---
(In reply to Jakub Jelinek from comment #8)
> Created attachment 48128 [details]
> gcc10-pr94343.patch
The avx512vl-pr94343.c test should ideally fail because `_mm_andnot_si128
((__m128i) (~v ^ a), (__m128i) ~w)` is equal to `a`. Maybe the test should
simply `return ~v ^ a;` and thus emit a single vpternlogd?
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (8 preceding siblings ...)
2020-03-26 19:30 ` kretz at kde dot org
@ 2020-03-26 19:44 ` jakub at gcc dot gnu.org
2020-03-27 7:00 ` jbeulich at suse dot com
` (8 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-26 19:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
--- gcc/testsuite/gcc.target/i386/avx512f-pr94343.c.jj 2020-03-26
17:47:40.008654504 +0100
+++ gcc/testsuite/gcc.target/i386/avx512f-pr94343.c 2020-03-26
17:48:37.169811375 +0100
@@ -0,0 +1,12 @@
+/* PR target/94343 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f -mno-avx512vl" } */
+/* { dg-final { scan-assembler-not "vpternlogd\[^\n\r]*xmm\[0-9]*" } } */
+
+typedef int __v4si __attribute__((vector_size (16)));
+
+__v4si
+foo (__v4si a)
+{
+ return ~a;
+}
--- gcc/testsuite/gcc.target/i386/avx512vl-pr94343.c.jj 2020-03-26
17:48:53.232573115 +0100
+++ gcc/testsuite/gcc.target/i386/avx512vl-pr94343.c 2020-03-26
17:49:08.034352968 +0100
@@ -0,0 +1,12 @@
+/* PR target/94343 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vl" } */
+/* { dg-final { scan-assembler "vpternlogd\[^\n\r]*xmm\[0-9]*" } } */
+
+typedef int __v4si __attribute__((vector_size (16)));
+
+__v4si
+foo (__v4si a)
+{
+ return ~a;
+}
is enough.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (9 preceding siblings ...)
2020-03-26 19:44 ` jakub at gcc dot gnu.org
@ 2020-03-27 7:00 ` jbeulich at suse dot com
2020-03-27 7:07 ` jbeulich at suse dot com
` (7 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: jbeulich at suse dot com @ 2020-03-27 7:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #11 from jbeulich at suse dot com ---
(In reply to Jakub Jelinek from comment #7)
> Though, there are other issues. There is only vpternlog{d,q}, so for
> V*[QH]Imode we shouldn't pretend we have masking support.
Why would this be? The element mode doesn't matter at all for bitwise
operations. Just like there's no VPANDB / VPANDW, but VPANDD/VPANDQ are quite
fine to use on vectors of QI or HI. Afaict the existence of VPAND{D,Q} in
AVX512 (as opposed to {,V}PAND in MMX/SSE2/AVX) is merely an oddity resulting
from EVEX.W handling (besides of course the element width's effect on embedded
broadcasting).
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (10 preceding siblings ...)
2020-03-27 7:00 ` jbeulich at suse dot com
@ 2020-03-27 7:07 ` jbeulich at suse dot com
2020-03-27 7:31 ` jbeulich at suse dot com
` (6 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: jbeulich at suse dot com @ 2020-03-27 7:07 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #12 from jbeulich at suse dot com ---
(In reply to Jakub Jelinek from comment #8)
> Created attachment 48128 [details]
> gcc10-pr94343.patch
>
> Updated patch. A variant to that would be 4 separate patterns I guess, not
> sure if that would be shorter.
May I suggest that the testcases also check for the 0x55 immediate? The
potential of VPTERNLOG allows for much wider use (with different immediates) in
principle, so the correct choice of immediate would imo better be validated
here as well.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (11 preceding siblings ...)
2020-03-27 7:07 ` jbeulich at suse dot com
@ 2020-03-27 7:31 ` jbeulich at suse dot com
2020-03-27 8:45 ` jakub at gcc dot gnu.org
` (5 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: jbeulich at suse dot com @ 2020-03-27 7:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #13 from jbeulich at suse dot com ---
As to using 512-bit operations even on more narrow input types - is this
correct when e.g. subsequently source code upcasts the vector? I.e. would such
an upcast be carried out by emitting an insn to zero the upper portion, rather
than simply considering this a re-interpretation of the same register (with no
insn emitted at all)? In which case the fix would apparently boil down to using
a mode iterator different from VI (VI48_AVX512VL if you really mean to exclude
vectors of QI/HI, or a combination of this and VI12_AVX512VL otherwise).
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (12 preceding siblings ...)
2020-03-27 7:31 ` jbeulich at suse dot com
@ 2020-03-27 8:45 ` jakub at gcc dot gnu.org
2020-03-27 9:11 ` jakub at gcc dot gnu.org
` (4 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-27 8:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to jbeulich from comment #13)
> As to using 512-bit operations even on more narrow input types - is this
> correct when e.g. subsequently source code upcasts the vector? I.e. would
> such an upcast be carried out by emitting an insn to zero the upper portion,
> rather than simply considering this a re-interpretation of the same register
> (with no insn emitted at all)? In which case the fix would apparently boil
> down to using a mode iterator different from VI (VI48_AVX512VL if you really
> mean to exclude vectors of QI/HI, or a combination of this and VI12_AVX512VL
> otherwise).
Yes, in RTL a store to a register in some machine mode when the hw register has
wider mode leaves the upper bits undefined, not zero initialized (like e.g.
most of the 32-bit instructions do on 64-bit general purpose registers or many
AVX etc. instructions do on vector registers), we do have a REE pass to handle
some most common cases.
If we added some machine specific pass that would do something like that for
most instructions, we'd need some attribute to mark instructions that don't
have that property, sure, but we don't ATM.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (13 preceding siblings ...)
2020-03-27 8:45 ` jakub at gcc dot gnu.org
@ 2020-03-27 9:11 ` jakub at gcc dot gnu.org
2020-03-27 9:17 ` jbeulich at suse dot com
` (3 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-27 9:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to jbeulich from comment #11)
> (In reply to Jakub Jelinek from comment #7)
> > Though, there are other issues. There is only vpternlog{d,q}, so for
> > V*[QH]Imode we shouldn't pretend we have masking support.
>
> Why would this be? The element mode doesn't matter at all for bitwise
> operations. Just like there's no VPANDB / VPANDW, but VPANDD/VPANDQ are
> quite fine to use on vectors of QI or HI. Afaict the existence of VPAND{D,Q}
> in AVX512 (as opposed to {,V}PAND in MMX/SSE2/AVX) is merely an oddity
> resulting from EVEX.W handling (besides of course the element width's effect
> on embedded broadcasting).
For masked instructions, the element mode is significant, it determines which
bits of the mask register apply to which bits in the destination register.
So, if the masked variant is ever matched (e.g. by combine), then it will
expect to do something different from what the insn will actually do.
(define_insn ("one_cmplv64qi2_mask")
[
(set (match_operand:V64QI 0 ("register_operand") ("=v"))
(vec_merge:V64QI (xor:V64QI (match_operand:V64QI 1
("nonimmediate_operand") ("vm"))
(match_operand:V64QI 2 ("vector_all_ones_operand") ("BC")))
(match_operand:V64QI 3 ("nonimm_or_0_operand") ("0C"))
(match_operand:DI 4 ("register_operand") ("Yk"))))
] ("(TARGET_AVX512F) && ((TARGET_AVX512F) && (TARGET_AVX512BW))")
("vpternlogd\t{$0x55, %1, %0, %0%{%4%}%N3|%0%{%4%}%N3, %0, %1, 0x55}")
[
(set_attr ("type") ("sselog"))
(set_attr ("prefix") ("evex"))
(set_attr ("mode") ("XI"))
(set_attr ("mask") ("no"))
])
The above says in RTL that for say the last operand of 0x5555555555555555ULL,
first 8 bits in the vector will be the result of the ternlog operation, next 8
bits will be cleared or unmodified (depending on operand 3), etc.
The instruction used for that will do something different, will ignore the
upper 48 bits of the mask register and the low 32 bits of the destination will
be the result of the ternlog operation, next 32 bits will be cleared or
unmodified, etc.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (14 preceding siblings ...)
2020-03-27 9:11 ` jakub at gcc dot gnu.org
@ 2020-03-27 9:17 ` jbeulich at suse dot com
2020-03-27 9:21 ` jakub at gcc dot gnu.org
` (2 subsequent siblings)
18 siblings, 0 replies; 20+ messages in thread
From: jbeulich at suse dot com @ 2020-03-27 9:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #16 from jbeulich at suse dot com ---
Oh, right - I didn't pay attention to the insn permitting masking to be used.
Then still non-masked uses ought to be permitted, perhaps by having a
VI48_AVX512VL one permitting masking and a VI12_AVX512VL one which doesn't?
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (15 preceding siblings ...)
2020-03-27 9:17 ` jbeulich at suse dot com
@ 2020-03-27 9:21 ` jakub at gcc dot gnu.org
2020-03-30 16:05 ` cvs-commit at gcc dot gnu.org
2020-03-30 16:06 ` jakub at gcc dot gnu.org
18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-27 9:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #17 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The patch I've posted -
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542774.html - solves it by
disabling the problematic masked cases in the condition. Or we could just
disable the masking on the insn altogether, the expander will not need those.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (16 preceding siblings ...)
2020-03-27 9:21 ` jakub at gcc dot gnu.org
@ 2020-03-30 16:05 ` cvs-commit at gcc dot gnu.org
2020-03-30 16:06 ` jakub at gcc dot gnu.org
18 siblings, 0 replies; 20+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-03-30 16:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
--- Comment #18 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:
https://gcc.gnu.org/g:5abbfd3cd36342df530410033844584d8b85e187
commit r10-7460-g5abbfd3cd36342df530410033844584d8b85e187
Author: Jakub Jelinek <jakub@redhat.com>
Date: Mon Mar 30 18:05:01 2020 +0200
i386: Fix up *one_cmplv*2* insn with avx512f [PR94343]
This define_insn has two issues.
One is that with -mavx512f -mno-avx512vl it can emit an AVX512VL-only insn
- 128-bit or 256-bit EVEX encoded vpternlog{d,q}.
Another one is that because there is no vpternlog{b,w}, we emit vpternlogd
instead, but then we shouldn't pretend we support masking of that, because
we don't.
The first one can be fixed by forcing the use of %zmm* registers instead of
%xmm* or %ymm* if AVX512F but not AVX512VL, like we do for a couple of
other
insns (although that is primarily done in order to support %xmm16+ regs).
But we need to make sure that in that case the input operand isn't memory,
because while we can read and store the higher bits of registers, we don't
want to read from memory more bytes than what we should read.
A variant to these two if_then_else set attrs, condition in the output and
larger condition would be 4 different define_insns (one with something like
VI48_AVX512VL iterator, masking, no g modifiers and "vm" input constraint,
another one with VI48_AVX iterator, !TARGET_AVX512VL in condition,
no masking, g modifiers and "v" input constraint, one with VI12_AVX512VL
iterator, no masking, no g modifiers and "vm" input constraint and last one
with
VI12_AVX2 iterator, !TARGET_AVX512VL in condition, no masking, g modifiers
and "v" input constraint, but I think having one pattern is shorter than
that.
2020-03-30 Jakub Jelinek <jakub@redhat.com>
PR target/94343
* config/i386/sse.md (<mask_codefor>one_cmpl<mode>2<mask_name>): If
!TARGET_AVX512VL, use 512-bit vpternlog and make sure the input
operand is a register. Don't enable masked variants for
V*[QH]Imode.
* gcc.target/i386/avx512f-pr94343.c: New test.
* gcc.target/i386/avx512vl-pr94343.c: New test.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
` (17 preceding siblings ...)
2020-03-30 16:05 ` cvs-commit at gcc dot gnu.org
@ 2020-03-30 16:06 ` jakub at gcc dot gnu.org
18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-30 16:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #19 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2020-03-30 16:06 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
2020-03-26 15:39 ` [Bug target/94343] " marxin at gcc dot gnu.org
2020-03-26 15:58 ` jakub at gcc dot gnu.org
2020-03-26 16:34 ` hjl.tools at gmail dot com
2020-03-26 16:41 ` jakub at gcc dot gnu.org
2020-03-26 17:08 ` jakub at gcc dot gnu.org
2020-03-26 17:14 ` hjl.tools at gmail dot com
2020-03-26 17:52 ` jakub at gcc dot gnu.org
2020-03-26 17:59 ` jakub at gcc dot gnu.org
2020-03-26 19:30 ` kretz at kde dot org
2020-03-26 19:44 ` jakub at gcc dot gnu.org
2020-03-27 7:00 ` jbeulich at suse dot com
2020-03-27 7:07 ` jbeulich at suse dot com
2020-03-27 7:31 ` jbeulich at suse dot com
2020-03-27 8:45 ` jakub at gcc dot gnu.org
2020-03-27 9:11 ` jakub at gcc dot gnu.org
2020-03-27 9:17 ` jbeulich at suse dot com
2020-03-27 9:21 ` jakub at gcc dot gnu.org
2020-03-30 16:05 ` cvs-commit at gcc dot gnu.org
2020-03-30 16:06 ` jakub at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).