public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
@ 2020-03-26 15:24 kretz at kde dot org
  2020-03-26 15:39 ` [Bug target/94343] " marxin at gcc dot gnu.org
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: kretz at kde dot org @ 2020-03-26 15:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

            Bug ID: 94343
           Summary: [10 Regression] invalid AVX512VL vpternlogd
                    instruction emitted for -march=knl
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Keywords: missed-optimization, wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kretz at kde dot org
  Target Milestone: ---
            Target: i386,x86-64

Test case (`-O1 -march=knl`, cf. https://godbolt.org/z/qQc3Sf):

using W [[gnu::vector_size(16)]] = long long;
using V [[gnu::vector_size(16)]] = int;

auto f(V a) {
    return __builtin_ia32_pandn128(reinterpret_cast<W>(~V() ^ a), ~W());
}


This emits a XMM variant of `vpternlogd` which requires AVX512VL. But it was
supposed to compile for KNL.

Besides the bug, there's also a missed optimization here: `~V() ^ a` flips all
bits and pandn flips all bits again. Thus it should compile to a single `ret`
instruction. Note that the variation:

auto f(V a) {
    return ~reinterpret_cast<W>(~V() ^ a) & ~W();
}

compiles to

  vpternlogd xmm0, xmm0, xmm0, 0x55
  vpternlogq xmm0, xmm0, xmm0, 0x55

for KNL.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
@ 2020-03-26 15:39 ` marxin at gcc dot gnu.org
  2020-03-26 15:58 ` jakub at gcc dot gnu.org
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: marxin at gcc dot gnu.org @ 2020-03-26 15:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jbeulich at suse dot com,
                   |                            |marxin at gcc dot gnu.org
   Last reconfirmed|                            |2020-03-26
     Ever confirmed|0                           |1
   Target Milestone|---                         |10.0
             Status|UNCONFIRMED                 |NEW
           Priority|P3                          |P1
      Known to work|                            |9.3.0
      Known to fail|                            |10.0

--- Comment #1 from Martin Liška <marxin at gcc dot gnu.org> ---
You are right, AVX512VL is not supported with knl.
It started with r10-2016-gff8f129bc2f57fdf:

        vpternlogd      $0x55, %xmm0, %xmm0, %xmm0
        vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpandn  %xmm1, %xmm0, %xmm0
        ret

before the revision we emitted:

        vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpxor   %xmm1, %xmm0, %xmm0
        vpcmpeqd        %xmm1, %xmm1, %xmm1
        vpandn  %xmm1, %xmm0, %xmm0
        ret

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
  2020-03-26 15:39 ` [Bug target/94343] " marxin at gcc dot gnu.org
@ 2020-03-26 15:58 ` jakub at gcc dot gnu.org
  2020-03-26 16:34 ` hjl.tools at gmail dot com
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-26 15:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Looking.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
  2020-03-26 15:39 ` [Bug target/94343] " marxin at gcc dot gnu.org
  2020-03-26 15:58 ` jakub at gcc dot gnu.org
@ 2020-03-26 16:34 ` hjl.tools at gmail dot com
  2020-03-26 16:41 ` jakub at gcc dot gnu.org
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: hjl.tools at gmail dot com @ 2020-03-26 16:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 48126
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48126&action=edit
A patch

Jakub, this is what I have. Feel free to ignore it.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (2 preceding siblings ...)
  2020-03-26 16:34 ` hjl.tools at gmail dot com
@ 2020-03-26 16:41 ` jakub at gcc dot gnu.org
  2020-03-26 17:08 ` jakub at gcc dot gnu.org
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-26 16:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I was thinking about
--- gcc/config/i386/sse.md.jj   2020-03-06 11:35:46.284074858 +0100
+++ gcc/config/i386/sse.md      2020-03-26 17:35:23.690515228 +0100
@@ -12800,10 +12800,18 @@
        (xor:VI (match_operand:VI 1 "nonimmediate_operand" "vm")
                (match_operand:VI 2 "vector_all_ones_operand" "BC")))]
   "TARGET_AVX512F"
-  "vpternlog<ternlogsuffix>\t{$0x55, %1, %0,
%0<mask_operand3>|%0<mask_operand3>, %0, %1, 0x55}"
+{
+  if (TARGET_AVX512VL)
+    return "vpternlog<ternlogsuffix>\t{$0x55, %1, %0,
%0<mask_operand3>|%0<mask_operand3>, %0, %1, 0x55}";
+  else
+    return "vpternlog<ternlogsuffix>\t{$0x55, %g1, %g0,
%g0<mask_operand3>|%g0<mask_operand3>, %g0, %g1, 0x55}";
+}
   [(set_attr "type" "sselog")
    (set_attr "prefix" "evex")
-   (set_attr "mode" "<sseinsnmode>")])
+   (set (attr "mode")
+        (if_then_else (match_test "TARGET_AVX512VL")
+                     (const_string "<sseinsnmode>")
+                     (const_string "XI")))])

 (define_expand "<sse2_avx2>_andnot<mode>3"
   [(set (match_operand:VI_AVX2 0 "register_operand")
instead.  I'm aware that from performance POV we are trying to avoid 512-bit
vectors, but don't all such affected CPUs support AVX512VL already?  Does KNL
care if it will do a 512-bit operation instead of 128-bit or 256-bit?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (3 preceding siblings ...)
  2020-03-26 16:41 ` jakub at gcc dot gnu.org
@ 2020-03-26 17:08 ` jakub at gcc dot gnu.org
  2020-03-26 17:14 ` hjl.tools at gmail dot com
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-26 17:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 48127
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48127&action=edit
gcc10-pr94343.patch

That of course doesn't work if the input operand is memory.  This should.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (4 preceding siblings ...)
  2020-03-26 17:08 ` jakub at gcc dot gnu.org
@ 2020-03-26 17:14 ` hjl.tools at gmail dot com
  2020-03-26 17:52 ` jakub at gcc dot gnu.org
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: hjl.tools at gmail dot com @ 2020-03-26 17:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #6 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Jakub Jelinek from comment #5)
> Created attachment 48127 [details]
> gcc10-pr94343.patch
> 
> That of course doesn't work if the input operand is memory.  This should.

LGTM.  Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (5 preceding siblings ...)
  2020-03-26 17:14 ` hjl.tools at gmail dot com
@ 2020-03-26 17:52 ` jakub at gcc dot gnu.org
  2020-03-26 17:59 ` jakub at gcc dot gnu.org
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-26 17:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Though, there are other issues.  There is only vpternlog{d,q}, so for
V*[QH]Imode we shouldn't pretend we have masking support.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (6 preceding siblings ...)
  2020-03-26 17:52 ` jakub at gcc dot gnu.org
@ 2020-03-26 17:59 ` jakub at gcc dot gnu.org
  2020-03-26 19:30 ` kretz at kde dot org
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-26 17:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #48127|0                           |1
        is obsolete|                            |

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 48128
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48128&action=edit
gcc10-pr94343.patch

Updated patch.  A variant to that would be 4 separate patterns I guess, not
sure if that would be shorter.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (7 preceding siblings ...)
  2020-03-26 17:59 ` jakub at gcc dot gnu.org
@ 2020-03-26 19:30 ` kretz at kde dot org
  2020-03-26 19:44 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: kretz at kde dot org @ 2020-03-26 19:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #9 from Matthias Kretz (Vir) <kretz at kde dot org> ---
(In reply to Jakub Jelinek from comment #8)
> Created attachment 48128 [details]
> gcc10-pr94343.patch

The avx512vl-pr94343.c test should ideally fail because `_mm_andnot_si128
((__m128i) (~v ^ a), (__m128i) ~w)` is equal to `a`. Maybe the test should
simply `return ~v ^ a;` and thus emit a single vpternlogd?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (8 preceding siblings ...)
  2020-03-26 19:30 ` kretz at kde dot org
@ 2020-03-26 19:44 ` jakub at gcc dot gnu.org
  2020-03-27  7:00 ` jbeulich at suse dot com
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-26 19:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
--- gcc/testsuite/gcc.target/i386/avx512f-pr94343.c.jj  2020-03-26
17:47:40.008654504 +0100
+++ gcc/testsuite/gcc.target/i386/avx512f-pr94343.c     2020-03-26
17:48:37.169811375 +0100
@@ -0,0 +1,12 @@
+/* PR target/94343 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512f -mno-avx512vl" } */
+/* { dg-final { scan-assembler-not "vpternlogd\[^\n\r]*xmm\[0-9]*" } } */
+
+typedef int __v4si __attribute__((vector_size (16)));
+
+__v4si
+foo (__v4si a)
+{
+  return ~a;
+}
--- gcc/testsuite/gcc.target/i386/avx512vl-pr94343.c.jj 2020-03-26
17:48:53.232573115 +0100
+++ gcc/testsuite/gcc.target/i386/avx512vl-pr94343.c    2020-03-26
17:49:08.034352968 +0100
@@ -0,0 +1,12 @@
+/* PR target/94343 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512vl" } */
+/* { dg-final { scan-assembler "vpternlogd\[^\n\r]*xmm\[0-9]*" } } */
+
+typedef int __v4si __attribute__((vector_size (16)));
+
+__v4si
+foo (__v4si a)
+{
+  return ~a;
+}

is enough.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (9 preceding siblings ...)
  2020-03-26 19:44 ` jakub at gcc dot gnu.org
@ 2020-03-27  7:00 ` jbeulich at suse dot com
  2020-03-27  7:07 ` jbeulich at suse dot com
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jbeulich at suse dot com @ 2020-03-27  7:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #11 from jbeulich at suse dot com ---
(In reply to Jakub Jelinek from comment #7)
> Though, there are other issues.  There is only vpternlog{d,q}, so for
> V*[QH]Imode we shouldn't pretend we have masking support.

Why would this be? The element mode doesn't matter at all for bitwise
operations. Just like there's no VPANDB / VPANDW, but VPANDD/VPANDQ are quite
fine to use on vectors of QI or HI. Afaict the existence of VPAND{D,Q} in
AVX512 (as opposed to {,V}PAND in MMX/SSE2/AVX) is merely an oddity resulting
from EVEX.W handling (besides of course the element width's effect on embedded
broadcasting).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (10 preceding siblings ...)
  2020-03-27  7:00 ` jbeulich at suse dot com
@ 2020-03-27  7:07 ` jbeulich at suse dot com
  2020-03-27  7:31 ` jbeulich at suse dot com
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jbeulich at suse dot com @ 2020-03-27  7:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #12 from jbeulich at suse dot com ---
(In reply to Jakub Jelinek from comment #8)
> Created attachment 48128 [details]
> gcc10-pr94343.patch
> 
> Updated patch.  A variant to that would be 4 separate patterns I guess, not
> sure if that would be shorter.

May I suggest that the testcases also check for the 0x55 immediate? The
potential of VPTERNLOG allows for much wider use (with different immediates) in
principle, so the correct choice of immediate would imo better be validated
here as well.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (11 preceding siblings ...)
  2020-03-27  7:07 ` jbeulich at suse dot com
@ 2020-03-27  7:31 ` jbeulich at suse dot com
  2020-03-27  8:45 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jbeulich at suse dot com @ 2020-03-27  7:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #13 from jbeulich at suse dot com ---
As to using 512-bit operations even on more narrow input types - is this
correct when e.g. subsequently source code upcasts the vector? I.e. would such
an upcast be carried out by emitting an insn to zero the upper portion, rather
than simply considering this a re-interpretation of the same register (with no
insn emitted at all)? In which case the fix would apparently boil down to using
a mode iterator different from VI (VI48_AVX512VL if you really mean to exclude
vectors of QI/HI, or a combination of this and VI12_AVX512VL otherwise).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (12 preceding siblings ...)
  2020-03-27  7:31 ` jbeulich at suse dot com
@ 2020-03-27  8:45 ` jakub at gcc dot gnu.org
  2020-03-27  9:11 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-27  8:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to jbeulich from comment #13)
> As to using 512-bit operations even on more narrow input types - is this
> correct when e.g. subsequently source code upcasts the vector? I.e. would
> such an upcast be carried out by emitting an insn to zero the upper portion,
> rather than simply considering this a re-interpretation of the same register
> (with no insn emitted at all)? In which case the fix would apparently boil
> down to using a mode iterator different from VI (VI48_AVX512VL if you really
> mean to exclude vectors of QI/HI, or a combination of this and VI12_AVX512VL
> otherwise).

Yes, in RTL a store to a register in some machine mode when the hw register has
wider mode leaves the upper bits undefined, not zero initialized (like e.g.
most of the 32-bit instructions do on 64-bit general purpose registers or many
AVX etc. instructions do on vector registers), we do have a REE pass to handle
some most common cases.
If we added some machine specific pass that would do something like that for
most instructions, we'd need some attribute to mark instructions that don't
have that property, sure, but we don't ATM.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (13 preceding siblings ...)
  2020-03-27  8:45 ` jakub at gcc dot gnu.org
@ 2020-03-27  9:11 ` jakub at gcc dot gnu.org
  2020-03-27  9:17 ` jbeulich at suse dot com
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-27  9:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #15 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to jbeulich from comment #11)
> (In reply to Jakub Jelinek from comment #7)
> > Though, there are other issues.  There is only vpternlog{d,q}, so for
> > V*[QH]Imode we shouldn't pretend we have masking support.
> 
> Why would this be? The element mode doesn't matter at all for bitwise
> operations. Just like there's no VPANDB / VPANDW, but VPANDD/VPANDQ are
> quite fine to use on vectors of QI or HI. Afaict the existence of VPAND{D,Q}
> in AVX512 (as opposed to {,V}PAND in MMX/SSE2/AVX) is merely an oddity
> resulting from EVEX.W handling (besides of course the element width's effect
> on embedded broadcasting).

For masked instructions, the element mode is significant, it determines which
bits of the mask register apply to which bits in the destination register.
So, if the masked variant is ever matched (e.g. by combine), then it will
expect to do something different from what the insn will actually do.
(define_insn ("one_cmplv64qi2_mask")
     [
        (set (match_operand:V64QI 0 ("register_operand") ("=v"))
            (vec_merge:V64QI (xor:V64QI (match_operand:V64QI 1
("nonimmediate_operand") ("vm"))
                    (match_operand:V64QI 2 ("vector_all_ones_operand") ("BC")))
                (match_operand:V64QI 3 ("nonimm_or_0_operand") ("0C"))
                (match_operand:DI 4 ("register_operand") ("Yk"))))
    ] ("(TARGET_AVX512F) && ((TARGET_AVX512F) && (TARGET_AVX512BW))")
("vpternlogd\t{$0x55, %1, %0, %0%{%4%}%N3|%0%{%4%}%N3, %0, %1, 0x55}")
     [
        (set_attr ("type") ("sselog"))
        (set_attr ("prefix") ("evex"))
        (set_attr ("mode") ("XI"))
        (set_attr ("mask") ("no"))
    ])
The above says in RTL that for say the last operand of 0x5555555555555555ULL,
first 8 bits in the vector will be the result of the ternlog operation, next 8
bits will be cleared or unmodified (depending on operand 3), etc.
The instruction used for that will do something different, will ignore the
upper 48 bits of the mask register and the low 32 bits of the destination will
be the result of the ternlog operation, next 32 bits will be cleared or
unmodified, etc.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (14 preceding siblings ...)
  2020-03-27  9:11 ` jakub at gcc dot gnu.org
@ 2020-03-27  9:17 ` jbeulich at suse dot com
  2020-03-27  9:21 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jbeulich at suse dot com @ 2020-03-27  9:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #16 from jbeulich at suse dot com ---
Oh, right - I didn't pay attention to the insn permitting masking to be used.
Then still non-masked uses ought to be permitted, perhaps by having a
VI48_AVX512VL one permitting masking and a VI12_AVX512VL one which doesn't?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (15 preceding siblings ...)
  2020-03-27  9:17 ` jbeulich at suse dot com
@ 2020-03-27  9:21 ` jakub at gcc dot gnu.org
  2020-03-30 16:05 ` cvs-commit at gcc dot gnu.org
  2020-03-30 16:06 ` jakub at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-27  9:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #17 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The patch I've posted -
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542774.html - solves it by
disabling the problematic masked cases in the condition.  Or we could just
disable the masking on the insn altogether, the expander will not need those.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (16 preceding siblings ...)
  2020-03-27  9:21 ` jakub at gcc dot gnu.org
@ 2020-03-30 16:05 ` cvs-commit at gcc dot gnu.org
  2020-03-30 16:06 ` jakub at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-03-30 16:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

--- Comment #18 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:5abbfd3cd36342df530410033844584d8b85e187

commit r10-7460-g5abbfd3cd36342df530410033844584d8b85e187
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Mar 30 18:05:01 2020 +0200

    i386: Fix up *one_cmplv*2* insn with avx512f [PR94343]

    This define_insn has two issues.
    One is that with -mavx512f -mno-avx512vl it can emit an AVX512VL-only insn
    - 128-bit or 256-bit EVEX encoded vpternlog{d,q}.
    Another one is that because there is no vpternlog{b,w}, we emit vpternlogd
    instead, but then we shouldn't pretend we support masking of that, because
    we don't.
    The first one can be fixed by forcing the use of %zmm* registers instead of
    %xmm* or %ymm* if AVX512F but not AVX512VL, like we do for a couple of
other
    insns (although that is primarily done in order to support %xmm16+ regs).
    But we need to make sure that in that case the input operand isn't memory,
    because while we can read and store the higher bits of registers, we don't
    want to read from memory more bytes than what we should read.

    A variant to these two if_then_else set attrs, condition in the output and
    larger condition would be 4 different define_insns (one with something like
    VI48_AVX512VL iterator, masking, no g modifiers and "vm" input constraint,
    another one with VI48_AVX iterator, !TARGET_AVX512VL in condition,
    no masking, g modifiers and "v" input constraint, one with VI12_AVX512VL
    iterator, no masking, no g modifiers and "vm" input constraint and last one
with
    VI12_AVX2 iterator, !TARGET_AVX512VL in condition, no masking, g modifiers
    and "v" input constraint, but I think having one pattern is shorter than
    that.

    2020-03-30  Jakub Jelinek  <jakub@redhat.com>

            PR target/94343
            * config/i386/sse.md (<mask_codefor>one_cmpl<mode>2<mask_name>): If
            !TARGET_AVX512VL, use 512-bit vpternlog and make sure the input
            operand is a register.  Don't enable masked variants for
V*[QH]Imode.

            * gcc.target/i386/avx512f-pr94343.c: New test.
            * gcc.target/i386/avx512vl-pr94343.c: New test.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [Bug target/94343] [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl
  2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
                   ` (17 preceding siblings ...)
  2020-03-30 16:05 ` cvs-commit at gcc dot gnu.org
@ 2020-03-30 16:06 ` jakub at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-03-30 16:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94343

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #19 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2020-03-30 16:06 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-26 15:24 [Bug target/94343] New: [10 Regression] invalid AVX512VL vpternlogd instruction emitted for -march=knl kretz at kde dot org
2020-03-26 15:39 ` [Bug target/94343] " marxin at gcc dot gnu.org
2020-03-26 15:58 ` jakub at gcc dot gnu.org
2020-03-26 16:34 ` hjl.tools at gmail dot com
2020-03-26 16:41 ` jakub at gcc dot gnu.org
2020-03-26 17:08 ` jakub at gcc dot gnu.org
2020-03-26 17:14 ` hjl.tools at gmail dot com
2020-03-26 17:52 ` jakub at gcc dot gnu.org
2020-03-26 17:59 ` jakub at gcc dot gnu.org
2020-03-26 19:30 ` kretz at kde dot org
2020-03-26 19:44 ` jakub at gcc dot gnu.org
2020-03-27  7:00 ` jbeulich at suse dot com
2020-03-27  7:07 ` jbeulich at suse dot com
2020-03-27  7:31 ` jbeulich at suse dot com
2020-03-27  8:45 ` jakub at gcc dot gnu.org
2020-03-27  9:11 ` jakub at gcc dot gnu.org
2020-03-27  9:17 ` jbeulich at suse dot com
2020-03-27  9:21 ` jakub at gcc dot gnu.org
2020-03-30 16:05 ` cvs-commit at gcc dot gnu.org
2020-03-30 16:06 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).