public inbox for gcc-bugs@sourceware.org
* [Bug target/101472] New: AVX-512 wrong code for consecutive masked scatters
@ 2021-07-16 12:17 dlustig at nvidia dot com
  2021-07-20  3:34 ` [Bug target/101472] " crazylht at gmail dot com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: dlustig at nvidia dot com @ 2021-07-16 12:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472

            Bug ID: 101472
           Summary: AVX-512 wrong code for consecutive masked scatters
           Product: gcc
           Version: 11.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dlustig at nvidia dot com
  Target Milestone: ---

Created attachment 51162
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51162&action=edit
Test case

$ cat two_scatters.c
#include <immintrin.h>

void two_scatters(void* base_addr, __mmask8 k1, __mmask8 k2, __m512i vindex,
                  __m256i a) {
    _mm512_mask_i64scatter_epi32(base_addr, k1, vindex, a, 1);
    _mm512_mask_i64scatter_epi32(base_addr, k2, vindex, a, 1);
}

$ g++-11 -S -O3 -march=skylake-avx512 -Wall -Wextra -fno-strict-aliasing \
    -fwrapv -fno-aggressive-loop-optimizations two_scatters.c -o -
...
_Z12two_scattersPvhhDv8_xDv4_x:
        kmovb   %edx, %k2
        vpscatterqd     %ymm1, (%rdi,%zmm0,1){%k2}
        ret
...

Only one vpscatterqd instruction is generated, even though I would expect two. 
The optimizer seems to think the first store is redundant with the second due
to matching addresses, and hence optimizes it away.  However, since two
different masks are being used, the scatters are not actually redundant.
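
To make the non-redundancy concrete, here is a minimal scalar sketch of what a masked scatter does per lane. It is a reference model only, not GCC's or Intel's implementation; the names scatter_model and lanes_written are hypothetical, and the buffer layout (lane i at byte offset 4*i) is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference model of a masked scatter with 64-bit indices and
   32-bit elements. Illustrative only: scatter_model and lanes_written
   are hypothetical names, not part of GCC or <immintrin.h>. */
void scatter_model(void *base, uint8_t k, const int64_t vindex[8],
                   const int32_t a[8], int scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    for (int i = 0; i < 8; i++) {
        if (k & (1u << i)) {
            /* Lane i is stored only when mask bit i is set. */
            memcpy((char *)base + vindex[i] * scale, &a[i], sizeof a[i]);
        }
    }
}

/* Run one or two scatters into a zeroed buffer and count the lanes that
   ended up written. Lane i targets byte offset 4*i and stores i + 1.
   emit_first == 0 models the miscompiled code with the first scatter
   dropped. */
int lanes_written(uint8_t k1, uint8_t k2, int emit_first)
{
    int32_t buf[8] = {0};
    int64_t vindex[8];
    int32_t a[8];
    int count = 0;

    for (int i = 0; i < 8; i++) {
        vindex[i] = 4 * i;
        a[i] = i + 1;
    }
    if (emit_first)
        scatter_model(buf, k1, vindex, a, 1);
    scatter_model(buf, k2, vindex, a, 1);

    for (int i = 0; i < 8; i++)
        count += (buf[i] != 0);
    return count;
}
```

With k1 = 0x0F and k2 = 0xF0, lanes_written(0x0F, 0xF0, 1) covers all eight lanes, while lanes_written(0x0F, 0xF0, 0) covers only the upper four. Dropping the first scatter is therefore observable whenever k1 has a bit set that k2 does not.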

Perturbing the example by passing two different base addresses, or inserting an
asm("nop"); in between, etc., will cause both scatters to get emitted:
https://godbolt.org/z/3b8v86on4

Stand-alone executable test case attached as well.

GCC version info:

$ gcc-11 -v
Using built-in specs.
COLLECT_GCC=gcc-11
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
11.1.0-1ubuntu1~18.04.1' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-11
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --disable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-11-YRKbe7/gcc-11-11.1.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-YRKbe7/gcc-11-11.1.0/debian/tmp-gcn/usr
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.1.0 (Ubuntu 11.1.0-1ubuntu1~18.04.1)

* [Bug target/101472] AVX-512 wrong code for consecutive masked scatters
  2021-07-16 12:17 [Bug target/101472] New: AVX-512 wrong code for consecutive masked scatters dlustig at nvidia dot com
@ 2021-07-20  3:34 ` crazylht at gmail dot com
  2021-08-27  5:20 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: crazylht at gmail dot com @ 2021-07-20  3:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wwwhhhyyy333 at gmail dot com

--- Comment #1 from Hongtao.liu <crazylht at gmail dot com> ---
The pattern is defined as:

(define_insn "*avx512f_scatterdi<VI48F:mode>"
  [(set (match_operator:VI48F 5 "vsib_mem_operator"
          [(unspec:P
             [(match_operand:P 0 "vsib_address_operand" "Tv")
              (match_operand:<VEC_GATHER_IDXDI> 2 "register_operand" "v")
              (match_operand:SI 4 "const1248_operand" "n")]
             UNSPEC_VSIBADDR)])
        (unspec:VI48F
          [(match_operand:QI 6 "register_operand" "1")
           (match_operand:<VEC_GATHER_SRCDI> 3 "register_operand" "v")]
          UNSPEC_SCATTER))
   (clobber (match_scratch:QI 1 "=&Yk"))]
  "TARGET_AVX512F"
;; %X5 so that we don't emit any *WORD PTR for -masm=intel, as
;; gas changed what it requires incompatibly.
  "%M0v<sseintprefix>scatterq<ssemodesuffix>\t{%3, %5%{%1%}|%X5%{%1%}, %3}"
  [(set_attr "type" "ssemov")
   (set_attr "prefix" "evex")
   (set_attr "mode" "<sseinsnmode>")])

The mask register only affects the set source. Since both scatters have the
same destination, GCC thinks the first set is completely overwritten by the
second one, and optimizes the first vpscatterqd away.

We need to refine the pattern to let GCC know that the mask register also
affects the destination. Maybe just put the mask operand into UNSPEC_VSIBADDR?
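
The reasoning above can be sketched as a toy model in C. This is not GCC's dead-store logic or its RTL comparison; struct scatter_store, same_destination and first_store_looks_dead are hypothetical names, and the register numbers are made up. The only point is that two stores whose destinations compare equal make the earlier one look dead, so the mask has to take part in that comparison.

```c
#include <stdbool.h>

/* Hypothetical summary of one scatter store as an optimizer might see it.
   These names are illustrative and do not come from GCC. */
struct scatter_store {
    int base_reg;   /* register holding the base address */
    int index_reg;  /* vector register holding the indices */
    int scale;      /* 1, 2, 4 or 8 */
    int mask_reg;   /* mask register selecting the active lanes */
};

/* Compare the destinations of two stores. When mask_in_dest is false the
   mask is ignored, which models a pattern where the mask appears only in
   the set source. When it is true the mask is part of the destination. */
bool same_destination(const struct scatter_store *s1,
                      const struct scatter_store *s2,
                      bool mask_in_dest)
{
    bool same = s1->base_reg == s2->base_reg
             && s1->index_reg == s2->index_reg
             && s1->scale == s2->scale;
    if (mask_in_dest)
        same = same && s1->mask_reg == s2->mask_reg;
    return same;
}

/* Would the first of two back-to-back scatters look dead, given the mask
   registers they use? Base, index and scale are identical by construction. */
bool first_store_looks_dead(int mask1, int mask2, bool mask_in_dest)
{
    struct scatter_store first  = { 0, 1, 1, mask1 };
    struct scatter_store second = { 0, 1, 1, mask2 };
    return same_destination(&first, &second, mask_in_dest);
}
```

In this model, first_store_looks_dead(1, 2, false) is true, which corresponds to the reported miscompile, while first_store_looks_dead(1, 2, true) is false. Two scatters that share the same mask still compare equal, so a genuinely redundant first store can still be removed.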

* [Bug target/101472] AVX-512 wrong code for consecutive masked scatters
  2021-07-16 12:17 [Bug target/101472] New: AVX-512 wrong code for consecutive masked scatters dlustig at nvidia dot com
  2021-07-20  3:34 ` [Bug target/101472] " crazylht at gmail dot com
@ 2021-08-27  5:20 ` cvs-commit at gcc dot gnu.org
  2021-08-27  5:22 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-08-27  5:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:44a545a6abdd330083c1d12ad70092defbba702a

commit r12-3181-g44a545a6abdd330083c1d12ad70092defbba702a
Author: konglin1 <lingling.kong@intel.com>
Date:   Mon Aug 9 11:37:52 2021 +0800

    i386: Fix wrong optimization for consecutive masked scatters [PR 101472]

    gcc/ChangeLog:

            PR target/101472
            * config/i386/sse.md: (<avx512>scattersi<mode>): Add mask operand
            to UNSPEC_VSIBADDR.
            (<avx512>scattersi<mode>): Likewise.
            (*avx512f_scattersi<VI48F:mode>): Merge mask operand to set_dest.
            (*avx512f_scatterdi<VI48F:mode>): Likewise

    gcc/testsuite/ChangeLog:

            PR target/101472
            * gcc.target/i386/avx512f-pr101472.c: New test.
            * gcc.target/i386/avx512vl-pr101472.c: New test.

* [Bug target/101472] AVX-512 wrong code for consecutive masked scatters
  2021-07-16 12:17 [Bug target/101472] New: AVX-512 wrong code for consecutive masked scatters dlustig at nvidia dot com
  2021-07-20  3:34 ` [Bug target/101472] " crazylht at gmail dot com
  2021-08-27  5:20 ` cvs-commit at gcc dot gnu.org
@ 2021-08-27  5:22 ` cvs-commit at gcc dot gnu.org
  2021-08-27  5:22 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-08-27  5:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by hongtao Liu
<liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:b186040b468f6da512b9b123e1d4549f44396993

commit r11-8934-gb186040b468f6da512b9b123e1d4549f44396993
Author: konglin1 <lingling.kong@intel.com>
Date:   Mon Aug 9 11:37:52 2021 +0800

    i386: Fix wrong optimization for consecutive masked scatters [PR 101472]

    gcc/ChangeLog:

            PR target/101472
            * config/i386/sse.md: (<avx512>scattersi<mode>): Add mask operand
            to UNSPEC_VSIBADDR.
            (<avx512>scattersi<mode>): Likewise.
            (*avx512f_scattersi<VI48F:mode>): Merge mask operand to set_dest.
            (*avx512f_scatterdi<VI48F:mode>): Likewise

    gcc/testsuite/ChangeLog:

            PR target/101472
            * gcc.target/i386/avx512f-pr101472.c: New test.
            * gcc.target/i386/avx512vl-pr101472.c: New test.

* [Bug target/101472] AVX-512 wrong code for consecutive masked scatters
  2021-07-16 12:17 [Bug target/101472] New: AVX-512 wrong code for consecutive masked scatters dlustig at nvidia dot com
                   ` (2 preceding siblings ...)
  2021-08-27  5:22 ` cvs-commit at gcc dot gnu.org
@ 2021-08-27  5:22 ` cvs-commit at gcc dot gnu.org
  2021-08-27  5:23 ` crazylht at gmail dot com
  2023-01-18 10:57 ` rguenth at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-08-27  5:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by hongtao Liu
<liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:fab014ecf9f7faf3b607a1e0892d0aeabe556661

commit r10-10073-gfab014ecf9f7faf3b607a1e0892d0aeabe556661
Author: konglin1 <lingling.kong@intel.com>
Date:   Mon Aug 9 11:37:52 2021 +0800

    i386: Fix wrong optimization for consecutive masked scatters [PR 101472]

    gcc/ChangeLog:

            PR target/101472
            * config/i386/sse.md: (<avx512>scattersi<mode>): Add mask operand
            to UNSPEC_VSIBADDR.
            (<avx512>scattersi<mode>): Likewise.
            (*avx512f_scattersi<VI48F:mode>): Merge mask operand to set_dest.
            (*avx512f_scatterdi<VI48F:mode>): Likewise

    gcc/testsuite/ChangeLog:

            PR target/101472
            * gcc.target/i386/avx512f-pr101472.c: New test.
            * gcc.target/i386/avx512vl-pr101472.c: New test.

* [Bug target/101472] AVX-512 wrong code for consecutive masked scatters
  2021-07-16 12:17 [Bug target/101472] New: AVX-512 wrong code for consecutive masked scatters dlustig at nvidia dot com
                   ` (3 preceding siblings ...)
  2021-08-27  5:22 ` cvs-commit at gcc dot gnu.org
@ 2021-08-27  5:23 ` crazylht at gmail dot com
  2023-01-18 10:57 ` rguenth at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: crazylht at gmail dot com @ 2021-08-27  5:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472

--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC 12, backported to GCC 11 and GCC 10.

* [Bug target/101472] AVX-512 wrong code for consecutive masked scatters
  2021-07-16 12:17 [Bug target/101472] New: AVX-512 wrong code for consecutive masked scatters dlustig at nvidia dot com
                   ` (4 preceding siblings ...)
  2021-08-27  5:23 ` crazylht at gmail dot com
@ 2023-01-18 10:57 ` rguenth at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-01-18 10:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |10.4
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED
      Known to work|                            |10.4.0, 11.3.0, 12.1.0

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.
