public inbox for gcc-bugs@sourceware.org
* [Bug target/101472] New: AVX-512 wrong code for consecutive masked scatters
@ 2021-07-16 12:17 dlustig at nvidia dot com
2021-07-20 3:34 ` [Bug target/101472] " crazylht at gmail dot com
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: dlustig at nvidia dot com @ 2021-07-16 12:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472
Bug ID: 101472
Summary: AVX-512 wrong code for consecutive masked scatters
Product: gcc
Version: 11.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: dlustig at nvidia dot com
Target Milestone: ---
Created attachment 51162
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51162&action=edit
Test case
$ cat two_scatters.c
#include <immintrin.h>
void two_scatters(void* base_addr, __mmask8 k1, __mmask8 k2, __m512i vindex,
                  __m256i a) {
  _mm512_mask_i64scatter_epi32(base_addr, k1, vindex, a, 1);
  _mm512_mask_i64scatter_epi32(base_addr, k2, vindex, a, 1);
}
$ g++-11 -S -O3 -march=skylake-avx512 -Wall -Wextra -fno-strict-aliasing
-fwrapv -fno-aggressive-loop-optimizations two_scatters.c -o -
...
_Z12two_scattersPvhhDv8_xDv4_x:
kmovb %edx, %k2
vpscatterqd %ymm1, (%rdi,%zmm0,1){%k2}
ret
...
Only one vpscatterqd instruction is generated, even though I would expect two.
The optimizer seems to think the first store is redundant with the second due
to matching addresses, and hence optimizes it away. However, since two
different masks are being used, the scatters are not actually redundant.
Perturbing the example by passing two different base addresses, or inserting an
asm("nop"); in between, etc., will cause both scatters to get emitted:
https://godbolt.org/z/3b8v86on4
Stand-alone executable test case attached as well.
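To make the argument above concrete, here is a minimal scalar sketch of why the two scatters are not redundant. It is not the attached test case: model_mask_scatter_epi32 and scatter_demo_differs are hypothetical helper names, and the masks (0x0F and 0xF0), indices, and scale are arbitrary values chosen for illustration. The model only assumes that a masked scatter writes the lanes whose mask bit is set.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of an 8-lane masked scatter of 32-bit values with 64-bit
   indices. Hypothetical helper for illustration, not a real intrinsic. */
static void model_mask_scatter_epi32(void *base, uint8_t k,
                                     const int64_t idx[8],
                                     const int32_t src[8], int scale)
{
    for (int lane = 0; lane < 8; lane++) {
        /* Only lanes whose mask bit is set write to memory. */
        if (k & (1u << lane))
            memcpy((char *)base + idx[lane] * scale, &src[lane],
                   sizeof(int32_t));
    }
}

/* Returns 1 if "both scatters" and "second scatter only" leave memory in
   different states, 0 otherwise. */
static int scatter_demo_differs(void)
{
    int32_t both[8] = {0}, second_only[8] = {0};
    const int64_t idx[8] = {0, 1, 2, 3, 4, 5, 6, 7};
    const int32_t src[8] = {10, 11, 12, 13, 14, 15, 16, 17};
    const uint8_t k1 = 0x0F, k2 = 0xF0;   /* assumed example masks */

    /* What the source asks for: two scatters with different masks. */
    model_mask_scatter_epi32(both, k1, idx, src, 4);
    model_mask_scatter_epi32(both, k2, idx, src, 4);

    /* What the generated code does: only the second scatter. */
    model_mask_scatter_epi32(second_only, k2, idx, src, 4);

    return memcmp(both, second_only, sizeof both) != 0;
}
```

With these masks the first scatter writes lanes 0-3 and the second writes lanes 4-7, so dropping the first one leaves lanes 0-3 unwritten.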
GCC version info:
$ gcc-11 -v
Using built-in specs.
COLLECT_GCC=gcc-11
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
11.1.0-1ubuntu1~18.04.1' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-11
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --disable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-11-YRKbe7/gcc-11-11.1.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-YRKbe7/gcc-11-11.1.0/debian/tmp-gcn/usr
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.1.0 (Ubuntu 11.1.0-1ubuntu1~18.04.1)
* [Bug target/101472] AVX-512 wrong code for consecutive masked scatters
From: crazylht at gmail dot com @ 2021-07-20 3:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472
Hongtao.liu <crazylht at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |wwwhhhyyy333 at gmail dot com
--- Comment #1 from Hongtao.liu <crazylht at gmail dot com> ---
The pattern is defined as:
(define_insn "*avx512f_scatterdi<VI48F:mode>"
  [(set (match_operator:VI48F 5 "vsib_mem_operator"
          [(unspec:P
             [(match_operand:P 0 "vsib_address_operand" "Tv")
              (match_operand:<VEC_GATHER_IDXDI> 2 "register_operand" "v")
              (match_operand:SI 4 "const1248_operand" "n")]
             UNSPEC_VSIBADDR)])
        (unspec:VI48F
          [(match_operand:QI 6 "register_operand" "1")
           (match_operand:<VEC_GATHER_SRCDI> 3 "register_operand" "v")]
          UNSPEC_SCATTER))
   (clobber (match_scratch:QI 1 "=&Yk"))]
  "TARGET_AVX512F"
  ;; %X5 so that we don't emit any *WORD PTR for -masm=intel, as
  ;; gas changed what it requires incompatibly.
  "%M0v<sseintprefix>scatterq<ssemodesuffix>\t{%3, %5%{%1%}|%X5%{%1%}, %3}"
  [(set_attr "type" "ssemov")
   (set_attr "prefix" "evex")
   (set_attr "mode" "<sseinsnmode>")])
The mask register only affects the set source. With the same destination in
both sets, GCC thinks the first set is completely overwritten by the second
and optimizes the first vpscatterqd away.
We need to refine the pattern to let GCC know that the mask register also
affects the destination. Maybe just put the mask operand into UNSPEC_VSIBADDR?
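The reasoning in this comment can be sketched with a toy model. This is not GCC's dead store elimination code: struct toy_store, same_dest, toy_dse, and toy_two_scatters are all hypothetical names, and the model only captures the idea that a store looks dead when the next store has an identical destination expression. The flag mask_in_dest stands in for whether the mask is part of the destination (the proposed refinement) or only part of the source (the pattern quoted above).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for what the optimizer can see of one scatter store. */
struct toy_store {
    const void *base;  /* base address operand */
    int index_id;      /* identifies the index vector register */
    int mask_id;       /* identifies the mask register */
    bool dead;         /* set by toy_dse */
};

/* Compare destinations. When mask_in_dest is false the mask is ignored,
   mirroring a pattern where the mask appears only in the set source. */
static bool same_dest(const struct toy_store *a, const struct toy_store *b,
                      bool mask_in_dest)
{
    if (a->base != b->base || a->index_id != b->index_id)
        return false;
    return !mask_in_dest || a->mask_id == b->mask_id;
}

/* Mark a store dead when the immediately following store has the same
   destination. Returns the number of stores that stay live. */
static size_t toy_dse(struct toy_store *s, size_t n, bool mask_in_dest)
{
    size_t live = 0;
    for (size_t i = 0; i < n; i++) {
        s[i].dead = (i + 1 < n) && same_dest(&s[i], &s[i + 1], mask_in_dest);
        if (!s[i].dead)
            live++;
    }
    return live;
}

/* Two scatters to the same base and index, with different masks. */
static size_t toy_two_scatters(bool mask_in_dest)
{
    static int buffer[8];
    struct toy_store s[2] = {
        { buffer, 0, 1, false },  /* first scatter, mask k1 */
        { buffer, 0, 2, false },  /* second scatter, mask k2 */
    };
    return toy_dse(s, 2, mask_in_dest);
}
```

In this model, toy_two_scatters(false) keeps one store, matching the single vpscatterqd in the report, while toy_two_scatters(true) keeps both.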
* [Bug target/101472] AVX-512 wrong code for consecutive masked scatters
From: cvs-commit at gcc dot gnu.org @ 2021-08-27 5:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472
--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:44a545a6abdd330083c1d12ad70092defbba702a
commit r12-3181-g44a545a6abdd330083c1d12ad70092defbba702a
Author: konglin1 <lingling.kong@intel.com>
Date: Mon Aug 9 11:37:52 2021 +0800
i386: Fix wrong optimization for consecutive masked scatters [PR 101472]
gcc/ChangeLog:
PR target/101472
* config/i386/sse.md: (<avx512>scattersi<mode>): Add mask operand to
UNSPEC_VSIBADDR.
(<avx512>scattersi<mode>): Likewise.
(*avx512f_scattersi<VI48F:mode>): Merge mask operand to set_dest.
(*avx512f_scatterdi<VI48F:mode>): Likewise
gcc/testsuite/ChangeLog:
PR target/101472
* gcc.target/i386/avx512f-pr101472.c: New test.
* gcc.target/i386/avx512vl-pr101472.c: New test.
* [Bug target/101472] AVX-512 wrong code for consecutive masked scatters
From: cvs-commit at gcc dot gnu.org @ 2021-08-27 5:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by hongtao Liu
<liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:b186040b468f6da512b9b123e1d4549f44396993
commit r11-8934-gb186040b468f6da512b9b123e1d4549f44396993
Author: konglin1 <lingling.kong@intel.com>
Date: Mon Aug 9 11:37:52 2021 +0800
i386: Fix wrong optimization for consecutive masked scatters [PR 101472]
gcc/ChangeLog:
PR target/101472
* config/i386/sse.md: (<avx512>scattersi<mode>): Add mask operand to
UNSPEC_VSIBADDR.
(<avx512>scattersi<mode>): Likewise.
(*avx512f_scattersi<VI48F:mode>): Merge mask operand to set_dest.
(*avx512f_scatterdi<VI48F:mode>): Likewise
gcc/testsuite/ChangeLog:
PR target/101472
* gcc.target/i386/avx512f-pr101472.c: New test.
* gcc.target/i386/avx512vl-pr101472.c: New test.
* [Bug target/101472] AVX-512 wrong code for consecutive masked scatters
From: cvs-commit at gcc dot gnu.org @ 2021-08-27 5:22 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by hongtao Liu
<liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:fab014ecf9f7faf3b607a1e0892d0aeabe556661
commit r10-10073-gfab014ecf9f7faf3b607a1e0892d0aeabe556661
Author: konglin1 <lingling.kong@intel.com>
Date: Mon Aug 9 11:37:52 2021 +0800
i386: Fix wrong optimization for consecutive masked scatters [PR 101472]
gcc/ChangeLog:
PR target/101472
* config/i386/sse.md: (<avx512>scattersi<mode>): Add mask operand to
UNSPEC_VSIBADDR.
(<avx512>scattersi<mode>): Likewise.
(*avx512f_scattersi<VI48F:mode>): Merge mask operand to set_dest.
(*avx512f_scatterdi<VI48F:mode>): Likewise
gcc/testsuite/ChangeLog:
PR target/101472
* gcc.target/i386/avx512f-pr101472.c: New test.
* gcc.target/i386/avx512vl-pr101472.c: New test.
* [Bug target/101472] AVX-512 wrong code for consecutive masked scatters
From: crazylht at gmail dot com @ 2021-08-27 5:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472
--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC 12; backported to GCC 11 and GCC 10.
* [Bug target/101472] AVX-512 wrong code for consecutive masked scatters
From: rguenth at gcc dot gnu.org @ 2023-01-18 10:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |10.4
Resolution|--- |FIXED
Status|UNCONFIRMED |RESOLVED
Known to work| |10.4.0, 11.3.0, 12.1.0
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.