* [Bug target/114098] _tile_loadconfig doesn't work
From: hjl.tools at gmail dot com @ 2024-02-25 15:11 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098
--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
The problem is that in
  extern __inline void
  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
  _tile_loadconfig (const void *__config)
  {
    __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
  }
the memory operand has type const void *, so GCC treats only its first 8 bytes as used.
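The 8-byte figure comes straight from the type of the operand expression: dereferencing a const void ** yields a const void *, which is one pointer wide. A minimal check of that, assuming a 64-bit (LP64) x86-64 target; old_operand_size is a hypothetical helper for illustration, not part of GCC's headers:

```c
#include <stddef.h>

/* Size of the lvalue that the old intrinsic passed as the asm "m" operand.
   *(const void **) config has type `const void *`, i.e. one pointer.
   sizeof does not evaluate its operand, so nothing is dereferenced here.  */
static size_t old_operand_size (const void *config)
{
  return sizeof (*(const void **) config);
}
```

On x86-64 this returns 8, which is why the compiler only accounts for the first 8 bytes of the configuration block.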
From: hjl.tools at gmail dot com @ 2024-02-25 15:57 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098
H.J. Lu <hjl.tools at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-02-25
--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
We should tell GCC that 64 bytes will be accessed by ldtilecfg and sttilecfg.
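One way to express this in plain inline asm is to give the "m" operand a 64-byte type, so that the operand's size matches what the instruction reads. The sketch below is an illustration only, not the fix that was committed: struct tile_config64, wide_operand_size and touch_config are hypothetical names, and the asm template is deliberately left empty so the example builds and runs on machines without AMX. A real version would put ldtilecfg in the template.

```c
#include <stddef.h>

/* Hypothetical 64-byte view of the tile configuration block.  */
struct tile_config64
{
  unsigned char bytes[64];
};

/* Size of the memory operand when it is typed as the 64-byte struct.  */
static size_t wide_operand_size (const void *config)
{
  return sizeof (*(const struct tile_config64 *) config);
}

/* Empty asm template: no instruction is emitted, but the "m" input
   tells the compiler that all 64 bytes at CONFIG are read here.  */
static void touch_config (const void *config)
{
  __asm__ volatile ("" :: "m" (*(const struct tile_config64 *) config));
}
```

With the operand typed this way, stores to any of the 64 bytes before the asm statement count as feeding it, rather than only the first pointer-sized chunk.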
From: cvs-commit at gcc dot gnu.org @ 2024-02-26 4:26 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098
--- Comment #3 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:
https://gcc.gnu.org/g:4972f97a265c574d51e20373ddefd66576051e5c
commit r14-9171-g4972f97a265c574d51e20373ddefd66576051e5c
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Feb 25 10:21:04 2024 -0800
x86: Properly implement AMX-TILE load/store intrinsics
ldtilecfg and sttilecfg take a 64-byte (512-bit) memory block. With
_tile_loadconfig implemented as
  extern __inline void
  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
  _tile_loadconfig (const void *__config)
  {
    __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
  }
GCC sees:
  (parallel [
    (asm_operands/v ("ldtilecfg %X0") ("") 0
     [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
        (const_int -64 [0xffffffffffffffc0])) [1
        MEM[(const void * *)&tile_data]+0 S8 A128])]
     [(asm_input:DI ("m"))]
    (clobber (reg:CC 17 flags))])
and the memory operand size is 8 bytes (DImode, S8). As a result, the
remaining 56 bytes are ignored by GCC. Implement ldtilecfg and sttilecfg
intrinsics with a pointer to XImode to honor the 64-byte memory block.
gcc/ChangeLog:
PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.cc (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.
gcc/testsuite/ChangeLog:
PR target/114098
* gcc.target/i386/amxtile-4.c: New test.
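For context, a caller typically fills a 64-byte configuration block and passes its address to _tile_loadconfig; if the compiler believes only part of the block is read, stores to the rest may be dropped or moved past the load. A sketch of such a block follows, assuming the palette 1 layout from Intel's documentation (palette id, start row, then per-tile bytes-per-row and row counts). struct tile_config and tile_config_init are hypothetical names, this is not the contents of the new test, and the snippet executes no AMX instruction:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct mirroring the 64-byte tile configuration block
   (assumed palette 1 layout, 8 tiles).  */
struct tile_config
{
  uint8_t  palette_id;      /* byte 0 */
  uint8_t  start_row;       /* byte 1 */
  uint8_t  reserved_0[14];  /* bytes 2-15 */
  uint16_t colsb[8];        /* bytes 16-31: bytes per row, tiles 0-7 */
  uint16_t reserved_1[8];   /* bytes 32-47 */
  uint8_t  rows[8];         /* bytes 48-55: row count, tiles 0-7 */
  uint8_t  reserved_2[8];   /* bytes 56-63 */
};

_Static_assert (sizeof (struct tile_config) == 64,
                "tile configuration block must be 64 bytes");

/* Zero the block, select palette 1, and give the first NTILES tiles
   the same shape.  Tiles beyond index 7 are ignored.  */
static void tile_config_init (struct tile_config *cfg, int ntiles,
                              uint8_t rows, uint16_t colsb)
{
  memset (cfg, 0, sizeof *cfg);
  cfg->palette_id = 1;
  for (int i = 0; i < ntiles && i < 8; i++)
    {
      cfg->rows[i] = rows;
      cfg->colsb[i] = colsb;
    }
}
```

On a build with AMX enabled (for example with -mamx-tile), the filled block would then be passed as _tile_loadconfig (&cfg). Note that most of the fields written above sit beyond the first 8 bytes, which is the part the old operand type did not cover.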
From: cvs-commit at gcc dot gnu.org @ 2024-02-27 3:47 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098
--- Comment #4 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-13 branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:
https://gcc.gnu.org/g:2b3ecdf4fb13471b69d80583e10c5baedfe84d7c
commit r13-8365-g2b3ecdf4fb13471b69d80583e10c5baedfe84d7c
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Feb 25 10:21:04 2024 -0800
x86: Properly implement AMX-TILE load/store intrinsics
ldtilecfg and sttilecfg take a 64-byte (512-bit) memory block. With
_tile_loadconfig implemented as
  extern __inline void
  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
  _tile_loadconfig (const void *__config)
  {
    __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
  }
GCC sees:
  (parallel [
    (asm_operands/v ("ldtilecfg %X0") ("") 0
     [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
        (const_int -64 [0xffffffffffffffc0])) [1
        MEM[(const void * *)&tile_data]+0 S8 A128])]
     [(asm_input:DI ("m"))]
    (clobber (reg:CC 17 flags))])
and the memory operand size is 8 bytes (DImode, S8). As a result, the
remaining 56 bytes are ignored by GCC. Implement ldtilecfg and sttilecfg
intrinsics with a pointer to XImode to honor the 64-byte memory block.
gcc/ChangeLog:
PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.cc (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.
gcc/testsuite/ChangeLog:
PR target/114098
* gcc.target/i386/amxtile-4.c: New test.
(cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)
From: cvs-commit at gcc dot gnu.org @ 2024-02-27 3:49 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098
--- Comment #5 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:
https://gcc.gnu.org/g:23f4aa6c68e24a76d3784bcfdad5a53e46cd8f95
commit r12-10180-g23f4aa6c68e24a76d3784bcfdad5a53e46cd8f95
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Feb 25 10:21:04 2024 -0800
x86: Properly implement AMX-TILE load/store intrinsics
ldtilecfg and sttilecfg take a 64-byte (512-bit) memory block. With
_tile_loadconfig implemented as
  extern __inline void
  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
  _tile_loadconfig (const void *__config)
  {
    __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
  }
GCC sees:
  (parallel [
    (asm_operands/v ("ldtilecfg %X0") ("") 0
     [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
        (const_int -64 [0xffffffffffffffc0])) [1
        MEM[(const void * *)&tile_data]+0 S8 A128])]
     [(asm_input:DI ("m"))]
    (clobber (reg:CC 17 flags))])
and the memory operand size is 8 bytes (DImode, S8). As a result, the
remaining 56 bytes are ignored by GCC. Implement ldtilecfg and sttilecfg
intrinsics with a pointer to XImode to honor the 64-byte memory block.
gcc/ChangeLog:
PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.cc (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.
gcc/testsuite/ChangeLog:
PR target/114098
* gcc.target/i386/amxtile-4.c: New test.
(cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)
From: cvs-commit at gcc dot gnu.org @ 2024-02-27 10:33 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098
--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:
https://gcc.gnu.org/g:26b1012c26c4b4de0b4561e74b856a7f7d259a48
commit r11-11258-g26b1012c26c4b4de0b4561e74b856a7f7d259a48
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Sun Feb 25 10:21:04 2024 -0800
x86: Properly implement AMX-TILE load/store intrinsics
ldtilecfg and sttilecfg take a 64-byte (512-bit) memory block. With
_tile_loadconfig implemented as
  extern __inline void
  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
  _tile_loadconfig (const void *__config)
  {
    __asm__ volatile ("ldtilecfg\t%X0" :: "m" (*((const void **)__config)));
  }
GCC sees:
  (parallel [
    (asm_operands/v ("ldtilecfg %X0") ("") 0
     [(mem/f/c:DI (plus:DI (reg/f:DI 77 virtual-stack-vars)
        (const_int -64 [0xffffffffffffffc0])) [1
        MEM[(const void * *)&tile_data]+0 S8 A128])]
     [(asm_input:DI ("m"))]
    (clobber (reg:CC 17 flags))])
and the memory operand size is 8 bytes (DImode, S8). As a result, the
remaining 56 bytes are ignored by GCC. Implement ldtilecfg and sttilecfg
intrinsics with a pointer to XImode to honor the 64-byte memory block.
gcc/ChangeLog:
PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.c (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.
gcc/testsuite/ChangeLog:
PR target/114098
* gcc.target/i386/amxtile-4.c: New test.
(cherry picked from commit 4972f97a265c574d51e20373ddefd66576051e5c)
From: hjl.tools at gmail dot com @ 2024-02-27 10:37 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114098
H.J. Lu <hjl.tools at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.5
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED
--- Comment #7 from H.J. Lu <hjl.tools at gmail dot com> ---
Fixed for 11.5, 12.4, 13.3 and 14.