public inbox for gcc-bugs@sourceware.org
* [Bug c/114521] New: aarch64: wrong code with Neon ld1/st1x4 intrinsics gcc-11 and earlier
@ 2024-03-28 15:52 jswinney at amazon dot com
2024-03-28 15:56 ` [Bug target/114521] " pinskia at gcc dot gnu.org
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: jswinney at amazon dot com @ 2024-03-28 15:52 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521
Bug ID: 114521
Summary: aarch64: wrong code with Neon ld1/st1x4 intrinsics gcc-11 and earlier
Product: gcc
Version: 11.4.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: jswinney at amazon dot com
Target Milestone: ---
Created attachment 57831
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57831&action=edit
patch to fix the broken test
Using a half-width (64-bit), 4-register AArch64 Neon load intrinsic such as
vld1_u8_x4 results in an incorrect stack spill, or at least in incorrect
offsets into the resulting stack spill. This happens at every optimization
level, -O0 through -O3.
```
#include <arm_neon.h>
#include <inttypes.h>
#include <stdio.h>

uint8x8_t global[4] = {0};

/* Load four 64-bit vectors with a single x4 intrinsic and copy them out. */
void test(const uint8_t* arr)
{
    const uint8x8x4_t parr = vld1_u8_x4(arr);
    global[0] = parr.val[0];
    global[1] = parr.val[1];
    global[2] = parr.val[2];
    global[3] = parr.val[3];
}

int main()
{
    const uint8_t arr[32] = {
        0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B,
        0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B,
        0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B,
        0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B,
    };

    /* Before the call: all four vectors are zero. */
    for (int i = 0; i < 4; i++) {
        /* PRIx64 matches uint64_t; "%llx" does not on LP64 targets. */
        printf("%" PRIx64 " ", (uint64_t) global[i]);
    }
    printf("\n");

    test(arr);

    /* After the call: all four vectors should hold the same pattern. */
    for (int i = 0; i < 4; i++) {
        printf("%" PRIx64 " ", (uint64_t) global[i]);
    }
    printf("\n");
    return 0;
}
```
For the "test" function above, the compiler emits the correct half-width
load instruction, followed by a full-width store:
```
test(unsigned char const*):
ld1 {v0.8b - v3.8b}, [x0]
sub sp, sp, #64
...
st1 {v0.16b - v3.16b}, [sp]
```
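To make the width mismatch concrete, here is a small portable C model of the two instructions. It is only a sketch, not compiler output: the names (vreg_t, model_ld1_8b_x4, model_st1_x4, read_val, count_matching_lanes) are hypothetical, and it assumes that the spilled data is read back as a uint8x8x4_t, that is, val[i] at byte offset 8*i from the start of the spill. It relies on the architectural fact that a write to the 64-bit form of a vector register zeroes the upper 64 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { NREGS = 4, D_BYTES = 8, Q_BYTES = 16 };

/* Model of one 128-bit vector register as 16 raw bytes. */
typedef struct { uint8_t b[Q_BYTES]; } vreg_t;

/* Model of "ld1 {v0.8b - v3.8b}, [src]": each register gets 8 bytes in its
   low half; the high half is zeroed by the 64-bit register write. */
void model_ld1_8b_x4(vreg_t v[NREGS], const uint8_t *src)
{
    for (int r = 0; r < NREGS; r++) {
        memset(v[r].b, 0, Q_BYTES);
        memcpy(v[r].b, src + r * D_BYTES, D_BYTES);
    }
}

/* Model of a 4-register st1 that writes `width` bytes per register
   (8 for the .8b form, 16 for the .16b form), back to back. */
void model_st1_x4(uint8_t *dst, const vreg_t v[NREGS], int width)
{
    assert(width == D_BYTES || width == Q_BYTES);
    for (int r = 0; r < NREGS; r++)
        memcpy(dst + r * width, v[r].b, (size_t)width);
}

/* Assumption: val[i] is read at byte offset 8*i, as in uint8x8x4_t. */
uint64_t read_val(const uint8_t *spill, int i)
{
    uint64_t x;
    assert(i >= 0 && i < NREGS);
    memcpy(&x, spill + i * D_BYTES, sizeof x);
    return x;
}

/* Load 32 distinct bytes, spill with the given store width, and count how
   many of the four val[i] reads return the bytes that were loaded. */
int count_matching_lanes(int store_width)
{
    uint8_t src[NREGS * D_BYTES];
    uint8_t spill[NREGS * Q_BYTES] = {0};
    vreg_t v[NREGS];
    int ok = 0;

    for (int i = 0; i < (int)sizeof src; i++)
        src[i] = (uint8_t)i;

    model_ld1_8b_x4(v, src);
    model_st1_x4(spill, v, store_width);

    for (int i = 0; i < NREGS; i++) {
        uint64_t expect;
        memcpy(&expect, src + i * D_BYTES, sizeof expect);
        if (read_val(spill, i) == expect)
            ok++;
    }
    return ok;
}
```

Under these assumptions, count_matching_lanes(8) finds all four lanes intact, while count_matching_lanes(16) finds only val[0] intact: val[1] and val[3] land on the zeroed upper halves of v0 and v1, and val[2] picks up the data that belongs to val[1]. The model uses distinct bytes on purpose; with the repeating 0x0A/0x0B pattern from the reproducer, val[2] would look correct by coincidence.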
This issue is fixed in gcc-12 by commit
66f206b85395c273980e2b81a54dbddc4897e4a7.
Additionally, the test used to verify this code silently ignores the error. I
have attached a patch which fixes the test.
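As an illustration of a check that cannot silently pass, here is a minimal self-checking round trip. It is a sketch only and is not the attached patch or the existing testsuite file: the function names (load_store_x4, x4_roundtrip_ok) are hypothetical, the noinline attribute is a GCC/Clang extension used to keep the load in its own function as in the reproducer, and on non-AArch64 targets the check reports -1 (not applicable) instead of running.

```c
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
#include <arm_neon.h>

/* Load 32 bytes with one x4 intrinsic and write each vector back out. */
__attribute__((noinline))
void load_store_x4(const uint8_t *src, uint8_t *dst)
{
    const uint8x8x4_t v = vld1_u8_x4(src);
    vst1_u8(dst + 0,  v.val[0]);
    vst1_u8(dst + 8,  v.val[1]);
    vst1_u8(dst + 16, v.val[2]);
    vst1_u8(dst + 24, v.val[3]);
}

/* Returns 1 if all 32 bytes survive the round trip, 0 if any differ. */
int x4_roundtrip_ok(void)
{
    uint8_t in[32], out[32];

    for (int i = 0; i < 32; i++)
        in[i] = (uint8_t)(i + 1);   /* distinct, nonzero bytes */
    memset(out, 0, sizeof out);

    load_store_x4(in, out);
    return memcmp(in, out, sizeof in) == 0;
}
#else
/* Not an AArch64 build: the intrinsic is unavailable, so report -1. */
int x4_roundtrip_ok(void)
{
    return -1;
}
#endif
```

A test driver would call x4_roundtrip_ok() and abort when it returns 0, so that a mismatch becomes a hard failure rather than output nobody compares.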
```
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/aarch64-amazon-linux/11/lto-wrapper
Target: aarch64-amazon-linux
Configured with: ../configure --enable-bootstrap
--enable-languages=c,c++,fortran,lto --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info
--with-bugurl=https://github.com/amazonlinux/amazon-linux-2022 --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-linker-build-id
--with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin
--enable-initfini-array
--with-isl=/builddir/build/BUILD/gcc-11.4.1-20230605/obj-aarch64-amazon-linux/isl-install
--enable-gnu-indirect-function --with-tune=neoverse-n1
--with-arch=armv8.2-a+crypto --build=aarch64-amazon-linux
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.1 20230605 (Red Hat 11.4.1-2) (GCC)
```
* [Bug target/114521] aarch64: wrong code with Neon ld1/st1x4 intrinsics gcc-11 and earlier
From: pinskia at gcc dot gnu.org @ 2024-03-28 15:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
g:66f206b85395c273980e2b81a54dbddc4897e4a7
* [Bug target/114521] [11 only] aarch64: wrong code with Neon ld1/st1x4 intrinsics gcc-11 and earlier
From: pinskia at gcc dot gnu.org @ 2024-03-28 15:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.5
            Summary|aarch64: wrong code with    |[11 only] aarch64: wrong
                   |Neon ld1/st1x4 intrinsics   |code with Neon ld1/st1x4
                   |gcc-11 and earlier          |intrinsics gcc-11 and
                   |                            |earlier
* [Bug target/114521] [11 only] aarch64: wrong code with Neon ld1/st1x4 intrinsics gcc-11 and earlier
From: rsandifo at gcc dot gnu.org @ 2024-03-28 17:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521
Richard Sandiford <rsandifo at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org
--- Comment #2 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
Oops. I was going to upload a patch for the bug here, but it looks like I
accidentally committed it while backporting PR97696 to GCC 11. The patch was
g:daee0409d195d346562e423da783d5d1cf8ea175.
I'm not sure what to do now. Perhaps we should leave it in?
* [Bug target/114521] [11 only] aarch64: wrong code with Neon ld1/st1x4 intrinsics gcc-11 and earlier
From: pinskia at gcc dot gnu.org @ 2024-03-28 17:45 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |11.4.1, 12.1.0
             Status|UNCONFIRMED                 |RESOLVED
      Known to fail|                            |10.1.0, 11.1.0, 11.4.0
         Resolution|---                         |FIXED
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed for GCC 11.5.0 then ...