public inbox for gcc-bugs@sourceware.org
* [Bug c/114521] New: aarch64: wrong code with Neon ld1/st1x4 intrinsics gcc-11 and earlier
@ 2024-03-28 15:52 jswinney at amazon dot com
  2024-03-28 15:56 ` [Bug target/114521] " pinskia at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: jswinney at amazon dot com @ 2024-03-28 15:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521

            Bug ID: 114521
           Summary: aarch64: wrong code with Neon ld1/st1x4 intrinsics
                    gcc-11 and earlier
           Product: gcc
           Version: 11.4.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jswinney at amazon dot com
  Target Milestone: ---

Created attachment 57831
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57831&action=edit
patch to fix the broken test

Using a half-width (64-bit), four-register AArch64 Neon load intrinsic such as
vld1_u8_x4 results in an incorrect stack spill, or at least in incorrect offsets
into the resulting spill. This happens at every optimization level, -O0 through
-O3.


```
#include <arm_neon.h>
#include <inttypes.h>
#include <stdio.h>

uint8x8_t global[4] = {0};

void test(const uint8_t* arr)
{
  const uint8x8x4_t parr = vld1_u8_x4(arr);

  global[0] = parr.val[0];
  global[1] = parr.val[1];
  global[2] = parr.val[2];
  global[3] = parr.val[3];
}

int main()
{
  const uint8_t arr[32] = {
    0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B,
    0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B,
    0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B,
    0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B, 0x0A, 0x0B,
  };

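  /* Print the globals before the load; PRIx64 matches uint64_t portably. */
  for (int i = 0; i < 4; i++) {
    printf("%" PRIx64 " ", (uint64_t) global[i]);
  }
  printf("\n");

  test(arr);

  /* Print again: every element should now read 0b0a0b0a0b0a0b0a. */
  for (int i = 0; i < 4; i++) {
    printf("%" PRIx64 " ", (uint64_t) global[i]);
  }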
  printf("\n");

  return 0;
}
```

For the "test" function above, the compiler emits the correct half-width load
instruction, but then spills the registers with a full-width store:
```
test(unsigned char const*):
        ld1     {v0.8b - v3.8b}, [x0]
        sub     sp, sp, #64
...
        st1     {v0.16b - v3.16b}, [sp]
```
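
To illustrate why mixing the two widths can matter, here is a minimal, portable C sketch that models the registers as byte arrays. It is only one plausible reading of the report, not a confirmed description of what GCC does internally: it assumes the spilled data is later read back with the 8-byte stride that a uint8x8x4_t layout implies. The names `vreg`, `load_half_x4`, `store_full_x4`, `read_val` and `count_mismatches` are hypothetical helpers invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 128-bit vector register as 16 bytes. */
typedef struct { uint8_t bytes[16]; } vreg;

/* Models ld1 {v0.8b - v3.8b}: 8 bytes go into each register and the
   upper 8 bytes of each register are zeroed. */
static void load_half_x4(vreg regs[4], const uint8_t *src)
{
  for (int i = 0; i < 4; i++) {
    memset(regs[i].bytes, 0, sizeof regs[i].bytes);
    memcpy(regs[i].bytes, src + 8 * i, 8);
  }
}

/* Models st1 {v0.16b - v3.16b}: 16 bytes per register, 64 bytes total. */
static void store_full_x4(uint8_t *dst, const vreg regs[4])
{
  for (int i = 0; i < 4; i++)
    memcpy(dst + 16 * i, regs[i].bytes, 16);
}

/* Assumption: element i is read back at an 8-byte stride, as a packed
   array of four 64-bit vectors would be laid out. */
static uint64_t read_val(const uint8_t *spill, int i)
{
  uint64_t v;
  assert(i >= 0 && i < 4);
  memcpy(&v, spill + 8 * i, sizeof v);
  return v;
}

/* Returns how many of the four elements read back wrong in this model. */
int count_mismatches(void)
{
  uint8_t src[32];
  vreg regs[4];
  uint8_t spill[64];
  int mismatches = 0;

  for (int i = 0; i < 32; i++)
    src[i] = (uint8_t) (i + 1);   /* distinct, non-zero input bytes */

  load_half_x4(regs, src);
  store_full_x4(spill, regs);

  for (int i = 0; i < 4; i++) {
    uint64_t expected;
    memcpy(&expected, src + 8 * i, sizeof expected);
    if (read_val(spill, i) != expected)
      mismatches++;
  }
  return mismatches;
}
```

In this model only element 0 survives the round trip: elements 1 and 3 land on the zeroed upper halves of v0 and v1, and element 2 picks up the data that belonged to element 1.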

This issue is fixed in gcc-12 by commit
66f206b85395c273980e2b81a54dbddc4897e4a7.

Additionally, the test used to verify this code silently ignores the error. I
have attached a patch which fixes the test.


```
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/aarch64-amazon-linux/11/lto-wrapper
Target: aarch64-amazon-linux
Configured with: ../configure --enable-bootstrap
--enable-languages=c,c++,fortran,lto --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info
--with-bugurl=https://github.com/amazonlinux/amazon-linux-2022 --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-linker-build-id
--with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin
--enable-initfini-array
--with-isl=/builddir/build/BUILD/gcc-11.4.1-20230605/obj-aarch64-amazon-linux/isl-install
--enable-gnu-indirect-function --with-tune=neoverse-n1
--with-arch=armv8.2-a+crypto --build=aarch64-amazon-linux
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.1 20230605 (Red Hat 11.4.1-2) (GCC)
```


* [Bug target/114521] aarch64: wrong code with Neon ld1/st1x4 intrinsics gcc-11 and earlier
From: pinskia at gcc dot gnu.org @ 2024-03-28 15:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
g:66f206b85395c273980e2b81a54dbddc4897e4a7


* [Bug target/114521] [11 only] aarch64: wrong code with Neon ld1/st1x4 intrinsics gcc-11 and earlier
From: pinskia at gcc dot gnu.org @ 2024-03-28 15:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.5
            Summary|aarch64: wrong code with    |[11 only] aarch64: wrong
                   |Neon ld1/st1x4 intrinsics   |code with Neon ld1/st1x4
                   |gcc-11 and earlier          |intrinsics gcc-11 and
                   |                            |earlier


* [Bug target/114521] [11 only] aarch64: wrong code with Neon ld1/st1x4 intrinsics gcc-11 and earlier
From: rsandifo at gcc dot gnu.org @ 2024-03-28 17:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521

Richard Sandiford <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #2 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
Oops.  I was going to upload a patch for the bug here, but it looks like I
accidentally committed it while backporting PR97696 to GCC 11.  The patch was
g:daee0409d195d346562e423da783d5d1cf8ea175.

I'm not sure what to do now.  Perhaps we should leave it in?


* [Bug target/114521] [11 only] aarch64: wrong code with Neon ld1/st1x4 intrinsics gcc-11 and earlier
From: pinskia at gcc dot gnu.org @ 2024-03-28 17:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |11.4.1, 12.1.0
             Status|UNCONFIRMED                 |RESOLVED
      Known to fail|                            |10.1.0, 11.1.0, 11.4.0
         Resolution|---                         |FIXED

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed for GCC 11.5.0 then ...
