[Bug target/116229] New: wrong code at -Ofast aarch64 due to missing fneg to generate 0x8000000000000000

* [Bug target/116229] New: wrong code at -Ofast aarch64 due to missing fneg to generate 0x8000000000000000
@ 2024-08-04 18:09 zsojka at seznam dot cz
  2024-08-04 22:27 ` [Bug target/116229] [15 Regression] " pinskia at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: zsojka at seznam dot cz @ 2024-08-04 18:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116229

            Bug ID: 116229
           Summary: wrong code at -Ofast aarch64 due to missing fneg to
                    generate 0x8000000000000000
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zsojka at seznam dot cz
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: aarch64-unknown-linux-gnu

Created attachment 58828
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58828&action=edit
reduced testcase

Output:
$ aarch64-unknown-linux-gnu-gcc -Ofast testcase.c -static
$ qemu-aarch64 -- ./a.out 
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted

The diff between the -O3 and -Ofast output shows that at -Ofast the fneg is
dropped, so the 0x8000000000000000 constant (the bit pattern of -0.0) is never
generated:

$ diff -u a-testcase.s-O3 a-testcase.s-Ofast 
--- a-testcase.s-O3     2024-08-04 20:07:06.417271286 +0200
+++ a-testcase.s-Ofast  2024-08-04 20:07:11.877241325 +0200
@@ -24,7 +24,7 @@
 main:
 .LFB1:
        .cfi_startproc
-       movi    v0.4s, 0
+       movi    v0.2d, 0
        stp     x29, x30, [sp, -48]!
        .cfi_def_cfa_offset 48
        .cfi_offset 29, -48
@@ -35,7 +35,6 @@
        .cfi_offset 20, -24
        adrp    x20, .LC0
        add     x19, sp, 40
-       fneg    v0.2d, v0.2d
        add     x20, x20, :lo12:.LC0
        bl      foo
        str     d0, [sp, 40]
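
For readers unfamiliar with the trick, here is a minimal sketch of why `movi` plus `fneg` is expected to yield 0x8000000000000000: negating +0.0 sets only the sign bit. It assumes IEEE 754 binary64 doubles and a build without -ffast-math; the helper names `bits_of_double` and `check_negative_zero_bits` are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a double (assumes IEEE 754 binary64). */
static uint64_t bits_of_double(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);   /* well-defined type punning */
    return u;
}

/* Hypothetical self-check: +0.0 is all zero bits, -0.0 has only the
   top bit set, which is the integer constant the testcase expects.  */
static void check_negative_zero_bits(void)
{
    assert(bits_of_double(0.0) == UINT64_C(0));
    assert(bits_of_double(-0.0) == UINT64_C(0x8000000000000000));
}
```

Calling `check_negative_zero_bits()` from a test driver should pass when signed zeros are honored; with -fno-signed-zeros the compiler is allowed to ignore the sign of zero, which is the situation this report is about.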

$ aarch64-unknown-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-aarch64/bin/aarch64-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r15-2709-20240804001730-g7cd71c88637-checking-yes-rtl-df-extra-nobootstrap-aarch64/bin/../libexec/gcc/aarch64-unknown-linux-gnu/15.0.0/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--with-sysroot=/usr/aarch64-unknown-linux-gnu --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=aarch64-unknown-linux-gnu
--with-ld=/usr/bin/aarch64-unknown-linux-gnu-ld
--with-as=/usr/bin/aarch64-unknown-linux-gnu-as --enable-libsanitizer
--disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r15-2709-20240804001730-g7cd71c88637-checking-yes-rtl-df-extra-nobootstrap-aarch64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.0.0 20240804 (experimental) (GCC)

* [Bug target/116229] [15 Regression] wrong code at -Ofast aarch64 due to missing fneg to generate 0x8000000000000000
From: pinskia at gcc dot gnu.org @ 2024-08-04 22:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116229

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-08-04
     Ever confirmed|0                           |1
   Target Milestone|---                         |15.0
             Status|UNCONFIRMED                 |NEW
            Summary|wrong code at -Ofast        |[15 Regression] wrong code
                   |aarch64 due to missing fneg |at -Ofast aarch64 due to
                   |to generate                 |missing fneg to generate
                   |0x8000000000000000          |0x8000000000000000

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Reduced a little further:
```
typedef __attribute__((__vector_size__ (8))) unsigned long V;

V __attribute__((__noipa__))
foo (void)
{
  return (V){ 0x8000000000000000 };
}

V ref = (V){ 0x8000000000000000 };

int
main ()
{
  V v = foo ();
  if (v[0] != ref[0])
    __builtin_abort();
}
```


Late_combine2 does:
```
trying to combine definition of r32 in:
   13: v0:V4SI=const_vector
into:
   14: v0:V2DF=-v0:V2DF
successfully matched this instruction to *aarch64_simd_movv2df:
(set (reg:V2DF 32 v0)
    (const_vector:V2DF [
            (const_double:DF -0.0 [-0x0.0p+0]) repeated x2
        ]))
```

This would be correct if the value really were V2DF, but the issue is the split of:
```
(insn 10 5 11 2 (set (reg:DI 32 v0)
        (const_int -9223372036854775808 [0x8000000000000000])) "/app/example.cpp":7:1 -1
     (expr_list:REG_EQUAL (const_int -9223372036854775808 [0x8000000000000000])
        (nil)))
```

into:
```
(insn 13 5 14 2 (set (reg:V4SI 32 v0)
        (const_vector:V4SI [
                (const_int 0 [0]) repeated x4
            ])) "/app/example.cpp":7:1 -1
     (nil))
(insn 14 13 11 2 (set (reg:V2DF 32 v0)
        (neg:V2DF (reg:V2DF 32 v0))) "/app/example.cpp":7:1 -1
     (nil))
```

Via `Splitting with gen_split_10 (aarch64.md:1488)`.

* [Bug target/116229] [15 Regression] wrong code at -Ofast aarch64 due to missing fneg to generate 0x8000000000000000
From: pinskia at gcc dot gnu.org @ 2024-08-04 22:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116229

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
  /* For Advanced SIMD we can create an integer with only the top bit set
     using fneg (0.0f).  */

is wrong in aarch64_maybe_generate_simd_constant.

It should use either an unspec or an XOR instead of fneg here, I think,
especially for -ffast-math reasons.
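
To illustrate the two views of the same operation, here is a small sketch comparing an integer XOR with the sign mask against a floating-point negate on the same bits. It assumes IEEE 754 binary64 and default (non-fast-math) compilation; `SIGN_BIT`, `flip_sign_xor` and `flip_sign_fneg` are names invented for this example, not anything in GCC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIGN_BIT UINT64_C(0x8000000000000000)

/* Integer view: setting or clearing the top bit is an XOR with the mask.
   No floating-point semantics are involved, so no FP flag can change it.  */
static uint64_t flip_sign_xor(uint64_t x)
{
    return x ^ SIGN_BIT;
}

/* Floating-point view: reinterpret the bits as a double, negate, and
   reinterpret back.  This gives the same bits only while the compiler
   honors the sign of zero.  */
static uint64_t flip_sign_fneg(uint64_t x)
{
    double d;
    memcpy(&d, &x, sizeof d);
    d = -d;
    memcpy(&x, &d, sizeof x);
    return x;
}
```

For an all-zero input both helpers should return 0x8000000000000000 in a normal build. The difference is that a negate expressed as FP arithmetic is open to FP simplification, whereas the XOR (or an opaque unspec) is not.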

* [Bug target/116229] [15 Regression] wrong code at -Ofast aarch64 due to missing fneg to generate 0x8000000000000000
From: tnfchris at gcc dot gnu.org @ 2024-08-06  6:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116229

Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |tnfchris at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #3 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
>   /* For Advanced SIMD we can create an integer with only the top bit set
>      using fneg (0.0f).  */
> 
> is wrong in aarch64_maybe_generate_simd_constant.
> 
> it should use either an unspec here or an XOR instead of fneg here I think
> especially for -ffast-math reasons.

XOR would defeat the point of the optimization. The original expression is fine
but relied on nothing in the late pipeline being able to fold the zero constant
back in.

It was for this reason that we explicitly forced it to a separate register.
Late combine is just doing something not possible before. I'll fix it.

* [Bug target/116229] [15 Regression] wrong code at -Ofast aarch64 due to missing fneg to generate 0x8000000000000000
From: cvs-commit at gcc dot gnu.org @ 2024-08-08 17:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116229

--- Comment #4 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tamar Christina <tnfchris@gcc.gnu.org>:

https://gcc.gnu.org/g:2c24e0568392e51a77ebdaab629d631969ce8966

commit r15-2839-g2c24e0568392e51a77ebdaab629d631969ce8966
Author: Tamar Christina <tamar.christina@arm.com>
Date:   Thu Aug 8 18:51:30 2024 +0100

    AArch64: Fix signbit mask creation after late combine [PR116229]

    The optimization to generate a DI signbit constant by using fneg was relying
    on nothing being able to push the constant into the negate.  It's run quite
    late for this reason.

    However, late combine now runs after it and triggers RTL simplification based
    on the neg.  With -fno-signed-zeros this ends up dropping the - from the -0.0
    and thus producing incorrect code.

    This change adds a new unspec FNEG on DI mode which prevents this
    simplification.

    gcc/ChangeLog:

            PR target/116229
            * config/aarch64/aarch64-simd.md (aarch64_fnegv2di2<vczle><vczbe>):
            New.
            * config/aarch64/aarch64.cc (aarch64_maybe_generate_simd_constant):
            Update call to gen_aarch64_fnegv2di2.
            * config/aarch64/iterators.md: New UNSPEC_FNEG.

    gcc/testsuite/ChangeLog:

            PR target/116229
            * gcc.target/aarch64/pr116229.c: New test.
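
As a rough mental model of the fix described in the commit message, the toy simplifier below folds a plain negate of zero according to the signed-zeros setting but leaves an opaque operation alone. This is not GCC code: the `toy_code` enum and `toy_simplify` are hypothetical names, and real RTL simplification is far more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy expression kinds (hypothetical; these are not GCC's RTL codes). */
enum toy_code {
    TOY_CONST_ZERO,          /* constant +0.0, all bits clear        */
    TOY_CONST_NEG_ZERO,      /* constant -0.0, only the top bit set  */
    TOY_NEG_OF_ZERO,         /* (neg (const 0.0)): visible FP negate */
    TOY_UNSPEC_FNEG_OF_ZERO  /* opaque negate the folder cannot read */
};

/* Toy folder: a visible negate of zero is folded to a constant, and the
   sign is kept only when signed zeros are honored.  Anything opaque is
   returned unchanged, so the instruction survives to code generation.  */
static enum toy_code toy_simplify(enum toy_code e, bool honor_signed_zeros)
{
    if (e == TOY_NEG_OF_ZERO)
        return honor_signed_zeros ? TOY_CONST_NEG_ZERO : TOY_CONST_ZERO;
    return e;
}
```

In this model, `toy_simplify(TOY_NEG_OF_ZERO, false)` loses the sign bit, mirroring the wrong code at -Ofast, while `toy_simplify(TOY_UNSPEC_FNEG_OF_ZERO, false)` is left intact, which is the effect the new unspec is meant to have.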

* [Bug target/116229] [15 Regression] wrong code at -Ofast aarch64 due to missing fneg to generate 0x8000000000000000
From: tnfchris at gcc dot gnu.org @ 2024-08-08 17:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116229

Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #5 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
Fixed, thanks for the report!
