public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/109254] New: Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register
@ 2023-03-23  3:14 wumingchuan1992 at foxmail dot com
  2023-03-23  3:23 ` [Bug target/109254] " pinskia at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: wumingchuan1992 at foxmail dot com @ 2023-03-23  3:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109254

            Bug ID: 109254
           Summary: Bug in gcc (13.0.1) support for ARM SVE, which
                    randomly modifies the prediction register
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wumingchuan1992 at foxmail dot com
  Target Milestone: ---

func_demo.c
#include <stdio.h>
#include <arm_sve.h>

svfloat32_t func_demo(svfloat32_t x, svfloat32_t y, svbool_t pg)
{
svfloat32_t z = svadd_f32_x(pg, x, svdup_f32(0x1.800fep19f));
svbool_t cmp = svcmplt_f32(pg, z, svdup_f32(0.0f));
svfloat32_t zM1 = svsub_f32_x(pg, z, svdup_n_f32(1.0f));
z = svsel_f32(cmp, zM1, z);
svfloat32_t sum = svadd_f32_x(pg, z, y);
return sum;
}

Run gcc(10.3.0 and 13.0.1 20230314) with the following command:
$ gcc -std=c99 -O2 -funroll-loops -march=armv8.3-a+fp+sve -o func_demo.o -c
func_demo.c

After objdump func_demo.o, the result is as follows:
0000000000000000 <func_demo>:
0: 90000000 adrp x0, 0 <func_demo>
4: 2518e3e1 ptrue p1.b
8: 91000001 add x1, x0, #0x0
c: 8540c422 ld1rw {z2.s}, p1/z, [x1]
10: 65808002 fadd z2.s, p0/m, z2.s, z0.s
14: 65912040 fcmlt p0.s, p0/z, z2.s, #0.0
18: 0420bc43 movprfx z3, z2
1c: 65998023 fsub z3.s, p0/m, z3.s, #1.0
20: 05a2c060 sel z0.s, p0, z3.s, z2.s
24: 65808020 fadd z0.s, p0/m, z0.s, z1.s
28: d65f03c0 ret

In the compilation of the 14 lines, the fcmlt operation covers the value of the
p0 register, resulting in subsequent fsubs and fadd using the wrong prediction
register. This results in an error in the program result.

For comparison, use llvm for compilation:
clang -std=c99 -O2 -funroll-loops -march=armv8.3-a+fp+sve -o func_demo1.o -c
func_demo.c

the result is as follows:
0000000000000000 <func_demo>:
0: 5280fe08 mov w8, #0x7f0 // #2032
4: 72a92808 movk w8, #0x4940, lsl #16
8: 05a03902 mov z2.s, w8
c: 65808040 fadd z0.s, p0/m, z0.s, z2.s
10: 04603002 mov z2.d, z0.d
14: 65912001 fcmlt p1.s, p0/z, z0.s, #0.0
18: 65998022 fsub z2.s, p0/m, z2.s, #1.0
1c: 05a0c440 mov z0.s, p1/m, z2.s
20: 65808020 fadd z0.s, p0/m, z0.s, z1.s
24: d65f03c0 ret
Line 14 is correct.

Any suggestions to proceed?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/109254] Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register
  2023-03-23  3:14 [Bug target/109254] New: Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register wumingchuan1992 at foxmail dot com
@ 2023-03-23  3:23 ` pinskia at gcc dot gnu.org
  2023-03-23 14:43 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-23  3:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109254

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
-fno-rename-registers is the workaround

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/109254] Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register
  2023-03-23  3:14 [Bug target/109254] New: Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register wumingchuan1992 at foxmail dot com
  2023-03-23  3:23 ` [Bug target/109254] " pinskia at gcc dot gnu.org
@ 2023-03-23 14:43 ` jakub at gcc dot gnu.org
  2023-03-23 16:08 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-23 14:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109254

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Yeah, before rnreg pass the fcmlt has p1 as destination and fsub/fsel use p0
(i.e. the parameter) while sel in between them uses p1.
I see this behavior already in r10-5107-gb789efeae8c0620b8 and shortly before
that
it has been rejected with "variable-sized object may not be initialized"
errors.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/109254] Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register
  2023-03-23  3:14 [Bug target/109254] New: Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register wumingchuan1992 at foxmail dot com
  2023-03-23  3:23 ` [Bug target/109254] " pinskia at gcc dot gnu.org
  2023-03-23 14:43 ` jakub at gcc dot gnu.org
@ 2023-03-23 16:08 ` jakub at gcc dot gnu.org
  2023-03-23 17:13 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-23 16:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109254

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
2023-03-23  Jakub Jelinek  <jakub@redhat.com>

        PR target/109254
        * config/aarch64/aarch64.cc (aarch64_function_arg_regno_p): Also
        return true for p0-p3.

--- gcc/config/aarch64/aarch64.cc.jj    2023-03-13 00:11:52.328213351 +0100
+++ gcc/config/aarch64/aarch64.cc       2023-03-23 16:57:29.957866005 +0100
@@ -7959,7 +7959,8 @@ bool
 aarch64_function_arg_regno_p (unsigned regno)
 {
   return ((GP_REGNUM_P (regno) && regno < R0_REGNUM + NUM_ARG_REGS)
-         || (FP_REGNUM_P (regno) && regno < V0_REGNUM + NUM_FP_ARG_REGS));
+         || (FP_REGNUM_P (regno) && regno < V0_REGNUM + NUM_FP_ARG_REGS)
+         || (PR_REGNUM_P (regno) && regno < P0_REGNUM + NUM_PR_ARG_REGS));
 }

 /* Implement FUNCTION_ARG_BOUNDARY.  Every parameter gets at least

fixes this.  Or do we want to return true for p0-p3 only if SVE is enabled?
Not familiar with SVE enough to turn the testcase into gcc.target/aarch64/sve
runtime tests (bet we need __attribute__((noipa)) on the function, but unsure
how to initialize the arguments in the caller and how to verify the result is
correct in it after the call.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/109254] Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register
  2023-03-23  3:14 [Bug target/109254] New: Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register wumingchuan1992 at foxmail dot com
                   ` (2 preceding siblings ...)
  2023-03-23 16:08 ` jakub at gcc dot gnu.org
@ 2023-03-23 17:13 ` jakub at gcc dot gnu.org
  2023-03-24 12:24 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-23 17:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109254

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
And
--- gcc/config/aarch64/aarch64.cc.jj    2023-03-13 00:11:52.328213351 +0100
+++ gcc/config/aarch64/aarch64.cc       2023-03-23 16:57:29.957866005 +0100
@@ -7388,6 +7388,9 @@ aarch64_function_value_regno_p (const un
   if (regno >= V0_REGNUM && regno < V0_REGNUM + HA_MAX_NUM_FLDS)
     return TARGET_FLOAT;

+  if (regno == P0_REGNUM)
+    return TARGET_SVE;
+
   return false;
 }

Or can one actually return in more than p0?  Tried struct S { svbool_t a, b; };
but that is rejected...

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/109254] Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register
  2023-03-23  3:14 [Bug target/109254] New: Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register wumingchuan1992 at foxmail dot com
                   ` (3 preceding siblings ...)
  2023-03-23 17:13 ` jakub at gcc dot gnu.org
@ 2023-03-24 12:24 ` jakub at gcc dot gnu.org
  2023-03-27  7:53 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-03-24 12:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109254

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 54741
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54741&action=edit
gcc13-pr109254.patch

Those patches break bootstrap though (in libobjc) and regress some
__builtin_apply*/__builtin_return* testcases.
Trying this now instead.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/109254] Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register
  2023-03-23  3:14 [Bug target/109254] New: Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register wumingchuan1992 at foxmail dot com
                   ` (4 preceding siblings ...)
  2023-03-24 12:24 ` jakub at gcc dot gnu.org
@ 2023-03-27  7:53 ` pinskia at gcc dot gnu.org
  2023-04-01  6:58 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-03-27  7:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109254

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-03-27
             Status|UNCONFIRMED                 |NEW

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/109254] Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register
  2023-03-23  3:14 [Bug target/109254] New: Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register wumingchuan1992 at foxmail dot com
                   ` (5 preceding siblings ...)
  2023-03-27  7:53 ` pinskia at gcc dot gnu.org
@ 2023-04-01  6:58 ` cvs-commit at gcc dot gnu.org
  2023-04-03 13:18 ` jakub at gcc dot gnu.org
  2024-02-28  6:48 ` pinskia at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-04-01  6:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109254

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:b1f6cb2cc3aad0521ad3181d5107e52be155fd18

commit r13-6965-gb1f6cb2cc3aad0521ad3181d5107e52be155fd18
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Sat Apr 1 08:55:55 2023 +0200

    aarch64, builtins: Include PR registers in FUNCTION_ARG_REGNO_P etc.
[PR109254]

    The following testcase is miscompiled on aarch64-linux in the regname pass,
    because while the function takes arguments in the p0 register,
    FUNCTION_ARG_REGNO_P doesn't reflect that, so DF doesn't know the register
is
    used in register passing. It sees 2 chains with p1 register and wants to
    replace the second one and as DF doesn't know p0 is live at the start of
the
    function, it will happily use p0 register even when it is used in
subsequent
    instructions.

    The following patch fixes that.  FUNCTION_ARG_REGNO_P returns non-zero
    for p0-p3 (unconditionally, seems for the floating/vector registers it
    doesn't conditionalize them on TARGET_FLOAT either, but if you want,
    I can conditionalize p0-p3 on TARGET_SVE), similarly
    targetm.calls.function_value_regno_p returns true for p0-p3 registers
    if TARGET_SVE (again for consistency, that function conditionalizes
    the float/vector on TARGET_FLOAT).

    Now, that change broke bootstrap in libobjc and some
    __builtin_apply_args/__builtin_apply/__builtin_return tests.  The
    aarch64_get_reg_raw_mode hook already documents that SVE scalable
arg/return
    passing is fundamentally incompatible with those builtins, but unlike
    the floating/vector regs where it forces a fixed vector mode, I think
    there is no fixed mode which could be used for p0-p3.  So, I have tweaked
    the generic code so that it uses VOIDmode return from that hook to signal
    that a register shouldn't be touched by
    __builtin_apply_args/__builtin_apply/__builtin_return
    despite being mentioned in FUNCTION_ARG_REGNO_P or
    targetm.calls.function_value_regno_p.

    gcc/
    2023-04-01  Jakub Jelinek  <jakub@redhat.com>

            PR target/109254
            * builtins.cc (apply_args_size): If targetm.calls.get_raw_arg_mode
            returns VOIDmode, handle it like if the register isn't used for
            passing arguments at all.
            (apply_result_size): If targetm.calls.get_raw_result_mode returns
            VOIDmode, handle it like if the register isn't used for returning
            results at all.
            * target.def (get_raw_result_mode, get_raw_arg_mode): Document what
it
            means to return VOIDmode.
            * doc/tm.texi: Regenerated.
            * config/aarch64/aarch64.cc (aarch64_function_value_regno_p):
Return
            TARGET_SVE for P0_REGNUM.
            (aarch64_function_arg_regno_p): Also return true for p0-p3.
            (aarch64_get_reg_raw_mode): Return VOIDmode for PR_REGNUM_P regs.

    gcc/testsuite/
    2023-04-01  Jakub Jelinek  <jakub@redhat.com>
                Richard Sandiford  <richard.sandiford@arm.com>

            PR target/109254
            * gcc.target/aarch64/sve/pr109254.c: New test.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/109254] Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register
  2023-03-23  3:14 [Bug target/109254] New: Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register wumingchuan1992 at foxmail dot com
                   ` (6 preceding siblings ...)
  2023-04-01  6:58 ` cvs-commit at gcc dot gnu.org
@ 2023-04-03 13:18 ` jakub at gcc dot gnu.org
  2024-02-28  6:48 ` pinskia at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2023-04-03 13:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109254

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Should be fixed on the trunk so far.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/109254] Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register
  2023-03-23  3:14 [Bug target/109254] New: Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register wumingchuan1992 at foxmail dot com
                   ` (7 preceding siblings ...)
  2023-04-03 13:18 ` jakub at gcc dot gnu.org
@ 2024-02-28  6:48 ` pinskia at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-02-28  6:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109254

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |13.0

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Fixed.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-02-28  6:48 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-23  3:14 [Bug target/109254] New: Bug in gcc (13.0.1) support for ARM SVE, which randomly modifies the prediction register wumingchuan1992 at foxmail dot com
2023-03-23  3:23 ` [Bug target/109254] " pinskia at gcc dot gnu.org
2023-03-23 14:43 ` jakub at gcc dot gnu.org
2023-03-23 16:08 ` jakub at gcc dot gnu.org
2023-03-23 17:13 ` jakub at gcc dot gnu.org
2023-03-24 12:24 ` jakub at gcc dot gnu.org
2023-03-27  7:53 ` pinskia at gcc dot gnu.org
2023-04-01  6:58 ` cvs-commit at gcc dot gnu.org
2023-04-03 13:18 ` jakub at gcc dot gnu.org
2024-02-28  6:48 ` pinskia at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).