* [Bug rtl-optimization/68106] New: c-c++-common/torture/builtin-arith-overflow-11.c FAILs with -flra-remat @ aarch64
@ 2015-10-26 22:29 zsojka at seznam dot cz
  2015-10-30 17:37 ` [Bug rtl-optimization/68106] " vmakarov at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: zsojka at seznam dot cz @ 2015-10-26 22:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68106

            Bug ID: 68106
           Summary: c-c++-common/torture/builtin-arith-overflow-11.c FAILs
                    with -flra-remat @ aarch64
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zsojka at seznam dot cz
  Target Milestone: ---

Created attachment 36594
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36594&action=edit
reduced testcase

The testcase fails on aarch64 on both trunk and the 5 branch with -O -flra-remat.
I haven't managed to generate wrong code with -O2, which enables -flra-remat. I
am using qemu userspace emulation to run the testcase.

$ gcc -O -flra-remat testcase.c
$ ./a.out 
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted

$ gcc -O -flra-remat testcase.c -S
$ gcc -O -fno-lra-remat testcase.c -S -o testcase-no-lra-remat.s
$ diff -u testcase.s testcase-no-lra-remat.s
...
@@ -104,20 +104,21 @@
 .L25:
        lsl     x20, x23, 56
        add     x19, x20, x26
+       str     x19, [x29, 128]
        asr     x22, x19, 63
        adds    x27, x21, x19
        adc     x0, x24, x22
-       str     x0, [x29, 120]
-       add     x2, x29, 140
+       str     x0, [x29, 136]
+       add     x2, x29, 156
        mov     x1, x19
        mov     x0, x25
        bl      upseu
        cmp     x0, x27
        bne     .L14
-       adc     x0, x24, x22
+       ldr     x0, [x29, 136]
        cmp     x0, xzr
        cset    w1, ne
-       ldr     w0, [x29, 140]
+       ldr     w0, [x29, 156]
        cmp     w1, w0
        bne     .L14
        subs    x1, x21, x20
...

If I am reading the assembly correctly, the important difference is:
...
        adds    x27, x21, x19
        adc     x0, x24, x22
-       str     x0, [x29, 120]
-       add     x2, x29, 140
+       str     x0, [x29, 136]
+       add     x2, x29, 156
...
-       adc     x0, x24, x22
+       ldr     x0, [x29, 136]
...

Normally, without -flra-remat, the result of "adc     x0, x24, x22" is stored to
the stack and then reloaded.
With -flra-remat, the value is stored as well, but later "adc" is used to
recompute it. That saves one stack access, but the condition flags (NZCV on
aarch64) have changed in the meantime, at least because of the intervening
"cmp x0, x27", so the second "adc" uses the wrong value of the C bit.

Tested revisions:
r229293 - FAIL
5-branch r229305 - FAIL
4_9-branch - doesn't know -flra-remat



* [Bug rtl-optimization/68106] c-c++-common/torture/builtin-arith-overflow-11.c FAILs with -flra-remat @ aarch64
From: vmakarov at gcc dot gnu.org @ 2015-10-30 17:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68106

--- Comment #1 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Zdenek Sojka from comment #0)

I was not able to reproduce it on the current trunk, but I did reproduce it on
r229293.  I've been working on it and I am planning to submit a patch for the
trunk today.



* [Bug rtl-optimization/68106] c-c++-common/torture/builtin-arith-overflow-11.c FAILs with -flra-remat @ aarch64
From: vmakarov at gcc dot gnu.org @ 2015-10-30 17:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68106

--- Comment #2 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
Author: vmakarov
Date: Fri Oct 30 17:45:16 2015
New Revision: 229593

URL: https://gcc.gnu.org/viewcvs?rev=229593&root=gcc&view=rev
Log:
2015-10-30  Vladimir Makarov  <vmakarov@redhat.com>

        PR rtl-optimization/68106
        * lra-remat.c (input_regno_present_p): Process hard regs
        explicitly present in machine description insns.
        (call_used_input_regno_present_p): Ditto.
        (calculate_gen_cands): Ditto.
        (do_remat): Ditto.

2015-10-30  Vladimir Makarov  <vmakarov@redhat.com>

        PR rtl-optimization/68106
        * gcc.target/aarch64/pr68106.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/aarch64/pr68106.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lra-remat.c
    trunk/gcc/testsuite/ChangeLog



* [Bug rtl-optimization/68106] c-c++-common/torture/builtin-arith-overflow-11.c FAILs with -flra-remat @ aarch64
From: vmakarov at gcc dot gnu.org @ 2015-10-30 17:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68106

--- Comment #3 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
The problem was that the LRA rematerialization subpass ignored hard registers
explicitly present in machine description insns.

I'll wait a few days before backporting this to the gcc-5-branch.


