public inbox for gcc-bugs@sourceware.org
* [Bug target/89057] [8/9/10 Regression] AArch64 ld3 st4 less optimized
       [not found] <bug-89057-4@http.gcc.gnu.org/bugzilla/>
@ 2020-03-24 15:13 ` tnfchris at gcc dot gnu.org
  2020-11-30 15:09 ` [Bug target/89057] [8/9/10/11 " abhiraj.garakapati at gmail dot com
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2020-03-24 15:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89057
Bug 89057 depends on bug 94052, which changed state.

Bug 94052 Summary: Paradoxical subregs out of expand causes ICE with multi register modes at -O2 or higher
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94052

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

* [Bug target/89057] [8/9/10/11 Regression] AArch64 ld3 st4 less optimized
       [not found] <bug-89057-4@http.gcc.gnu.org/bugzilla/>
  2020-03-24 15:13 ` [Bug target/89057] [8/9/10 Regression] AArch64 ld3 st4 less optimized tnfchris at gcc dot gnu.org
@ 2020-11-30 15:09 ` abhiraj.garakapati at gmail dot com
  2020-12-03  5:25 ` abhiraj.garakapati at gmail dot com
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: abhiraj.garakapati at gmail dot com @ 2020-11-30 15:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89057

Abhiraj Garakapati <abhiraj.garakapati at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |abhiraj.garakapati at gmail dot com

--- Comment #7 from Abhiraj Garakapati <abhiraj.garakapati at gmail dot com> ---
This issue appears at RTL expansion (dump test1.cpp.234r.expand, i.e. during
the GIMPLE-to-RTL conversion) with -O1 enabled. It is seen at -O1, -O2 and
-O3, but not at -O0.

Each of the three GIMPLE statements below is expanded to two move instructions
during the GIMPLE-to-RTL conversion. This does not happen in GCC-7.3.0; it is
seen from GCC-8.1.0 onwards, due to the patch:
https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=a977dc0c5e069bf198f78ed4767deac369904301
  _68 = __builtin_aarch64_combinev8qi (_67, { 0, 0, 0, 0, 0, 0, 0, 0 });
  _69 = __builtin_aarch64_combinev8qi (_66, { 0, 0, 0, 0, 0, 0, 0, 0 });
  _70 = __builtin_aarch64_combinev8qi (_65, { 0, 0, 0, 0, 0, 0, 0, 0 });

As a workaround, this issue can be avoided by adding
"-fno-move-loop-invariants".

This issue can be fixed on GCC-8.1.0 by reverting the "aarch64-simd.md" file
changes in the patch:
https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=a977dc0c5e069bf198f78ed4767deac369904301

I also checked a toolchain built with the "aarch64-simd.md" file changes
reverted against the above-mentioned test case and got the expected output,
the same as GCC-7.3.0.

With GCC 8.1 and the "aarch64-simd.md" file changes reverted, the inner loop is:
        .L5:
                ld3     {v4.8b-v6.8b}, [x1]
                add     x1, x1, #0x18
                mov     v0.8b, v6.8b
                mov     v1.8b, v5.8b
                mov     v2.8b, v4.8b
                mov     v3.16b, v7.16b
                st4     {v0.8b-v3.8b}, [x0]
                add     x0, x0, 32
                cmp     x3, x0
                bhi     .L5
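
For readers without the original test case to hand, a loop of roughly the following shape would be expected to produce an ld3/st4 inner loop like the one above. This is a hypothetical reconstruction, not the test case attached to the bug: the function name `bgr_to_rgba`, its parameters, and the scalar fallback for non-AArch64 hosts are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical sketch: convert n packed 3-byte pixels to 4-byte pixels,
   swapping channels 0 and 2 and filling channel 3 with a loop-invariant
   value.  n must be a multiple of 8. */
void bgr_to_rgba(uint8_t *dst, const uint8_t *src, size_t n, uint8_t alpha)
{
    assert(n % 8 == 0);
#if defined(__aarch64__)
    uint8x8_t a = vdup_n_u8(alpha);             /* loop-invariant 4th channel */
    for (size_t i = 0; i < n; i += 8) {
        uint8x8x3_t in = vld3_u8(src + 3 * i);  /* ld3: de-interleave 3 channels */
        uint8x8x4_t out;
        out.val[0] = in.val[2];
        out.val[1] = in.val[1];
        out.val[2] = in.val[0];
        out.val[3] = a;
        vst4_u8(dst + 4 * i, out);              /* st4: interleave 4 channels */
    }
#else
    /* Scalar fallback with the same semantics, so the sketch builds elsewhere. */
    for (size_t i = 0; i < n; i++) {
        dst[4 * i + 0] = src[3 * i + 2];
        dst[4 * i + 1] = src[3 * i + 1];
        dst[4 * i + 2] = src[3 * i + 0];
        dst[4 * i + 3] = alpha;
    }
#endif
}
```

The three channel moves and the invariant fourth register correspond to the four mov instructions between the ld3 and the st4 in the listing.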

I also checked the test case below, which is taken from the patch:
https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=a977dc0c5e069bf198f78ed4767deac369904301
The patch description says that it improves code generation for literal vector
construction by expanding and exposing the pattern to RTL optimization earlier,
and that the implementation it replaced delayed splitting the pattern until
after reload, which resulted in poor code generation for the following code.

Test case showing the patch's improvement:

        #include "arm_neon.h"
        int16x8_t
        foo ()
        {
          return vcombine_s16 (vdup_n_s16 (0), vdup_n_s16 (8));
        }

GCC-8.1.0 at -O1 with the "aarch64-simd.md" file changes reverted:

        foo():
                adrp    x0, 0 <_Z3foov>
                ldr     q0, [x0]
                ret

So reverting the "aarch64-simd.md" file changes does not result in poor code
generation for this test case.
I also checked this with the latest GCC release, GCC-10.2.0.

* [Bug target/89057] [8/9/10/11 Regression] AArch64 ld3 st4 less optimized
       [not found] <bug-89057-4@http.gcc.gnu.org/bugzilla/>
  2020-03-24 15:13 ` [Bug target/89057] [8/9/10 Regression] AArch64 ld3 st4 less optimized tnfchris at gcc dot gnu.org
  2020-11-30 15:09 ` [Bug target/89057] [8/9/10/11 " abhiraj.garakapati at gmail dot com
@ 2020-12-03  5:25 ` abhiraj.garakapati at gmail dot com
  2020-12-30 10:24 ` rsandifo at gcc dot gnu.org
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: abhiraj.garakapati at gmail dot com @ 2020-12-03  5:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89057

--- Comment #8 from Abhiraj Garakapati <abhiraj.garakapati at gmail dot com> ---
I also checked GCC-8.4.0 with the above-mentioned "aarch64-simd.md" file
changes reverted and got the expected output, the same as GCC-7.3.0.

* [Bug target/89057] [8/9/10/11 Regression] AArch64 ld3 st4 less optimized
       [not found] <bug-89057-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2020-12-03  5:25 ` abhiraj.garakapati at gmail dot com
@ 2020-12-30 10:24 ` rsandifo at gcc dot gnu.org
  2021-01-04 11:59 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2020-12-30 10:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89057

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |rsandifo at gcc dot gnu.org

--- Comment #9 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
I think the fix is to make aarch64_combine<mode> aware of
aarch64_combinez{,_be}<mode> (i.e. the special case of the
second vector input being zero).  Testing a patch.

* [Bug target/89057] [8/9/10/11 Regression] AArch64 ld3 st4 less optimized
       [not found] <bug-89057-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2020-12-30 10:24 ` rsandifo at gcc dot gnu.org
@ 2021-01-04 11:59 ` cvs-commit at gcc dot gnu.org
  2021-01-04 12:07 ` [Bug target/89057] [8/9/10 " rsandifo at gcc dot gnu.org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-01-04 11:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89057

--- Comment #10 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:

https://gcc.gnu.org/g:b41e6dd50f329b0291457e939d4c0dacd81c82c1

commit r11-6439-gb41e6dd50f329b0291457e939d4c0dacd81c82c1
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Mon Jan 4 11:59:07 2021 +0000

    aarch64: Improve vcombine codegen [PR89057]

    This patch fixes a codegen regression in the handling of things like:

      __temp.val[0] \
        = vcombine_##funcsuffix (__b.val[0], \
                                 vcreate_##funcsuffix (__AARCH64_UINT64_C (0))); \

    in the 64-bit vst[234] functions.  The zero was forced into a
    register at expand time, and we relied on combine to fuse the
    zero and combine back together into a single combinez pattern.
    The problem is that the zero could be hoisted before combine
    gets a chance to do its thing.
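
    The operation described above, a vcombine whose second operand is zero,
    can be sketched in isolation as follows. This is a minimal illustration,
    not the arm_neon.h source: the helper name `widen_with_zero` and the
    scalar fallback for non-AArch64 hosts are assumptions.

```c
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: widen a 64-bit vector (the 8 bytes at `in`) to
   128 bits, with the upper half set to zero. */
void widen_with_zero(uint8_t out[16], const uint8_t in[8])
{
#if defined(__aarch64__)
    uint8x8_t lo = vld1_u8(in);
    /* The second operand is a constant zero vector; this is the special
       case that the combinez patterns are meant to match. */
    uint8x16_t wide = vcombine_u8(lo, vcreate_u8(0));
    vst1q_u8(out, wide);
#else
    /* Scalar fallback with the same result, so the sketch builds elsewhere. */
    memcpy(out, in, 8);
    memset(out + 8, 0, 8);
#endif
}
```

    If the zero is materialised in a separate register and hoisted out of a
    loop, the combine pass no longer sees the zero and the vcombine together,
    which is the situation the commit message describes.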

    gcc/
            PR target/89057
            * config/aarch64/aarch64-simd.md (aarch64_combine<mode>): Accept
            aarch64_simd_reg_or_zero for operand 2.  Use the combinez patterns
            to handle zero operands.

    gcc/testsuite/
            PR target/89057
            * gcc.target/aarch64/pr89057.c: New test.

* [Bug target/89057] [8/9/10 Regression] AArch64 ld3 st4 less optimized
       [not found] <bug-89057-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2021-01-04 11:59 ` cvs-commit at gcc dot gnu.org
@ 2021-01-04 12:07 ` rsandifo at gcc dot gnu.org
  2021-01-12 10:02 ` [Bug target/89057] [8/9 " rsandifo at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2021-01-04 12:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89057

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[8/9/10/11 Regression]      |[8/9/10 Regression] AArch64
                   |AArch64 ld3 st4 less        |ld3 st4 less optimized
                   |optimized                   |

--- Comment #11 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Fixed on trunk so far.  Will backport after a grace period.

* [Bug target/89057] [8/9 Regression] AArch64 ld3 st4 less optimized
       [not found] <bug-89057-4@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2021-01-04 12:07 ` [Bug target/89057] [8/9/10 " rsandifo at gcc dot gnu.org
@ 2021-01-12 10:02 ` rsandifo at gcc dot gnu.org
  2021-05-14  9:51 ` [Bug target/89057] [9 " jakub at gcc dot gnu.org
  2021-05-17  8:55 ` rsandifo at gcc dot gnu.org
  8 siblings, 0 replies; 9+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2021-01-12 10:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89057

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[8/9/10 Regression] AArch64 |[8/9 Regression] AArch64
                   |ld3 st4 less optimized      |ld3 st4 less optimized

--- Comment #12 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
Fixed for GCC 10 by r10-9255.

* [Bug target/89057] [9 Regression] AArch64 ld3 st4 less optimized
       [not found] <bug-89057-4@http.gcc.gnu.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2021-01-12 10:02 ` [Bug target/89057] [8/9 " rsandifo at gcc dot gnu.org
@ 2021-05-14  9:51 ` jakub at gcc dot gnu.org
  2021-05-17  8:55 ` rsandifo at gcc dot gnu.org
  8 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-14  9:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89057

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|8.5                         |9.4

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 8 branch is being closed.

* [Bug target/89057] [9 Regression] AArch64 ld3 st4 less optimized
       [not found] <bug-89057-4@http.gcc.gnu.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2021-05-14  9:51 ` [Bug target/89057] [9 " jakub at gcc dot gnu.org
@ 2021-05-17  8:55 ` rsandifo at gcc dot gnu.org
  8 siblings, 0 replies; 9+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2021-05-17  8:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89057

rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #14 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
The patch doesn't apply to GCC 9 because the combinez expanders
don't exist there.  I think it would be dangerous to do something
ad hoc for GCC 9 only, so we should probably restrict the fix
to GCC 10+.
