[Bug target/116010] New: [15 regression] vectorization regressions on arm and aarch64 since r15-491-gc290e6a0b7a9de

From: thiago.bauermann at linaro dot org @ 2024-07-19 22:21 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010

            Bug ID: 116010
           Summary: [15 regression] vectorization regressions on arm and
                    aarch64 since r15-491-gc290e6a0b7a9de
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: thiago.bauermann at linaro dot org
                CC: rguenth at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64, arm

Since commit r15-491-gc290e6a0b7a9de
(g:c290e6a0b7a9de5692963affc6627a4af7dc2411) we see this regression on
aarch64-linux-gnu:

Running gfortran:gfortran.dg/vect/vect.exp ...
gfortran.dg/vect/vect-8.f90   -O  : pattern found 0 times
FAIL: gfortran.dg/vect/vect-8.f90   -O   scan-tree-dump-times vect "vectorized
2[45] loops" 1

And this regression on armv8l-linux-gnueabihf and arm-eabi:

Running gcc:gcc.target/arm/simd/simd.exp ...
gcc.target/arm/simd/mve-vabs.c: memmove found 0 times
FAIL: gcc.target/arm/simd/mve-vabs.c scan-assembler-times memmove 3

Configure options for aarch64-linux-gnu:

/home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/snapshots/gcc.git~master/configure \
    SHELL=/bin/bash \
    --with-mpc=/home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/builds/destdir/aarch64-unknown-linux-gnu \
    --with-mpfr=/home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/builds/destdir/aarch64-unknown-linux-gnu \
    --with-gmp=/home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/builds/destdir/aarch64-unknown-linux-gnu \
    --with-gnu-as \
    --with-gnu-ld \
    --disable-libmudflap \
    --enable-lto \
    --enable-shared \
    --without-included-gettext \
    --enable-nls \
    --with-system-zlib \
    --disable-sjlj-exceptions \
    --enable-gnu-unique-object \
    --enable-linker-build-id \
    --disable-libstdcxx-pch \
    --enable-c99 \
    --enable-clocale=gnu \
    --enable-libstdcxx-debug \
    --enable-long-long \
    --with-cloog=no \
    --with-ppl=no \
    --with-isl=no \
    --disable-multilib \
    --enable-fix-cortex-a53-835769 \
    --enable-fix-cortex-a53-843419 \
    --with-arch=armv8-a \
    --enable-threads=posix \
    --enable-multiarch \
    --enable-libstdcxx-time=yes \
    --enable-gnu-indirect-function \
    --enable-checking=yes \
    --disable-bootstrap \
    --enable-languages=default \
    --prefix=/home/tcwg-buildslave/workspace/tcwg_gnu_4/abe/builds/destdir/aarch64-unknown-linux-gnu

Configure options for armv8l-linux-gnueabihf:

/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/gcc.git~master/configure \
    SHELL=/bin/bash \
    --with-mpc=/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/destdir/armv8l-unknown-linux-gnueabihf \
    --with-mpfr=/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/destdir/armv8l-unknown-linux-gnueabihf \
    --with-gmp=/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/destdir/armv8l-unknown-linux-gnueabihf \
    --with-gnu-as \
    --with-gnu-ld \
    --disable-libmudflap \
    --enable-lto \
    --enable-shared \
    --without-included-gettext \
    --enable-nls \
    --with-system-zlib \
    --disable-sjlj-exceptions \
    --enable-gnu-unique-object \
    --enable-linker-build-id \
    --disable-libstdcxx-pch \
    --enable-c99 \
    --enable-clocale=gnu \
    --enable-libstdcxx-debug \
    --enable-long-long \
    --with-cloog=no \
    --with-ppl=no \
    --with-isl=no \
    --disable-multilib \
    --with-float=hard \
    --with-fpu=neon-fp-armv8 \
    --with-mode=thumb \
    --with-arch=armv8-a \
    --enable-threads=posix \
    --enable-multiarch \
    --enable-libstdcxx-time=yes \
    --enable-gnu-indirect-function \
    --enable-checking=yes \
    --disable-bootstrap \
    --enable-languages=default \
    --prefix=/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/destdir/armv8l-unknown-linux-gnueabihf

Configure options for arm-eabi:

/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/snapshots/binutils.git~master/configure \
    SHELL=/bin/bash \
    --enable-lto \
    --enable-plugins \
    --enable-gold \
    --disable-werror \
    CPPFLAGS=-UFORTIFY_SOURCE \
    --with-pkgversion=Linaro_Binutils-2024.07.19 \
    --disable-gdb \
    --disable-gdbserver \
    --disable-sim \
    --disable-libdecnumber \
    --disable-readline \
    --with-sysroot=/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/destdir/x86_64-pc-linux-gnu/arm-eabi \
    --disable-shared \
    --enable-static \
    --build=x86_64-pc-linux-gnu \
    --host=x86_64-pc-linux-gnu \
    --target=arm-eabi \
    --prefix=/home/tcwg-buildslave/workspace/tcwg_gnu_0/abe/builds/destdir/x86_64-pc-linux-gnu

* [Bug target/116010] [15 regression] vectorization regressions on arm and aarch64 since r15-491-gc290e6a0b7a9de
From: pinskia at gcc dot gnu.org @ 2024-07-19 22:31 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |testsuite-fail

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So I think there are 2 different issues here.

First, gcc.target/arm/simd/mve-vabs.c now calls memcpy instead of memmove
because of the restrict qualifier. That should be a simple fix there.
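
As background for the memcpy/memmove distinction, here is a minimal sketch
in Python (not GCC's implementation; the helper names are hypothetical).
memmove must give the right result even when source and destination
overlap, while memcpy is allowed to assume they do not. A restrict-qualified
pointer lets the compiler make that assumption. The forward byte loop below
is just one possible behaviour of a copy that ignores overlap:

```python
def naive_forward_copy(buf, dst, src, n):
    """Copy n bytes front to back; only correct if the regions do not overlap."""
    for i in range(n):
        buf[dst + i] = buf[src + i]


def overlap_safe_copy(buf, dst, src, n):
    """memmove-like copy: snapshot the source before writing anything."""
    snapshot = bytes(buf[src:src + n])
    buf[dst:dst + n] = snapshot


# Overlapping regions: copy bytes 0..3 onto bytes 2..5 of the same buffer.
a = bytearray(b"abcdefgh")
b = bytearray(b"abcdefgh")
naive_forward_copy(a, 2, 0, 4)
overlap_safe_copy(b, 2, 0, 4)
print(a, b)  # the two results differ because the regions overlap
```

When the regions cannot overlap, both helpers produce the same bytes, which
is why the cheaper form becomes legal.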

I have not looked into why gfortran.dg/vect/vect-8.f90 fails on aarch64 though.

* [Bug target/116010] [15 regression] vectorization regressions on arm and aarch64 since r15-491-gc290e6a0b7a9de
From: rguenth at gcc dot gnu.org @ 2024-07-22  7:03 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |15.0
           Keywords|                            |needs-bisection

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The gfortran.dg/vect/vect-8.f90 testcase is incredibly bad because it has so
many loops that are or are not vectorized.  It should ideally be split up.

But I think the blame is incorrect: the test uses
-fno-tree-loop-distribute-patterns and thus isn't affected by the rev in
question.

As Andrew says the fix for the other regression is trivial, I'm leaving that
to ARM folks as an exercise.

* [Bug target/116010] [15 regression] vectorization regressions on arm and aarch64 since r15-491-gc290e6a0b7a9de
From: thiago.bauermann at linaro dot org @ 2024-07-22 22:23 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010

--- Comment #3 from Thiago Jung Bauermann <thiago.bauermann at linaro dot org> ---
Created attachment 58725
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58725&action=edit
mve-vabs.s generated by the test after commit c290e6a0b7a9.

(In reply to Andrew Pinski from comment #1)
> So I think there are 2 different issues here.

If that is the case, then I can open a separate bugzilla so that there's one
for each test failure.

> First gcc.target/arm/simd/mve-vabs.c now calls memcpy because of the
> restrict instead of memmove. That should be a simple fix there.

In my setup I don't see memcpy being called. Instead of a call to memmove,
GCC is now generating the load and store instructions inline. E.g.:

 test_vabs_u32x4:                                                               
        @ args = 0, pretend = 0, frame = 0                                      
        @ frame_needed = 0, uses_anonymous_args = 0                             
-       @ link register save eliminated.                                        
-       movs    r2, #16
-       b       memmove
+       push    {lr}
+       ldr     ip, [r1, #4]    @ unaligned
+       ldr     lr, [r1]        @ unaligned
+       ldr     r2, [r1, #8]    @ unaligned
+       ldr     r3, [r1, #12]   @ unaligned
+       str     lr, [r0]        @ unaligned
+       str     ip, [r0, #4]    @ unaligned
+       str     r2, [r0, #8]    @ unaligned
+       str     r3, [r0, #12]   @ unaligned
+       ldr     pc, [sp], #4
        .size   test_vabs_u32x4, .-test_vabs_u32x4
        .align  1
        .p2align 2,,3

I'm attaching the complete assembly file for the test.

* [Bug target/116010] [15 regression] vectorization regressions on arm and aarch64 since r15-491-gc290e6a0b7a9de
From: thiago.bauermann at linaro dot org @ 2024-07-22 22:29 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010

--- Comment #4 from Thiago Jung Bauermann <thiago.bauermann at linaro dot org> ---
(In reply to Richard Biener from comment #2)
> The gfortran.dg/vect/vect-8.f90 testcase is incredibly bad because it has so
> many loops that are or are not vectorized.  It should ideally be split up.
> 
> But I think the blame is incorrect: the test uses
> -fno-tree-loop-distribute-patterns and thus isn't affected by the rev in
> question.

Today I confirmed the git bisect by reverting commit c290e6a0b7a9 from today's
trunk (commit 88d16194d0c8 at the time of my test), and observing that this
makes gfortran.dg/vect/vect-8.f90 pass again. The commit influences the
testcase somehow.

> As Andrew says the fix for the other regression is trivial, I'm leaving that
> to ARM folks as an exercise.

If that is the case, I will be happy to post a patch to the mailing list.

* [Bug target/116010] [15 regression] vectorization regressions on arm and aarch64 since r15-491-gc290e6a0b7a9de
From: pinskia at gcc dot gnu.org @ 2024-07-22 22:33 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Thiago Jung Bauermann from comment #3)
> Created attachment 58725 [details]
> mve-vabs.s generated by the test after commit c290e6a0b7a9.
> 
> (In reply to Andrew Pinski from comment #1)
> > So I think there are 2 different issues here.
> 
> If that is the case, then I can open a separate bugzilla so that there's one
> for each test failure.
>  
> > First gcc.target/arm/simd/mve-vabs.c now calls memcpy because of the
> > restrict instead of memmove. That should be a simple fix there.
> 
> In my setup I don't see memcpy being called. Instead of memmove, GCC is now
> generating the load and store instruction. E.g.:

Well that is an inlined version of memcpy. I was looking at what was done in
the tree dump to see the difference.
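
To make the "inlined memcpy" reading of the assembly in comment #3 concrete,
here is a rough Python model (an illustration only; the function name and
the little-endian word layout are assumptions). The four ldr instructions
read 32-bit words at offsets 0, 4, 8 and 12 from the source, and the four
str instructions write them to the same offsets at the destination, which
copies 16 bytes, the same size the old code passed to memmove in r2:

```python
import struct

WORD_OFFSETS = (0, 4, 8, 12)  # the four offsets used by the ldr/str pairs


def inlined_copy_16(mem, dst, src):
    """Model of the inline sequence: load four words, then store four words."""
    # All loads happen before any store, as in the generated code.
    words = [struct.unpack_from("<I", mem, src + off)[0] for off in WORD_OFFSETS]
    for off, word in zip(WORD_OFFSETS, words):
        struct.pack_into("<I", mem, dst + off, word)


mem = bytearray(range(32))      # bytes 0..15 are the source, 16..31 the destination
inlined_copy_16(mem, 16, 0)
print(mem[16:32] == mem[0:16])  # True: 16 bytes were copied
```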

* [Bug target/116010] [15 regression] vectorization regressions on arm and aarch64 since r15-491-gc290e6a0b7a9de
From: thiago.bauermann at linaro dot org @ 2024-07-23  4:14 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010

--- Comment #6 from Thiago Jung Bauermann <thiago.bauermann at linaro dot org> ---
(In reply to Andrew Pinski from comment #5)
> (In reply to Thiago Jung Bauermann from comment #3)
> > > First gcc.target/arm/simd/mve-vabs.c now calls memcpy because of the
> > > restrict instead of memmove. That should be a simple fix there.
> > 
> > In my setup I don't see memcpy being called. Instead of memmove, GCC is now
> > generating the load and store instruction. E.g.:
> 
> Well that is an inlined version of memcpy. I was looking at what was done in
> the tree dump to see the difference.

Ah, thanks for clarifying! So apparently the path forward is to remove the
memmove check from mve-vabs.c. Or is there a way to test inlined memcpy?

* [Bug target/116010] [15 regression] vectorization regressions on arm and aarch64 since r15-491-gc290e6a0b7a9de
From: thiago.bauermann at linaro dot org @ 2024-07-23  4:21 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010

--- Comment #7 from Thiago Jung Bauermann <thiago.bauermann at linaro dot org> ---
Created attachment 58729
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58729&action=edit
Testsuite results with/without bisected commit.

Regarding the gfortran.dg/vect/vect-8.f90, I'm attaching a tarball containing
gfortran.log, gfortran.sum, vect-8.s and vect-8.f90.180t.vect of trunk, and
also of trunk with the commit reverted, hoping that it can help.

In trunk, vect-8.f90.180t.vect says "note: vectorized 23 loops in function"
while when the commit is reverted, it says "note: vectorized 24 loops in
function".
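
For reference, the failing directive from the original report expects the
pattern "vectorized 2[45] loops" to appear exactly once in the vect dump.
The sketch below uses Python's re module as a stand-in for the testsuite's
matching (an approximation, not the DejaGnu implementation; the function
name is hypothetical) and shows why a count of 23 yields "pattern found 0
times" while 24 passes:

```python
import re

PATTERN = r"vectorized 2[45] loops"  # pattern quoted in the FAIL line


def scan_dump_times(pattern, dump_text):
    """Count how many times the pattern occurs in the dump text."""
    return len(re.findall(pattern, dump_text))


trunk_dump = "note: vectorized 23 loops in function"
reverted_dump = "note: vectorized 24 loops in function"

print(scan_dump_times(PATTERN, trunk_dump))     # 0 matches: 23 is not 24 or 25
print(scan_dump_times(PATTERN, reverted_dump))  # 1 match: the expected count
```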
