* [Bug target/116010] [15 regression] vectorization regressions on arm and aarch64 since r15-491-gc290e6a0b7a9de
2024-07-19 22:21 [Bug target/116010] New: [15 regression] vectorization regressions on arm and aarch64 since r15-491-gc290e6a0b7a9de thiago.bauermann at linaro dot org
@ 2024-07-19 22:31 ` pinskia at gcc dot gnu.org
2024-07-22 7:03 ` rguenth at gcc dot gnu.org
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-07-19 22:31 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |testsuite-fail
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So I think there are 2 different issues here.
First, gcc.target/arm/simd/mve-vabs.c now calls memcpy instead of memmove
because of the restrict qualifier. That should be a simple fix.
I have not looked into why gfortran.dg/vect/vect-8.f90 fails on aarch64 though.
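To illustrate the memcpy/memmove point above, here is a minimal sketch in C. It is not the code from mve-vabs.c: the function names and the four-element copy loop are assumptions, chosen to match the 16-byte memmove seen for test_vabs_u32x4. With restrict the caller promises that the buffers do not overlap, which is the condition under which a compiler may treat the loop as memcpy rather than memmove.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the unsigned case: restrict promises that
   dest and src do not overlap, so memcpy semantics are permitted.  */
void copy_u32x4(uint32_t *restrict dest, const uint32_t *restrict src)
{
    for (int i = 0; i < 4; i++)
        dest[i] = src[i];
}

/* Same loop without restrict: dest and src may overlap, so a compiler
   that turns this into a library call cannot assume memcpy is safe.  */
void copy_u32x4_may_alias(uint32_t *dest, const uint32_t *src)
{
    for (int i = 0; i < 4; i++)
        dest[i] = src[i];
}

/* For non-overlapping buffers both variants give the same result.  */
void self_check(void)
{
    const uint32_t src[4] = {1, 2, 3, 4};
    uint32_t a[4] = {0};
    uint32_t b[4] = {0};

    copy_u32x4(a, src);
    copy_u32x4_may_alias(b, src);
    assert(memcmp(a, src, sizeof src) == 0);
    assert(memcmp(b, src, sizeof src) == 0);
}
```

Which call, if any, the compiler actually emits for either loop depends on the GCC revision and options, so this only shows the source-level difference.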
From: rguenth at gcc dot gnu.org @ 2024-07-22 7:03 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |15.0
           Keywords|                            |needs-bisection
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The gfortran.dg/vect/vect-8.f90 testcase is incredibly bad because it has so
many loops that are or are not vectorized. It should ideally be split up.
But I think the blame is incorrect: the test uses
-fno-tree-loop-distribute-patterns and thus isn't affected by the rev in
question.
As Andrew says, the fix for the other regression is trivial; I'm leaving that
to ARM folks as an exercise.
From: thiago.bauermann at linaro dot org @ 2024-07-22 22:23 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010
--- Comment #3 from Thiago Jung Bauermann <thiago.bauermann at linaro dot org> ---
Created attachment 58725
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58725&action=edit
mve-vabs.s generated by the test after commit c290e6a0b7a9.
(In reply to Andrew Pinski from comment #1)
> So I think there are 2 different issues here.
If that is the case, then I can open a separate bugzilla so that there's one
for each test failure.
> First gcc.target/arm/simd/mve-vabs.c now calls memcpy because of the
> restrict instead of memmove. That should be a simple fix there.
In my setup I don't see memcpy being called. Instead of calling memmove, GCC
now generates the load and store instructions inline. E.g.:
 test_vabs_u32x4:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
-       @ link register save eliminated.
-       movs    r2, #16
-       b       memmove
+       push    {lr}
+       ldr     ip, [r1, #4]    @ unaligned
+       ldr     lr, [r1]        @ unaligned
+       ldr     r2, [r1, #8]    @ unaligned
+       ldr     r3, [r1, #12]   @ unaligned
+       str     lr, [r0]        @ unaligned
+       str     ip, [r0, #4]    @ unaligned
+       str     r2, [r0, #8]    @ unaligned
+       str     r3, [r0, #12]   @ unaligned
+       ldr     pc, [sp], #4
        .size   test_vabs_u32x4, .-test_vabs_u32x4
        .align  1
        .p2align 2,,3
I'm attaching the complete assembly file for the test.
From: thiago.bauermann at linaro dot org @ 2024-07-22 22:29 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010
--- Comment #4 from Thiago Jung Bauermann <thiago.bauermann at linaro dot org> ---
(In reply to Richard Biener from comment #2)
> The gfortran.dg/vect/vect-8.f90 testcase is incredibly bad because it has so
> many loops that are or are not vectorized. It should ideally be split up.
>
> But I think the blame is incorrect, the test uses
> -fno-tree-loop-distribute-patterns and thus isn't affected by the rev in
> question.
Today I confirmed the git bisect by reverting commit c290e6a0b7a9 from today's
trunk (commit 88d16194d0c8 at the time of my test), and observing that this
makes gfortran.dg/vect/vect-8.f90 pass again. The commit influences the
testcase somehow.
> As Andrew says the fix for the other regression is trivial, I'm leaving that
> to ARM folks as an exercise.
If that is the case, I will be happy to post a patch to the mailing list.
From: pinskia at gcc dot gnu.org @ 2024-07-22 22:33 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Thiago Jung Bauermann from comment #3)
> Created attachment 58725
> mve-vabs.s generated by the test after commit c290e6a0b7a9.
>
> (In reply to Andrew Pinski from comment #1)
> > So I think there are 2 different issues here.
>
> If that is the case, then I can open a separate bugzilla so that there's one
> for each test failure.
>
> > First gcc.target/arm/simd/mve-vabs.c now calls memcpy because of the
> > restrict instead of memmove. That should be a simple fix there.
>
> In my setup I don't see memcpy being called. Instead of memmove, GCC is now
> generating the load and store instruction. E.g.:
Well, that is an inlined version of memcpy. I was looking at what was done in
the tree dump to see the difference.
From: thiago.bauermann at linaro dot org @ 2024-07-23 4:14 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010
--- Comment #6 from Thiago Jung Bauermann <thiago.bauermann at linaro dot org> ---
(In reply to Andrew Pinski from comment #5)
> (In reply to Thiago Jung Bauermann from comment #3)
> > > First gcc.target/arm/simd/mve-vabs.c now calls memcpy because of the
> > > restrict instead of memmove. That should be a simple fix there.
> >
> > In my setup I don't see memcpy being called. Instead of memmove, GCC is now
> > generating the load and store instruction. E.g.:
>
> Well that is an inlined version of memcpy. I was looking at what was done in
> the tree dump to see the difference.
Ah, thanks for clarifying! So apparently the path forward is to remove the
memmove check from mve-vabs.c. Or is there a way to test inlined memcpy?
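One possible way to test for the inlined copy, sketched here as a hypothetical test fragment rather than a patch to mve-vabs.c: check the loop-distribution dump instead of the assembly. scan-tree-dump-times and scan-assembler-not are existing testsuite directives. The options, the function name, the "generated memcpy" dump string and the expected count are assumptions that would need checking against the real test and a current trunk.

```c
/* Hypothetical test fragment, not the actual mve-vabs.c.  */
/* { dg-do compile } */
/* { dg-options "-O3 -fdump-tree-ldist-details" } */

#include <stdint.h>

/* Assumed stand-in for the unsigned variant, which reduces to a copy.  */
void test_copy_u32x4 (uint32_t *__restrict__ dest, uint32_t *__restrict__ a)
{
  for (int i = 0; i < 4; i++)
    dest[i] = a[i];
}

/* Assumption: loop distribution reports the pattern it recognized in the
   ldist dump, so the test no longer depends on how the copy is expanded.  */
/* { dg-final { scan-tree-dump-times "generated memcpy" 1 "ldist" } } */
/* { dg-final { scan-assembler-not "memmove" } } */
```

Scanning the dump checks the decision made by the pass, while the scan-assembler-not line only confirms that no memmove call remains.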
From: thiago.bauermann at linaro dot org @ 2024-07-23 4:21 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116010
--- Comment #7 from Thiago Jung Bauermann <thiago.bauermann at linaro dot org> ---
Created attachment 58729
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58729&action=edit
Testsuite results with/without bisected commit.
Regarding gfortran.dg/vect/vect-8.f90, I'm attaching a tarball containing
gfortran.log, gfortran.sum, vect-8.s and vect-8.f90.180t.vect from trunk, and
also from trunk with the commit reverted, hoping that it can help.
On trunk, vect-8.f90.180t.vect says "note: vectorized 23 loops in function",
whereas with the commit reverted it says "note: vectorized 24 loops in
function".