public inbox for gcc-bugs@sourceware.org
* [Bug target/65375] New: poor codegen for ld[234]/st[234]
@ 2015-03-10 8:15 kugan at gcc dot gnu.org
2015-03-10 8:16 ` [Bug target/65375] " kugan at gcc dot gnu.org
` (11 more replies)
0 siblings, 12 replies; 13+ messages in thread
From: kugan at gcc dot gnu.org @ 2015-03-10 8:15 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
Bug ID: 65375
Summary: poor codegen for ld[234]/st[234]
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: kugan at gcc dot gnu.org
#include <arm_neon.h>
void hello_vst2(float* fout, float *fin)
{
float32x4x2_t a;
a = vld2q_f32 (fin);
vst2q_f32 (fout, a);
}
Compiling with aarch64-none-linux-gnu-gcc -O2 -ffast-math -funsafe-math-optimizations
produces:
.cpu generic+fp+simd
.file "neon.c"
.text
.align 2
.p2align 3,,7
.global hello_vst2
.type hello_vst2, %function
hello_vst2:
ld2 {v0.4s - v1.4s}, [x1]
sub sp, sp, #32
umov x1, v0.d[0]
umov x2, v0.d[1]
str q1, [sp, 16]
mov x5, x1
stp x5, x2, [sp]
ld1 {v0.16b - v1.16b}, [sp]
st2 {v0.4s - v1.4s}, [x0]
add sp, sp, 32
ret
.size hello_vst2, .-hello_vst2
.ident "GCC: (GNU) 5.0.0 20150305 (experimental)"
.section .note.GNU-stack,"",%progbits
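For readers without an AArch64 toolchain, here is a minimal portable sketch of what the test case is expected to do. The names vec4f, vec4fx2 and the model_* functions are hypothetical stand-ins, not part of arm_neon.h; they only model the data movement of vld2q_f32 (de-interleaving load) and vst2q_f32 (interleaving store), not the generated code.

```c
#include <stddef.h>

/* Hypothetical portable stand-ins for float32x4_t and float32x4x2_t. */
typedef struct { float lane[4]; } vec4f;
typedef struct { vec4f val[2]; } vec4fx2;

/* Models vld2q_f32: read 8 floats and de-interleave them, so val[0]
   receives the even-indexed elements and val[1] the odd-indexed ones. */
static vec4fx2 model_vld2q_f32(const float *src)
{
    vec4fx2 r;
    for (size_t i = 0; i < 4; i++) {
        r.val[0].lane[i] = src[2 * i];
        r.val[1].lane[i] = src[2 * i + 1];
    }
    return r;
}

/* Models vst2q_f32: re-interleave the two vectors and write 8 floats. */
static void model_vst2q_f32(float *dst, vec4fx2 a)
{
    for (size_t i = 0; i < 4; i++) {
        dst[2 * i]     = a.val[0].lane[i];
        dst[2 * i + 1] = a.val[1].lane[i];
    }
}

/* Same shape as hello_vst2 in the report: the net effect is a copy of
   8 consecutive floats from fin to fout. */
void model_hello_vst2(float *fout, const float *fin)
{
    vec4fx2 a = model_vld2q_f32(fin);
    model_vst2q_f32(fout, a);
}
```

Since the load and the store are exact inverses, nothing in between needs to touch the stack or the general-purpose registers, which is why the umov/str/stp/ld1 sequence above looks unnecessary.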
* [Bug target/65375] poor codegen for ld[234]/st[234]
2015-03-10 8:15 [Bug target/65375] New: poor codegen for ld[234]/st[234] kugan at gcc dot gnu.org
@ 2015-03-10 8:16 ` kugan at gcc dot gnu.org
2015-03-10 8:18 ` kugan at gcc dot gnu.org
` (10 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: kugan at gcc dot gnu.org @ 2015-03-10 8:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
--- Comment #1 from kugan at gcc dot gnu.org ---
arm-none-linux-gnueabi-gcc -O2 -ffast-math -funsafe-math-optimizations
-mfpu=neon produces just:
hello_vst2:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
vld2.32 {d16-d19}, [r1]
vst2.32 {d16-d19}, [r0]
bx
* [Bug target/65375] poor codegen for ld[234]/st[234]
2015-03-10 8:15 [Bug target/65375] New: poor codegen for ld[234]/st[234] kugan at gcc dot gnu.org
2015-03-10 8:16 ` [Bug target/65375] " kugan at gcc dot gnu.org
@ 2015-03-10 8:18 ` kugan at gcc dot gnu.org
2015-03-10 8:19 ` kugan at gcc dot gnu.org
` (9 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: kugan at gcc dot gnu.org @ 2015-03-10 8:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
--- Comment #2 from kugan at gcc dot gnu.org ---
aarch64-none-linux-gnu-gcc -O2 -ffast-math -funsafe-math-optimizations
-fno-split-wide-types produces:
ld2 {v2.4s - v3.4s}, [x1]
orr v0.16b, v2.16b, v2.16b
orr v1.16b, v3.16b, v3.16b
st2 {v0.4s - v1.4s}, [x0]
ret
* [Bug target/65375] poor codegen for ld[234]/st[234]
2015-03-10 8:15 [Bug target/65375] New: poor codegen for ld[234]/st[234] kugan at gcc dot gnu.org
2015-03-10 8:16 ` [Bug target/65375] " kugan at gcc dot gnu.org
2015-03-10 8:18 ` kugan at gcc dot gnu.org
@ 2015-03-10 8:19 ` kugan at gcc dot gnu.org
2015-03-10 8:32 ` pinskia at gcc dot gnu.org
` (8 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: kugan at gcc dot gnu.org @ 2015-03-10 8:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
kugan at gcc dot gnu.org changed:
What |Removed |Added
----------------------------------------------------------------------------
Target| |aarch64
Assignee|unassigned at gcc dot gnu.org |kugan at gcc dot gnu.org
--- Comment #3 from kugan at gcc dot gnu.org ---
aarch64-none-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/home/kugan/work/builds/gcc-fsf-gcc/tools/bin/aarch64-none-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/home/kugan/work/builds/gcc-fsf-gcc/tools/libexec/gcc/aarch64-none-linux-gnu/5.0.0/lto-wrapper
Target: aarch64-none-linux-gnu
Configured with: /home/kugan/work/sources/gcc-fsf/gcc/configure
--target=aarch64-none-linux-gnu
--prefix=/home/kugan/work/builds/gcc-fsf-gcc/tools
--with-sysroot=/home/kugan/work/builds/gcc-fsf-gcc/sysroot-aarch64-none-linux-gnu
--disable-libssp --disable-libgomp --disable-libmudflap
--enable-languages=c,c++,fortran
Thread model: posix
gcc version 5.0.0 20150305 (experimental) (GCC)
* [Bug target/65375] poor codegen for ld[234]/st[234]
2015-03-10 8:15 [Bug target/65375] New: poor codegen for ld[234]/st[234] kugan at gcc dot gnu.org
` (2 preceding siblings ...)
2015-03-10 8:19 ` kugan at gcc dot gnu.org
@ 2015-03-10 8:32 ` pinskia at gcc dot gnu.org
2015-04-13 16:36 ` [Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32 mkuvyrkov at gcc dot gnu.org
` (7 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2015-03-10 8:32 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
;; _6 = __builtin_aarch64_get_qregoiv4sf (__o_5, 0);
(insn 8 7 0 (set (reg:V4SF 74 [ D.16774 ])
(subreg:V4SF (reg/v:OI 73 [ __o ]) 0))
/data1/src/gcc-cavium/toolchain-thunder/thunderx-tools/lib/gcc/aarch64-thunderx-linux-gnu/5.0.0/include/arm_neon.h:15586
-1
(nil))
;; _7 = __builtin_aarch64_get_qregoiv4sf (__o_5, 1);
(insn 9 8 0 (set (reg:V4SF 75 [ D.16774 ])
(subreg:V4SF (reg/v:OI 73 [ __o ]) 16))
/data1/src/gcc-cavium/toolchain-thunder/thunderx-tools/lib/gcc/aarch64-thunderx-linux-gnu/5.0.0/include/arm_neon.h:15587
-1
(nil))
Actually, maybe we should use POI here; with a partial integer mode, the
subreg-splitting pass would not do anything to these subregs.
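As a rough illustration of the two insns above, here is a minimal C sketch of the access pattern. The types vec4f and wide32 and the functions half_offset and get_half are hypothetical; they model a V4SF value, a 32-byte OI value, and a read of one 16-byte half at byte offset 0 or 16. This says nothing about how GCC implements the builtin internally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: vec4f models a V4SF value (4 floats),
   wide32 models a 32-byte OI value holding two of them. */
typedef struct { float lane[4]; } vec4f;
typedef struct { unsigned char bytes[2 * sizeof(vec4f)]; } wide32;

/* Byte offset of half `index`; with 4-byte floats this gives the
   subreg offsets 0 and 16 seen in the RTL. */
static size_t half_offset(size_t index)
{
    assert(index < 2);
    return index * sizeof(vec4f);
}

/* Models a get_qreg-style accessor as a plain read of one half of
   the wide value. */
static vec4f get_half(const wide32 *o, size_t index)
{
    vec4f v;
    memcpy(&v, o->bytes + half_offset(index), sizeof v);
    return v;
}
```

Each call reads only half of the wide value, which is the kind of partial access that a wide-type splitting pass may decide to break into smaller pieces.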
* [Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32
2015-03-10 8:15 [Bug target/65375] New: poor codegen for ld[234]/st[234] kugan at gcc dot gnu.org
` (3 preceding siblings ...)
2015-03-10 8:32 ` pinskia at gcc dot gnu.org
@ 2015-04-13 16:36 ` mkuvyrkov at gcc dot gnu.org
2015-04-14 8:05 ` jgreenhalgh at gcc dot gnu.org
` (6 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: mkuvyrkov at gcc dot gnu.org @ 2015-04-13 16:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
--- Comment #5 from Maxim Kuvyrkov <mkuvyrkov at gcc dot gnu.org> ---
Kugan and Jim Wilson posted a patch for this on March 26th.
* [Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32
2015-03-10 8:15 [Bug target/65375] New: poor codegen for ld[234]/st[234] kugan at gcc dot gnu.org
` (4 preceding siblings ...)
2015-04-13 16:36 ` [Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32 mkuvyrkov at gcc dot gnu.org
@ 2015-04-14 8:05 ` jgreenhalgh at gcc dot gnu.org
2015-04-14 8:06 ` mkuvyrkov at gcc dot gnu.org
` (5 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: jgreenhalgh at gcc dot gnu.org @ 2015-04-14 8:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
James Greenhalgh <jgreenhalgh at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
CC| |jgreenhalgh at gcc dot gnu.org
Resolution|--- |FIXED
--- Comment #6 from James Greenhalgh <jgreenhalgh at gcc dot gnu.org> ---
So, fixed then?
* [Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32
2015-03-10 8:15 [Bug target/65375] New: poor codegen for ld[234]/st[234] kugan at gcc dot gnu.org
` (5 preceding siblings ...)
2015-04-14 8:05 ` jgreenhalgh at gcc dot gnu.org
@ 2015-04-14 8:06 ` mkuvyrkov at gcc dot gnu.org
2015-04-14 9:11 ` kugan at gcc dot gnu.org
` (4 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: mkuvyrkov at gcc dot gnu.org @ 2015-04-14 8:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
Maxim Kuvyrkov <mkuvyrkov at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|RESOLVED |ASSIGNED
Last reconfirmed| |2015-04-14
Resolution|FIXED |---
Ever confirmed|0 |1
--- Comment #7 from Maxim Kuvyrkov <mkuvyrkov at gcc dot gnu.org> ---
The patch is not approved yet.
* [Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32
2015-03-10 8:15 [Bug target/65375] New: poor codegen for ld[234]/st[234] kugan at gcc dot gnu.org
` (6 preceding siblings ...)
2015-04-14 8:06 ` mkuvyrkov at gcc dot gnu.org
@ 2015-04-14 9:11 ` kugan at gcc dot gnu.org
2015-06-23 15:40 ` wilson at gcc dot gnu.org
` (3 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: kugan at gcc dot gnu.org @ 2015-04-14 9:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
--- Comment #8 from kugan at gcc dot gnu.org ---
The patch is at https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00857.html and
has not been approved yet.
* [Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32
2015-03-10 8:15 [Bug target/65375] New: poor codegen for ld[234]/st[234] kugan at gcc dot gnu.org
` (7 preceding siblings ...)
2015-04-14 9:11 ` kugan at gcc dot gnu.org
@ 2015-06-23 15:40 ` wilson at gcc dot gnu.org
2015-06-24 9:06 ` ramana at gcc dot gnu.org
` (2 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: wilson at gcc dot gnu.org @ 2015-06-23 15:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
--- Comment #10 from Jim Wilson <wilson at gcc dot gnu.org> ---
Improved, but not completely resolved. We still get unnecessary orr
instructions, same as in comment 2. This is partly an issue with the register
allocator not handling partially overlapping register reads/writes very well.
We already have a few other bugs for that. This is also partly an issue with
how the aarch64 builtins work, via __builtin_aarch64_[gs]et_qregoiv4sf which
create the partially overlapping register reads/writes. The ARM builtins don't
work this way, they use a union for type punning, and hence don't have the same
problem.
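To make the contrast concrete, here is a minimal sketch of the two styles in portable C. All names (vec4f, vec4fx2, opaque32, set_half, pack_with_accessors, pack_with_union) are hypothetical and only model the data flow; this is not the code in either port's arm_neon.h.

```c
#include <assert.h>
#include <string.h>

typedef struct { float lane[4]; } vec4f;            /* one 128-bit vector  */
typedef struct { vec4f val[2]; } vec4fx2;           /* user-visible pair   */
typedef struct { unsigned char bytes[32]; } opaque32; /* wide internal value */

/* Accessor style: each call overwrites only 16 of the 32 bytes, i.e. a
   partial write to the wide value. */
static opaque32 set_half(opaque32 o, vec4f v, int index)
{
    assert(index == 0 || index == 1);
    memcpy(o.bytes + 16 * index, &v, sizeof v);
    return o;
}

static opaque32 pack_with_accessors(vec4fx2 a)
{
    opaque32 o;
    memset(&o, 0, sizeof o);
    o = set_half(o, a.val[0], 0);   /* writes bytes 0..15  */
    o = set_half(o, a.val[1], 1);   /* writes bytes 16..31 */
    return o;
}

/* Union style: the whole pair is reinterpreted as the wide value in one
   step, so no partial writes appear. */
typedef union { vec4fx2 pair; opaque32 wide; } pun;

static opaque32 pack_with_union(vec4fx2 a)
{
    pun u;
    u.pair = a;
    return u.wide;
}
```

Both functions produce the same 32 bytes; the difference is only in whether the wide value is built up by two overlapping partial writes or converted whole.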
* [Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32
2015-03-10 8:15 [Bug target/65375] New: poor codegen for ld[234]/st[234] kugan at gcc dot gnu.org
` (8 preceding siblings ...)
2015-06-23 15:40 ` wilson at gcc dot gnu.org
@ 2015-06-24 9:06 ` ramana at gcc dot gnu.org
2015-06-24 9:13 ` kugan at gcc dot gnu.org
2015-06-25 20:49 ` ramana at gcc dot gnu.org
11 siblings, 0 replies; 13+ messages in thread
From: ramana at gcc dot gnu.org @ 2015-06-24 9:06 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
--- Comment #11 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
(In reply to Jim Wilson from comment #10)
> Improved, but not completely resolved. We still get unnecessary orr
> instructions, same as in comment 2. This is partly an issue with the
> register allocator not handling partially overlapping register reads/writes
> very well. We already have a few other bugs for that. This is also partly
> an issue with how the aarch64 builtins work, via
> __builtin_aarch64_[gs]et_qregoiv4sf which create the partially overlapping
> register reads/writes. The ARM builtins don't work this way, they use a
> union for type punning, and hence don't have the same problem.
Both the ARM and the AArch64 ports have issues with partially overlapping
register reads/writes, especially with the vzip/vuzp style intrinsics in the
AArch32 world, or even the larger vld3/4 intrinsics in both ARM and AArch64
states. It would be nice to fix that finally.
If that is the only issue left in this ticket, maybe we should just park this
example in that ticket (IIRC PR43725) and close this one out?
regards
Ramana
* [Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32
2015-03-10 8:15 [Bug target/65375] New: poor codegen for ld[234]/st[234] kugan at gcc dot gnu.org
` (9 preceding siblings ...)
2015-06-24 9:06 ` ramana at gcc dot gnu.org
@ 2015-06-24 9:13 ` kugan at gcc dot gnu.org
2015-06-25 20:49 ` ramana at gcc dot gnu.org
11 siblings, 0 replies; 13+ messages in thread
From: kugan at gcc dot gnu.org @ 2015-06-24 9:13 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
kugan at gcc dot gnu.org changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #12 from kugan at gcc dot gnu.org ---
Fixed in trunk except for the additional orr instructions (overlapping register
reads/writes). As Ramana mentioned, that is a known problem and is tracked in
PR43725.
* [Bug target/65375] aarch64: poor codegen for vld2q_f32 and vst2q_f32
2015-03-10 8:15 [Bug target/65375] New: poor codegen for ld[234]/st[234] kugan at gcc dot gnu.org
` (10 preceding siblings ...)
2015-06-24 9:13 ` kugan at gcc dot gnu.org
@ 2015-06-25 20:49 ` ramana at gcc dot gnu.org
11 siblings, 0 replies; 13+ messages in thread
From: ramana at gcc dot gnu.org @ 2015-06-25 20:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65375
--- Comment #13 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
Or indeed PR 63277...