public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/66917] ARM: NEON: memcpy compiles to vld1 and vst1 with incorrect alignment
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
@ 2015-07-19 18:28 ` mikpelinux at gmail dot com
2015-07-29 8:21 ` [Bug tree-optimization/66917] " ramana at gcc dot gnu.org
` (18 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: mikpelinux at gmail dot com @ 2015-07-19 18:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #2 from Mikael Pettersson <mikpelinux at gmail dot com> ---
This test case changed behaviour twice in the 4.7->4.8 development cycle.
First r185807 broke it by replacing code for unaligned memory accesses with
code requiring more alignment than present in the source:
--- pr66917.s-r185806 2015-07-19 17:16:23.536116155 +0200
+++ pr66917.s-r185807 2015-07-19 17:13:23.016388416 +0200
@@ -17,42 +17,13 @@
.global test_neon_load_store_alignment
.type test_neon_load_store_alignment, %function
test_neon_load_store_alignment:
- @ args = 0, pretend = 0, frame = 32
+ @ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
- mov r3, r0
- stmfd sp!, {r4, r5, r6, r7}
- mov r7, r1
- ldr r0, [r0, #0] @ unaligned
- mov r6, r2
- sub sp, sp, #32
- ldr r1, [r3, #4] @ unaligned
- mov r5, sp
- ldr r2, [r3, #8] @ unaligned
- add r4, sp, #16
- ldr r3, [r3, #12] @ unaligned
- mov ip, sp
- stmia r5!, {r0, r1, r2, r3}
- ldr r0, [r7, #0] @ unaligned
- ldr r1, [r7, #4] @ unaligned
- ldr r2, [r7, #8] @ unaligned
- ldr r3, [r7, #12] @ unaligned
- fldd d16, [sp, #0] @ int
- fldd d19, [sp, #8] @ int
- stmia r4!, {r0, r1, r2, r3}
- fldd d18, [sp, #16] @ int
- veor d17, d16, d18
- fldd d18, [sp, #24] @ int
- fstd d17, [sp, #0] @ int
- veor d16, d19, d18
- fstd d16, [sp, #8] @ int
- ldmia ip!, {r0, r1, r2, r3}
- str r0, [r6, #0] @ unaligned
- str r1, [r6, #4] @ unaligned
- str r2, [r6, #8] @ unaligned
- str r3, [r6, #12] @ unaligned
- add sp, sp, #32
- ldmfd sp!, {r4, r5, r6, r7}
+ vldmia r0, {d18-d19}
+ vldmia r1, {d16-d17}
+ veor q8, q9, q8
+ vstmia r2, {d16-d17}
bx lr
.size test_neon_load_store_alignment,
.-test_neon_load_store_alignment
.section .text.startup,"ax",%progbits
On Linux, this code SIGBUSes because the kernel can't fix up the first
misaligned access:
[292105.326391] Alignment trap: not handling instruction ecd02b04 at
[<00008e84>]
[292105.396370] Unhandled fault: alignment exception (0x001) at 0x0008b109
Then r191399 changed it again by replacing the vldm/vstm instructions with
vld1.64/vst1.64 instructions:
--- pr66917.s-r191398 2015-07-19 19:12:15.815583139 +0200
+++ pr66917.s-r191399 2015-07-19 19:08:36.416037498 +0200
@@ -20,10 +20,10 @@
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
- vldmia r0, {d18-d19}
- vldmia r1, {d16-d17}
+ vld1.64 {d18-d19}, [r0:64]
+ vld1.64 {d16-d17}, [r1:64]
veor q8, q9, q8
- vstmia r2, {d16-d17}
+ vst1.64 {d16-d17}, [r2:64]
bx lr
.size test_neon_load_store_alignment,
.-test_neon_load_store_alignment
.section .text.startup,"ax",%progbits
These instructions still fault, but the kernel recognizes them and fixes up the
alignment faults (if suitably configured).
If I compile the test case for x86_64 w/ -O3 -mavx, the compiler generates
vmovdqu instructions which permit unaligned addresses. So I suspect a target
bug.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] ARM: NEON: memcpy compiles to vld1 and vst1 with incorrect alignment
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
2015-07-19 18:28 ` [Bug target/66917] ARM: NEON: memcpy compiles to vld1 and vst1 with incorrect alignment mikpelinux at gmail dot com
@ 2015-07-29 8:21 ` ramana at gcc dot gnu.org
2015-07-29 8:23 ` [Bug tree-optimization/66917] [4.9/5/6 regression] " ramana at gcc dot gnu.org
` (17 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: ramana at gcc dot gnu.org @ 2015-07-29 8:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
Ramana Radhakrishnan <ramana at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Last reconfirmed| |2015-07-29
CC| |ramana at gcc dot gnu.org
Component|target |tree-optimization
Ever confirmed|0 |1
--- Comment #3 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
for STRICT_ALIGNMENT targets properly and
(In reply to Mikael Pettersson from comment #2)
> This test case changed behaviour twice in the 4.7->4.8 development cycle.
> First r185807 broke it by replacing code for unaligned memory accesses with
> code requiring more alignment than present in the source:
>
> --- pr66917.s-r185806 2015-07-19 17:16:23.536116155 +0200
> +++ pr66917.s-r185807 2015-07-19 17:13:23.016388416 +0200
> @@ -17,42 +17,13 @@
> .global test_neon_load_store_alignment
> .type test_neon_load_store_alignment, %function
> test_neon_load_store_alignment:
> - @ args = 0, pretend = 0, frame = 32
> + @ args = 0, pretend = 0, frame = 0
> @ frame_needed = 0, uses_anonymous_args = 0
> @ link register save eliminated.
> - mov r3, r0
> - stmfd sp!, {r4, r5, r6, r7}
> - mov r7, r1
> - ldr r0, [r0, #0] @ unaligned
> - mov r6, r2
> - sub sp, sp, #32
> - ldr r1, [r3, #4] @ unaligned
> - mov r5, sp
> - ldr r2, [r3, #8] @ unaligned
> - add r4, sp, #16
> - ldr r3, [r3, #12] @ unaligned
> - mov ip, sp
> - stmia r5!, {r0, r1, r2, r3}
> - ldr r0, [r7, #0] @ unaligned
> - ldr r1, [r7, #4] @ unaligned
> - ldr r2, [r7, #8] @ unaligned
> - ldr r3, [r7, #12] @ unaligned
> - fldd d16, [sp, #0] @ int
> - fldd d19, [sp, #8] @ int
> - stmia r4!, {r0, r1, r2, r3}
> - fldd d18, [sp, #16] @ int
> - veor d17, d16, d18
> - fldd d18, [sp, #24] @ int
> - fstd d17, [sp, #0] @ int
> - veor d16, d19, d18
> - fstd d16, [sp, #8] @ int
> - ldmia ip!, {r0, r1, r2, r3}
> - str r0, [r6, #0] @ unaligned
> - str r1, [r6, #4] @ unaligned
> - str r2, [r6, #8] @ unaligned
> - str r3, [r6, #12] @ unaligned
> - add sp, sp, #32
> - ldmfd sp!, {r4, r5, r6, r7}
> + vldmia r0, {d18-d19}
> + vldmia r1, {d16-d17}
> + veor q8, q9, q8
> + vstmia r2, {d16-d17}
> bx lr
> .size test_neon_load_store_alignment,
> .-test_neon_load_store_alignment
> .section .text.startup,"ax",%progbits
>
> On Linux, this code SIGBUSes because the kernel can't fix up the first
> misaligned access:
>
> [292105.326391] Alignment trap: not handling instruction ecd02b04 at
> [<00008e84>]
> [292105.396370] Unhandled fault: alignment exception (0x001) at 0x0008b109
>
> Then r191399 changed it again by replacing the vldm/vstm instructions with
> vld1.64/vst1.64 instructions:
>
> --- pr66917.s-r191398 2015-07-19 19:12:15.815583139 +0200
> +++ pr66917.s-r191399 2015-07-19 19:08:36.416037498 +0200
> @@ -20,10 +20,10 @@
> @ args = 0, pretend = 0, frame = 0
> @ frame_needed = 0, uses_anonymous_args = 0
> @ link register save eliminated.
> - vldmia r0, {d18-d19}
> - vldmia r1, {d16-d17}
> + vld1.64 {d18-d19}, [r0:64]
> + vld1.64 {d16-d17}, [r1:64]
> veor q8, q9, q8
> - vstmia r2, {d16-d17}
> + vst1.64 {d16-d17}, [r2:64]
> bx lr
> .size test_neon_load_store_alignment,
> .-test_neon_load_store_alignment
> .section .text.startup,"ax",%progbits
>
> These instructions still fault, but the kernel recognizes them and fixes up
> the alignment faults (if suitably configured).
>
> If I compile the test case for x86_64 w/ -O3 -mavx, the compiler generates
> vmovdqu instructions which permit unaligned addresses. So I suspect a
> target bug.
No, sorry that doesn't suggest a target bug to me at all. The ARM backend is a
STRICT_ALIGNMENT target, this implies the optimizers cannot introduce unaligned
accesses unless sent through the movmisalign interface AFAIUI.
In this testcase somehow at expand time the compiler thinks that the pointer is
aligned to 8 bytes and can put out a normal mov insn rather than use the
misaligned interface.
>From the expand dump:
(insn 8 5 9 2 (set (reg:V2DI 117)
(mem:V2DI (reg/v/f:SI 113 [ ap ]) [0 MEM[(char * {ref-all})ap_2(D)]+0
S16 A64])) /tmp/misalign.c:17 -1
(nil))
(insn 9 8 10 2 (set (reg:V2DI 118)
(mem:V2DI (reg/v/f:SI 114 [ bp ]) [0 MEM[(char * {ref-all})bp_3(D)]+0
S16 A64])) /tmp/misalign.c:17 -1
(nil))
Look at the A64.. That's a smoking gun to me and needs further investigation.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vld1 and vst1 with incorrect alignment
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
2015-07-19 18:28 ` [Bug target/66917] ARM: NEON: memcpy compiles to vld1 and vst1 with incorrect alignment mikpelinux at gmail dot com
2015-07-29 8:21 ` [Bug tree-optimization/66917] " ramana at gcc dot gnu.org
@ 2015-07-29 8:23 ` ramana at gcc dot gnu.org
2015-07-29 14:03 ` ktkachov at gcc dot gnu.org
` (16 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: ramana at gcc dot gnu.org @ 2015-07-29 8:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
Ramana Radhakrishnan <ramana at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |wrong-code
Summary|ARM: NEON: memcpy compiles |[4.9/5/6 regression] ARM:
|to vld1 and vst1 with |NEON: memcpy compiles to
|incorrect alignment |vld1 and vst1 with
| |incorrect alignment
Known to fail| |4.8.5, 4.9.3, 5.1.0, 5.2.0,
| |6.0
--- Comment #4 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
Wrong code.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vld1 and vst1 with incorrect alignment
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (2 preceding siblings ...)
2015-07-29 8:23 ` [Bug tree-optimization/66917] [4.9/5/6 regression] " ramana at gcc dot gnu.org
@ 2015-07-29 14:03 ` ktkachov at gcc dot gnu.org
2015-07-29 14:37 ` rguenth at gcc dot gnu.org
` (15 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-07-29 14:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #5 from ktkachov at gcc dot gnu.org ---
The slp2 dump at tree level says about the ap pointer:
base_address: ap_2(D)
offset from base address: 0
constant offset from base address: 0
step: 0
aligned to: 64
base_object: MEM[(char * {ref-all})ap_2(D)]
That looks dodgy?
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vld1 and vst1 with incorrect alignment
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (3 preceding siblings ...)
2015-07-29 14:03 ` ktkachov at gcc dot gnu.org
@ 2015-07-29 14:37 ` rguenth at gcc dot gnu.org
2015-07-29 14:41 ` ktkachov at gcc dot gnu.org
` (14 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-07-29 14:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rguenth at gcc dot gnu.org
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Probably because you access a.u/b.u which is uint64_t and thus the union
is laid out as having 8 byte alignment?
How do the original GENERIC trees produced look like?
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vld1 and vst1 with incorrect alignment
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (4 preceding siblings ...)
2015-07-29 14:37 ` rguenth at gcc dot gnu.org
@ 2015-07-29 14:41 ` ktkachov at gcc dot gnu.org
2015-07-29 14:47 ` ktkachov at gcc dot gnu.org
` (13 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-07-29 14:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #7 from ktkachov at gcc dot gnu.org ---
(In reply to Richard Biener from comment #6)
> Probably because you access a.u/b.u which is uint64_t and thus the union
> is laid out as having 8 byte alignment?
>
> How do the original GENERIC trees produced look like?
The 003t.original shows:
typedef union
{
uint64_t u[2];
uint8_t c[16];
} unionunion
{
uint64_t u[2];
uint8_t c[16];
};
union
{
uint64_t u[2];
uint8_t c[16];
} a;
union
{
uint64_t u[2];
uint8_t c[16];
} b;
union
{
uint64_t u[2];
uint8_t c[16];
} a;
union
{
uint64_t u[2];
uint8_t c[16];
} b;
memcpy ((void * restrict) &a.c, (const void * restrict) ap, 16);
memcpy ((void * restrict) &b.c, (const void * restrict) bp, 16);
a.u[0] = a.u[0] ^ b.u[0];
a.u[1] = a.u[1] ^ b.u[1];
memcpy ((void * restrict) outp, (const void * restrict) &a.c, 16);
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vld1 and vst1 with incorrect alignment
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (5 preceding siblings ...)
2015-07-29 14:41 ` ktkachov at gcc dot gnu.org
@ 2015-07-29 14:47 ` ktkachov at gcc dot gnu.org
2015-07-30 7:20 ` [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment due to SRA rguenth at gcc dot gnu.org
` (12 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-07-29 14:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #8 from ktkachov at gcc dot gnu.org ---
The 007t.lower one looks like:
try
{
MEM[(char * {ref-all})&a] = MEM[(char * {ref-all})ap];
MEM[(char * {ref-all})&b] = MEM[(char * {ref-all})bp];
D.5587 = a.u[0];
D.5588 = b.u[0];
D.5589 = D.5587 ^ D.5588;
a.u[0] = D.5589;
D.5590 = a.u[1];
D.5591 = b.u[1];
D.5592 = D.5590 ^ D.5591;
a.u[1] = D.5592;
MEM[(char * {ref-all})outp] = MEM[(char * {ref-all})&a];
}
finally
{
a = {CLOBBER};
b = {CLOBBER};
}
return;
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment due to SRA
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (6 preceding siblings ...)
2015-07-29 14:47 ` ktkachov at gcc dot gnu.org
@ 2015-07-30 7:20 ` rguenth at gcc dot gnu.org
2015-07-30 7:43 ` [Bug target/66917] " rguenth at gcc dot gnu.org
` (11 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-07-30 7:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target Milestone|--- |4.9.4
Summary|[4.9/5/6 regression] ARM: |[4.9/5/6 regression] ARM:
|NEON: memcpy compiles to |NEON: memcpy compiles to
|vld1 and vst1 with |vst1 with incorrect
|incorrect alignment |alignment due to SRA
--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
I see the following, 'a' having an alignment of 8 bytes:
<array_ref 0x7ffff68201c0
type <integer_type 0x7ffff67f5d20 uint64_t unsigned DI
size <integer_cst 0x7ffff68ccf18 constant 64>
unit size <integer_cst 0x7ffff68ccf30 constant 8>
align 64 symtab 0 alias set -1 canonical type 0x7ffff68d19d8 precision
64 min <integer_cst 0x7ffff68de210 0> max <integer_cst 0x7ffff68d7500
18446744073709551615> context <translation_unit_decl 0x7ffff7ff10f0 D.5218>
pointer_to_this <pointer_type 0x7ffff68215e8>>
arg 0 <component_ref 0x7ffff681f570
type <array_type 0x7ffff6821540 type <integer_type 0x7ffff67f5d20
uint64_t>
TI
size <integer_cst 0x7ffff68de228 constant 128>
unit size <integer_cst 0x7ffff68de240 constant 16>
align 64 symtab 0 alias set -1 canonical type 0x7ffff6821690 domain
<integer_type 0x7ffff6821498>>
arg 0 <var_decl 0x7ffff68d6900 a type <union_type 0x7ffff68213f0>
addressable used BLK file t.c line 12 col 5 size <integer_cst
0x7ffff68de228 128> unit size <integer_cst 0x7ffff68de240 16>
align 64 context <function_decl 0x7ffff67f6e00
test_neon_load_store_alignment> chain <var_decl 0x7ffff68d6990 b>>
arg 1 <field_decl 0x7ffff69b3558 u type <array_type 0x7ffff6821540>
TI file t.c line 10 col 16 size <integer_cst 0x7ffff68de228 128>
unit size <integer_cst 0x7ffff68de240 16>
align 64 offset_align 64
offset <integer_cst 0x7ffff68ccee8 constant 0>
bit offset <integer_cst 0x7ffff68ccf48 constant 0> context
<union_type 0x7ffff68213f0> chain <field_decl 0x7ffff69b35f0 c>>>
arg 1 <integer_cst 0x7ffff68de2e8 type <integer_type 0x7ffff68d1690 int>
constant 1>>
so the smoking dump isn't the reads but the write:
(insn 11 10 0 (set (mem:V2DI (reg/v/f:SI 115 [ outp ]) [0 MEM[(char *
{ref-all})outp_7(D)]+0 S16 A64])
(reg:V2DI 116 [ vect__4.13 ])) t.c:18 -1
(nil))
which for some reason also got 8 byte alignment. I suppose this is
SRA at work as -fno-tree-sra fixes this.
Thus confirmed as SRA problem.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment due to SRA
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (7 preceding siblings ...)
2015-07-30 7:20 ` [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment due to SRA rguenth at gcc dot gnu.org
@ 2015-07-30 7:43 ` rguenth at gcc dot gnu.org
2015-07-30 7:48 ` rguenth at gcc dot gnu.org
` (10 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-07-30 7:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|tree-optimization |target
--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
Actually SRA is fine. vect_compute_data_ref_alignment is, too, leaving
DR_MISALIGNMENT at -1. The target then tells us via
bool is_packed = false;
tree type = (TREE_TYPE (DR_REF (dr)));
if (!known_alignment_for_access_p (dr))
is_packed = not_size_aligned (DR_REF (dr));
if ((TYPE_USER_ALIGN (type) && !is_packed)
|| targetm.vectorize.
support_vector_misalignment (mode, type,
DR_MISALIGNMENT (dr), is_packed))
return dr_unaligned_supported;
rm_builtin_support_vector_misalignment (mode=V2DImode, type=0x7ffff68280a8,
misalignment=-1, is_packed=true)
at /space/rguenther/tramp3d/trunk/gcc/config/arm/arm.c:27218
27218 if (TARGET_NEON && !BYTES_BIG_ENDIAN && unaligned_access)
27217 {
27218 if (TARGET_NEON && !BYTES_BIG_ENDIAN && unaligned_access)
27219 {
27220 HOST_WIDE_INT align = TYPE_ALIGN_UNIT (type);
27221
27222 if (is_packed)
(gdb) p unaligned_access
No symbol "unaligned_access" in current context.
(gdb) n
27220 HOST_WIDE_INT align = TYPE_ALIGN_UNIT (type);
(gdb)
27222 if (is_packed)
(gdb)
27223 return align == 1;
oops. That's obviously buggy. Not sure what the ARM hook expects in 'type',
but appearantly it's not what the vectorizer passes in here.
Docs say
This hook should return true if the target supports misaligned vector\n\
store/load of a specific factor denoted in the @var{misalignment}\n\
parameter. The vector store/load should be of machine mode @var{mode} and\n\
the elements in the vectors should be of type @var{type}. @var{is_packed}\n\
parameter is true if the memory access is defined in a packed struct.
but the arm hook seems to say "yes, I can vectorize with V2DImode
for a vector with element type unsigned long long __attribute__((aligned(1)))"
which it obviously cannot.
IMHO the hook should compare required mode alignment with type alignment.
This seems to be the arm hook trying to be too clever.
-> target bug.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug target/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment due to SRA
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (8 preceding siblings ...)
2015-07-30 7:43 ` [Bug target/66917] " rguenth at gcc dot gnu.org
@ 2015-07-30 7:48 ` rguenth at gcc dot gnu.org
2015-07-30 14:11 ` [Bug tree-optimization/66917] " rearnsha at gcc dot gnu.org
` (9 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-07-30 7:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Suggested fix:
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c (revision 226348)
+++ gcc/config/arm/arm.c (working copy)
@@ -27217,7 +27217,8 @@ arm_builtin_support_vector_misalignment
{
if (TARGET_NEON && !BYTES_BIG_ENDIAN && unaligned_access)
{
- HOST_WIDE_INT align = TYPE_ALIGN_UNIT (type);
+ HOST_WIDE_INT align
+ = GET_MODE_ALIGNMENT (TYPE_MODE (type)) / BITS_PER_UNIT;
if (is_packed)
return align == 1;
which produces
push {r4}
add r3, r0, #8
add r4, r1, #8
vld1.64 {d17}, [r1]
vld1.64 {d16}, [r4]
ldr r4, [sp], #4
vld1.64 {d19}, [r3]
veor d16, d16, d19
vld1.64 {d18}, [r0]
veor d17, d17, d18
vst1.64 {d17}, [r2]!
vst1.64 {d16}, [r2]
bx lr
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment due to SRA
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (9 preceding siblings ...)
2015-07-30 7:48 ` rguenth at gcc dot gnu.org
@ 2015-07-30 14:11 ` rearnsha at gcc dot gnu.org
2015-07-30 14:39 ` rguenther at suse dot de
` (8 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: rearnsha at gcc dot gnu.org @ 2015-07-30 14:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Component|target |tree-optimization
--- Comment #12 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
> but the arm hook seems to say "yes, I can vectorize with V2DImode
> for a vector with element type unsigned long long __attribute__((aligned(1)))"
> which it obviously cannot.
It can vectorize for this case, but the loads must not be marked as aligned
loads. That is, the mid-end must use the misaligned code hooks.
So I don't think this is a target bug.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment due to SRA
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (10 preceding siblings ...)
2015-07-30 14:11 ` [Bug tree-optimization/66917] " rearnsha at gcc dot gnu.org
@ 2015-07-30 14:39 ` rguenther at suse dot de
2015-07-30 14:41 ` rguenth at gcc dot gnu.org
` (7 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: rguenther at suse dot de @ 2015-07-30 14:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #14 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 30 Jul 2015, rearnsha at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
>
> Richard Earnshaw <rearnsha at gcc dot gnu.org> changed:
>
> What |Removed |Added
> ----------------------------------------------------------------------------
> Component|target |tree-optimization
>
> --- Comment #12 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
> > but the arm hook seems to say "yes, I can vectorize with V2DImode
> > for a vector with element type unsigned long long __attribute__((aligned(1)))"
> > which it obviously cannot.
>
> It can vectorize for this case, but the loads must not be marked as aligned
> loads. That is, the mid-end must use the misaligned code hooks.
>
> So I don't think this is a target bug.
Hmm, indeed. It's a vectorizer bug that always communicates at least
element alignment (thus doesn't properly code-gen for the packed_p
case)
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment due to SRA
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (11 preceding siblings ...)
2015-07-30 14:39 ` rguenther at suse dot de
@ 2015-07-30 14:41 ` rguenth at gcc dot gnu.org
2015-07-31 9:20 ` rguenth at gcc dot gnu.org
` (6 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-07-30 14:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
For both loads and stores we do
else if (DR_MISALIGNMENT (first_dr) == -1)
{
TREE_TYPE (data_ref)
= build_aligned_type (TREE_TYPE (data_ref),
TYPE_ALIGN (elem_type));
align = TYPE_ALIGN_UNIT (elem_type);
misalign = 0;
but TYPE_ALIGN (elem_type) doesn't always hold.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment due to SRA
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (12 preceding siblings ...)
2015-07-30 14:41 ` rguenth at gcc dot gnu.org
@ 2015-07-31 9:20 ` rguenth at gcc dot gnu.org
2015-07-31 9:23 ` ktkachov at gcc dot gnu.org
` (5 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-07-31 9:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 36099
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36099&action=edit
patch
I am testing the attached (on x86_64-linux), it fixes the testcase with a cross
to arm. Full testing on arm appreciated.
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment due to SRA
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (13 preceding siblings ...)
2015-07-31 9:20 ` rguenth at gcc dot gnu.org
@ 2015-07-31 9:23 ` ktkachov at gcc dot gnu.org
2015-07-31 13:21 ` ktkachov at gcc dot gnu.org
` (4 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-07-31 9:23 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #17 from ktkachov at gcc dot gnu.org ---
(In reply to Richard Biener from comment #16)
> Created attachment 36099 [details]
> patch
>
> I am testing the attached (on x86_64-linux), it fixes the testcase with a
> cross to arm. Full testing on arm appreciated.
I'll give it a test run
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment due to SRA
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (14 preceding siblings ...)
2015-07-31 9:23 ` ktkachov at gcc dot gnu.org
@ 2015-07-31 13:21 ` ktkachov at gcc dot gnu.org
2015-08-03 7:14 ` [Bug tree-optimization/66917] [4.9/5 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment rguenth at gcc dot gnu.org
` (3 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2015-07-31 13:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #18 from ktkachov at gcc dot gnu.org ---
(In reply to ktkachov from comment #17)
> (In reply to Richard Biener from comment #16)
> > Created attachment 36099 [details]
> > patch
> >
> > I am testing the attached (on x86_64-linux), it fixes the testcase with a
> > cross to arm. Full testing on arm appreciated.
>
> I'll give it a test run
A bootstrap and make check on arm-none-linux-gnueabihf looks fine
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (16 preceding siblings ...)
2015-08-03 7:14 ` [Bug tree-optimization/66917] [4.9/5 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment rguenth at gcc dot gnu.org
@ 2015-08-03 7:14 ` rguenth at gcc dot gnu.org
2015-09-24 20:57 ` ctice at gcc dot gnu.org
2015-09-28 10:46 ` rguenth at gcc dot gnu.org
19 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-08-03 7:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to work| |6.0
Summary|[4.9/5/6 regression] ARM: |[4.9/5 regression] ARM:
|NEON: memcpy compiles to |NEON: memcpy compiles to
|vst1 with incorrect |vst1 with incorrect
|alignment due to SRA |alignment
Known to fail|6.0 |
--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk sofar.
--- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Mon Aug 3 07:13:36 2015
New Revision: 226487
URL: https://gcc.gnu.org/viewcvs?rev=226487&root=gcc&view=rev
Log:
2015-08-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/66917
* tree-vectorizer.h (struct dataref_aux): Add base_element_aligned
field.
(DR_VECT_AUX): New macro.
(set_dr_misalignment): Adjust.
(dr_misalignment): Likewise.
* tree-vect-data-refs.c (vect_compute_data_ref_alignment):
Compute whether the base is at least element aligned.
* tree-vect-stmts.c (ensure_base_align): Adjust.
(vectorizable_store): If the base is not element aligned
preserve alignment of the original access if misalignment is unknown.
(vectorizable_load): Likewise.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-vect-data-refs.c
trunk/gcc/tree-vect-stmts.c
trunk/gcc/tree-vectorizer.h
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (15 preceding siblings ...)
2015-07-31 13:21 ` ktkachov at gcc dot gnu.org
@ 2015-08-03 7:14 ` rguenth at gcc dot gnu.org
2015-08-03 7:14 ` rguenth at gcc dot gnu.org
` (2 subsequent siblings)
19 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-08-03 7:14 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to work| |6.0
Summary|[4.9/5/6 regression] ARM: |[4.9/5 regression] ARM:
|NEON: memcpy compiles to |NEON: memcpy compiles to
|vst1 with incorrect |vst1 with incorrect
|alignment due to SRA |alignment
Known to fail|6.0 |
--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk sofar.
--- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Mon Aug 3 07:13:36 2015
New Revision: 226487
URL: https://gcc.gnu.org/viewcvs?rev=226487&root=gcc&view=rev
Log:
2015-08-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/66917
* tree-vectorizer.h (struct dataref_aux): Add base_element_aligned
field.
(DR_VECT_AUX): New macro.
(set_dr_misalignment): Adjust.
(dr_misalignment): Likewise.
* tree-vect-data-refs.c (vect_compute_data_ref_alignment):
Compute whether the base is at least element aligned.
* tree-vect-stmts.c (ensure_base_align): Adjust.
(vectorizable_store): If the base is not element aligned
preserve alignment of the original access if misalignment is unknown.
(vectorizable_load): Likewise.
Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-vect-data-refs.c
trunk/gcc/tree-vect-stmts.c
trunk/gcc/tree-vectorizer.h
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (17 preceding siblings ...)
2015-08-03 7:14 ` rguenth at gcc dot gnu.org
@ 2015-09-24 20:57 ` ctice at gcc dot gnu.org
2015-09-28 10:46 ` rguenth at gcc dot gnu.org
19 siblings, 0 replies; 20+ messages in thread
From: ctice at gcc dot gnu.org @ 2015-09-24 20:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #21 from ctice at gcc dot gnu.org ---
Author: ctice
Date: Thu Sep 24 20:56:29 2015
New Revision: 228099
URL: https://gcc.gnu.org/viewcvs?rev=228099&root=gcc&view=rev
Log:
Backport r226487 from GCC FSF mainline:
r226487 | rguenth | 2015-08-03 00:13:36 -0700 (Mon, 03 Aug 2015) | 15 lines
2015-08-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/66917
* tree-vectorizer.h (struct dataref_aux): Add base_element_aligned
field.
(DR_VECT_AUX): New macro.
(set_dr_misalignment): Adjust.
(dr_misalignment): Likewise.
* tree-vect-data-refs.c (vect_compute_data_ref_alignment):
Compute whether the base is at least element aligned.
* tree-vect-stmts.c (ensure_base_align): Adjust.
(vectorizable_store): If the base is not element aligned
preserve alignment of the original access if misalignment is unknown.
(vectorizable_load): Likewise.
Modified:
branches/google/gcc-4_9-mobile/gcc/tree-vect-data-refs.c
branches/google/gcc-4_9-mobile/gcc/tree-vect-stmts.c
branches/google/gcc-4_9-mobile/gcc/tree-vectorizer.h
^ permalink raw reply [flat|nested] 20+ messages in thread
* [Bug tree-optimization/66917] [4.9/5 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
` (18 preceding siblings ...)
2015-09-24 20:57 ` ctice at gcc dot gnu.org
@ 2015-09-28 10:46 ` rguenth at gcc dot gnu.org
19 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2015-09-28 10:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #22 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Mon Sep 28 10:45:55 2015
New Revision: 228199
URL: https://gcc.gnu.org/viewcvs?rev=228199&root=gcc&view=rev
Log:
2015-09-28 Richard Biener <rguenther@suse.de>
Backport from mainline
2015-08-03 Richard Biener <rguenther@suse.de>
PR tree-optimization/66917
* tree-vectorizer.h (struct dataref_aux): Add base_element_aligned
field.
(DR_VECT_AUX): New macro.
(set_dr_misalignment): Adjust.
(dr_misalignment): Likewise.
* tree-vect-data-refs.c (vect_compute_data_ref_alignment):
Compute whether the base is at least element aligned.
* tree-vect-stmts.c (ensure_base_align): Adjust.
(vectorizable_store): If the base is not element aligned
preserve alignment of the original access if misalignment is unknown.
(vectorizable_load): Likewise.
2015-09-16 Richard Biener <rguenther@suse.de>
PR middle-end/67442
* fold-const.c (extract_muldiv_1): Properly extend multiplication
result before builting a tree via wide_int_to_tree.
* gcc.dg/torture/pr67442.c: New testcase.
Added:
branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr67442.c
Modified:
branches/gcc-5-branch/gcc/ChangeLog
branches/gcc-5-branch/gcc/fold-const.c
branches/gcc-5-branch/gcc/testsuite/ChangeLog
branches/gcc-5-branch/gcc/tree-vect-data-refs.c
branches/gcc-5-branch/gcc/tree-vect-stmts.c
branches/gcc-5-branch/gcc/tree-vectorizer.h
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2015-09-28 10:46 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <bug-66917-4@http.gcc.gnu.org/bugzilla/>
2015-07-19 18:28 ` [Bug target/66917] ARM: NEON: memcpy compiles to vld1 and vst1 with incorrect alignment mikpelinux at gmail dot com
2015-07-29 8:21 ` [Bug tree-optimization/66917] " ramana at gcc dot gnu.org
2015-07-29 8:23 ` [Bug tree-optimization/66917] [4.9/5/6 regression] " ramana at gcc dot gnu.org
2015-07-29 14:03 ` ktkachov at gcc dot gnu.org
2015-07-29 14:37 ` rguenth at gcc dot gnu.org
2015-07-29 14:41 ` ktkachov at gcc dot gnu.org
2015-07-29 14:47 ` ktkachov at gcc dot gnu.org
2015-07-30 7:20 ` [Bug tree-optimization/66917] [4.9/5/6 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment due to SRA rguenth at gcc dot gnu.org
2015-07-30 7:43 ` [Bug target/66917] " rguenth at gcc dot gnu.org
2015-07-30 7:48 ` rguenth at gcc dot gnu.org
2015-07-30 14:11 ` [Bug tree-optimization/66917] " rearnsha at gcc dot gnu.org
2015-07-30 14:39 ` rguenther at suse dot de
2015-07-30 14:41 ` rguenth at gcc dot gnu.org
2015-07-31 9:20 ` rguenth at gcc dot gnu.org
2015-07-31 9:23 ` ktkachov at gcc dot gnu.org
2015-07-31 13:21 ` ktkachov at gcc dot gnu.org
2015-08-03 7:14 ` [Bug tree-optimization/66917] [4.9/5 regression] ARM: NEON: memcpy compiles to vst1 with incorrect alignment rguenth at gcc dot gnu.org
2015-08-03 7:14 ` rguenth at gcc dot gnu.org
2015-09-24 20:57 ` ctice at gcc dot gnu.org
2015-09-28 10:46 ` rguenth at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).