public inbox for gcc-bugs@sourceware.org
* [Bug target/65952] New: [AArch64] Will not vectorize copying pointers
From: alalaw01 at gcc dot gnu.org @ 2015-04-30 17:41 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952
Bug ID: 65952
Summary: [AArch64] Will not vectorize copying pointers
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: alalaw01 at gcc dot gnu.org
Target Milestone: ---
Target: aarch64
typedef struct {
  int a, b, c, d;
} my_struct;

my_struct *array;
my_struct *ptrs[4];

void
loop ()
{
  for (int i = 0; i < 4; i++)
    ptrs[i] = &array[i];
}
This loop vectorizes on x86, but not on AArch64. From -fdump-tree-vect-details:
test.c:13:3: note: vectorization factor = 4
...
test.c:13:3: note: not vectorized: relevant stmt not supported: _6 = _5 * 16;
test.c:13:3: note: bad operation or unsupported loop bound.
test.c:13:3: note: ***** Re-trying analysis with vector size 8
...
test.c:13:3: note: not vectorized: no vectype for stmt: ptrs[i_12] = _7;
scalar_type: struct my_struct *
test.c:13:3: note: bad data references.
test.c:11:1: note: vectorized 0 loops in function.
* [Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses
From: pinskia at gcc dot gnu.org @ 2015-04-30 18:46 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Depends on|                            |65951
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think the issue is really 65951.
>_6 = _5 * 16;
That is a size_t multiply, which on LP64 is an unsigned 64-bit multiply.
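As a rough illustration of where that multiply comes from: `&array[i]` is the base address plus the index widened to `size_t` and scaled by `sizeof (my_struct)`, which is 16 when `int` is 4 bytes. The sketch below is not from the report; `addr_of_element` and `check_scaling` are hypothetical names, and the integer view of addresses assumes a flat address space such as the usual AArch64 or x86-64 ABIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  int a, b, c, d;
} my_struct;

/* Hypothetical helper: spell out the arithmetic hidden in &base[i].
   The index is widened to size_t first and then scaled, mirroring
   _5 = (long unsigned int) i_12; _6 = _5 * 16; in the vectorizer dump.  */
static uintptr_t
addr_of_element (my_struct *base, int i)
{
  size_t offset = (size_t) i * sizeof (my_struct);
  return (uintptr_t) base + offset;
}

/* Assert that the explicit arithmetic matches &storage[i] for each index.
   Assumes a flat address space (an assumption, not guaranteed by ISO C).  */
void
check_scaling (void)
{
  static my_struct storage[4];
  for (int i = 0; i < 4; i++)
    assert (addr_of_element (storage, i) == (uintptr_t) &storage[i]);
}
```

On an ILP32 target `size_t` is 32 bits wide, so the same source produces a 32-bit multiply instead.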
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65951
[Bug 65951] [AArch64] Will not vectorize multiplication by long constant
* [Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses
From: pinskia at gcc dot gnu.org @ 2015-04-30 18:48 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
If we use -mabi=ilp32, we can see that it is vectorized:
loop:
        adrp    x0, array
        adrp    x1, ptrs
        ldr     q1, .LC0
        add     x3, x1, :lo12:ptrs
        ldr     w2, [x0, #:lo12:array]
        dup     v0.4s, w2
        add     v2.4s, v0.4s, v1.4s
        str     q2, [x3]
        ret
* [Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64
From: pinskia at gcc dot gnu.org @ 2015-04-30 20:34 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-04-30
            Version|5.0                         |5.1.0
            Summary|[AArch64] Will not          |[AArch64] Will not
                   |vectorize storing induction |vectorize storing induction
                   |of pointer addresses        |of pointer addresses for
                   |                            |LP64
     Ever confirmed|0                           |1
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.
* [Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64
From: alalaw01 at gcc dot gnu.org @ 2015-05-01 9:52 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952
--- Comment #4 from alalaw01 at gcc dot gnu.org ---
Hmmm. Yes. Well, x * 16 = x << 4, of course. Or, in theory something like VRP
could let us see that
# i_12 = PHI <i_9(4), 0(2)>
# ivtmp_18 = PHI <ivtmp_17(4), 4(2)>
_5 = (long unsigned int) i_12;
_6 = _5 * 16;
_7 = pretmp_11 + _6;
ptrs[i_12] = _7;
i_9 = i_12 + 1;
could be rewritten to something like
# i_12 = PHI <i_9(4), 0(2)>
# ivtmp_18 = PHI <ivtmp_17(4), 4(2)>
_5 = i_12 * 16;
_6 = (long unsigned int) _5;
_7 = pretmp_11 + _6;
ptrs[i_12] = _7;
i_9 = i_12 + 1;
which would then be vectorizable.
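The rewrite above can be sketched in C as follows. This is only an illustration under the assumption that the index stays in a small known range (here 0 to 3), so that the narrow multiply cannot overflow; the function names are hypothetical.

```c
#include <assert.h>

/* Widen first, then multiply: the shape of the original GIMPLE.  */
static unsigned long
scale_wide (int i)
{
  unsigned long wide = (unsigned long) i;
  return wide * 16;
}

/* Multiply in int, then widen: the shape of the proposed rewrite.
   Only equivalent when i * 16 cannot overflow int and i is non-negative,
   which is what range information such as VRP would have to establish.  */
static unsigned long
scale_narrow (int i)
{
  int narrow = i * 16;
  return (unsigned long) narrow;
}

/* The same scaling written as a shift, since x * 16 == x << 4.  */
static unsigned long
scale_shift (int i)
{
  return (unsigned long) i << 4;
}

/* Assert that all three forms agree over the loop's index range.  */
void
check_forms (void)
{
  for (int i = 0; i < 4; i++)
    {
      unsigned long expected = scale_wide (i);
      assert (scale_narrow (i) == expected);
      assert (scale_shift (i) == expected);
    }
}
```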
* [Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64
From: alalaw01 at gcc dot gnu.org @ 2015-06-17 12:59 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952
--- Comment #5 from alalaw01 at gcc dot gnu.org ---
The above example tends to get fully unrolled. Even on an example with 32 ptrs
rather than 4, the vectorizer still fails because of the multiplication.
However, the multiplication is gone by the final tree stage, as it is
strength-reduced down to an add; I believe this -fdump-tree-optimized output
would be perfectly vectorizable:
loop ()
{
unsigned long ivtmp.12;
unsigned long ivtmp.10;
void * _4;
struct my_struct * _7;
struct my_struct * pretmp_11;
unsigned long _20;
<bb 2>:
pretmp_11 = array;
ivtmp.10_16 = (unsigned long) pretmp_11;
ivtmp.12_2 = (unsigned long) &ptrs;
_20 = (unsigned long) &MEM[(void *)&ptrs + 256B];
<bb 3>:
# ivtmp.10_10 = PHI <ivtmp.10_1(3), ivtmp.10_16(2)>
# ivtmp.12_15 = PHI <ivtmp.12_14(3), ivtmp.12_2(2)>
_7 = (struct my_struct *) ivtmp.10_10;
_4 = (void *) ivtmp.12_15;
MEM[base: _4, offset: 0B] = _7;
ivtmp.10_1 = ivtmp.10_10 + 16;
ivtmp.12_14 = ivtmp.12_15 + 8;
if (ivtmp.12_14 != _20)
goto <bb 3>;
else
goto <bb 4>;
<bb 4>:
return;
}
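For comparison, here is a hand-written C analogue of the dump above, next to the indexed 32-element source form. It is a sketch, not compiler output: the `storage` backing array, the function names, and the use of typed pointers instead of `unsigned long` induction variables are assumptions made for illustration, and the 16-byte and 8-byte steps assume LP64 with a 4-byte `int`.

```c
#include <assert.h>

#define N 32

typedef struct {
  int a, b, c, d;
} my_struct;

static my_struct storage[N];          /* hypothetical backing store */
static my_struct *array = storage;
static my_struct *ptrs[N];

/* Source-level form: each iteration scales the index by sizeof (my_struct).  */
static void
loop_indexed (void)
{
  for (int i = 0; i < N; i++)
    ptrs[i] = &array[i];
}

/* Strength-reduced form: two induction variables advanced by addition,
   with no multiply left in the loop body.  */
static void
loop_strength_reduced (void)
{
  my_struct *src = array;       /* role of ivtmp.10, steps by 16 bytes */
  my_struct **dst = ptrs;       /* role of ivtmp.12, steps by 8 bytes  */
  my_struct **end = ptrs + N;   /* role of _20                         */
  while (dst != end)
    *dst++ = src++;
}

/* Assert that both forms store the same pointers.  */
void
check_loops_agree (void)
{
  my_struct *expected[N];
  loop_indexed ();
  for (int i = 0; i < N; i++)
    {
      expected[i] = ptrs[i];
      ptrs[i] = 0;
    }
  loop_strength_reduced ();
  for (int i = 0; i < N; i++)
    assert (ptrs[i] == expected[i]);
}
```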
* [Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64
From: rguenth at gcc dot gnu.org @ 2015-06-17 13:19 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
So aarch64 has no DImode vectors? Or just no DImode multiply (but it has a
DImode vector shift?). If so this could be handled by a vectorizer pattern
transforming the multiply to a shift.
* [Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64
From: alalaw01 at gcc dot gnu.org @ 2015-06-17 13:21 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952
--- Comment #7 from alalaw01 at gcc dot gnu.org ---
(In reply to Richard Biener from comment #6)
> So aarch64 has no DImode vectors? Or just no DImode multiply (but it has a
> DImode vector shift?).
Yes, the latter.
* [Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64
From: alalaw01 at gcc dot gnu.org @ 2015-06-17 13:22 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952
--- Comment #8 from alalaw01 at gcc dot gnu.org ---
(In reply to alalaw01 from comment #7)
> (In reply to Richard Biener from comment #6)
> > So aarch64 has no DImode vectors? Or just no DImode multiply (but it has a
> > DImode vector shift?).
>
> Yes, the latter.
Sorry, aarch64 has a DImode multiply, but no V2DImode multiply; and it has
V2DImode shifts.
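To make the shift alternative concrete, here is a small sketch using the GNU C vector extension (supported by GCC and Clang). The `v2di` typedef and the function names are illustrative choices, not taken from GCC's sources; the point is only that a lane-wise shift by 4 gives the same result as a lane-wise multiply by 16.

```c
#include <assert.h>
#include <stdint.h>

/* Two 64-bit lanes, analogous to V2DImode (GNU C vector extension).  */
typedef uint64_t v2di __attribute__ ((vector_size (16)));

/* Scale both lanes by 16 using a lane-wise shift instead of a multiply.  */
static v2di
scale_by_16 (v2di x)
{
  v2di counts = { 4, 4 };
  return x << counts;
}

/* Assert that the shifted lanes equal index * 16.  */
void
check_scale_by_16 (void)
{
  v2di idx = { 2, 3 };
  v2di scaled = scale_by_16 (idx);
  assert (scaled[0] == 2 * 16);
  assert (scaled[1] == 3 * 16);
}
```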
* [Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64
From: vekumar at gcc dot gnu.org @ 2015-07-20 10:30 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952
--- Comment #10 from vekumar at gcc dot gnu.org ---
With the patch I get
loop:
        adrp    x0, array
        ldr     q1, .LC0
        ldr     q2, .LC1
        adrp    x1, ptrs
        add     x1, x1, :lo12:ptrs
        ldr     x0, [x0, #:lo12:array]
        dup     v0.2d, x0
        add     v1.2d, v0.2d, v1.2d     <== vectorized
        add     v0.2d, v0.2d, v2.2d     <== vectorized
        str     q1, [x1]
        str     q0, [x1, 16]
        ret
        .size   loop, .-loop
        .align  4
.LC0:
        .xword  0
        .xword  16
        .align  4
.LC1:
        .xword  32
        .xword  48
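Read back into C, the assembly above appears to broadcast the value of `array` into both lanes, add the constant byte offsets from `.LC0` and `.LC1`, and store the four results. The following is a scalar sketch of that computation, with a loop over lanes standing in for the two-lane SIMD adds; `storage`, `lc0`, `lc1`, and the function names are hypothetical, and it assumes LP64 with a flat address space.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
  int a, b, c, d;
} my_struct;

static my_struct storage[4];          /* hypothetical backing store */
static my_struct *array = storage;
static my_struct *ptrs[4];

/* Byte offsets matching the .LC0 and .LC1 literal pools.  */
static const uint64_t lc0[2] = { 0, 16 };
static const uint64_t lc1[2] = { 32, 48 };

static void
loop_as_vectorized (void)
{
  uintptr_t base = (uintptr_t) array;   /* ldr x0, then dup v0.2d, x0 */
  for (int lane = 0; lane < 2; lane++)
    {
      /* First add and str: ptrs[0] and ptrs[1].  */
      ptrs[lane] = (my_struct *) (uintptr_t) (base + lc0[lane]);
      /* Second add and str: ptrs[2] and ptrs[3].  */
      ptrs[lane + 2] = (my_struct *) (uintptr_t) (base + lc1[lane]);
    }
}

/* Assert that the result matches the original scalar loop.  */
void
check_vectorized_form (void)
{
  assert (sizeof (my_struct) == 16);
  loop_as_vectorized ();
  for (int i = 0; i < 4; i++)
    assert (ptrs[i] == &array[i]);
}
```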
* [Bug target/65952] [AArch64] Will not vectorize storing induction of pointer addresses for LP64
From: vekumar at gcc dot gnu.org @ 2015-08-10 7:05 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65952
vekumar at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
--- Comment #11 from vekumar at gcc dot gnu.org ---
This is fixed by the following patch:
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=226675
Vectorize mult expressions with power-of-2 constants via shift, for targets
that have no vector multiplication support.
2015-08-06 Venkataramanan Kumar <Venkataramanan.kumar@amd.com>
* tree-vect-patterns.c (vect_recog_mult_pattern): New function
for vectorizing multiplication patterns.
* tree-vectorizer.h: Adjust the number of patterns.
2015-08-06 Venkataramanan Kumar <Venkataramanan.kumar@amd.com>
* gcc.dg/vect/vect-mult-pattern-1.c: New test.
* gcc.dg/vect/vect-mult-pattern-2.c: New test.
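The idea behind the patch can be sketched in plain C as below. This is not the code of `vect_recog_mult_pattern`; it is a hypothetical scalar illustration of the decision it describes: when the constant multiplier is a power of two, use a shift by its base-2 logarithm, otherwise keep the multiply.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not GCC's code: if c is a power of two, return
   log2 (c); otherwise return -1.  */
static int
shift_for_multiplier (uint64_t c)
{
  if (c == 0 || (c & (c - 1)) != 0)
    return -1;
  int shift = 0;
  while ((c >> shift) != 1)
    shift++;
  return shift;
}

/* Multiply x by c, using a shift when c is a power of two.  */
static uint64_t
mul_via_shift (uint64_t x, uint64_t c)
{
  int shift = shift_for_multiplier (c);
  return shift >= 0 ? x << shift : x * c;
}

/* Assert the expected behaviour on a few constants.  */
void
check_mul_via_shift (void)
{
  assert (shift_for_multiplier (16) == 4);
  assert (shift_for_multiplier (1) == 0);
  assert (shift_for_multiplier (12) == -1);
  assert (shift_for_multiplier (0) == -1);
  assert (mul_via_shift (3, 16) == 48);
  assert (mul_via_shift (3, 12) == 36);
}
```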