public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/114346] New: vectorizer generates the same IV twice
@ 2024-03-15 4:21 tnfchris at gcc dot gnu.org
2024-03-15 8:47 ` [Bug tree-optimization/114346] " rguenth at gcc dot gnu.org
From: tnfchris at gcc dot gnu.org @ 2024-03-15 4:21 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114346
Bug ID: 114346
Summary: vectorizer generates the same IV twice
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
The following example:
---
double f(int n, double *data, double b) {
double res = b;
for (int i=0;i<n;i++) {
res += data[i] * i;
}
return res;
}
---
generates at -Ofast -march=armv9-a this code:
cntd x5
mov z28.s, w5
index z30.d, #0, #1
.L4:
incw x2
add z1.s, z30.s, z28.s
ld1d z25.d, p7/z, [x3, #1, mul vl]
mov z26.d, z30.d
ld1d z2.d, p7/z, [x3]
sxtw z1.d, p7/m, z1.d
sxtw z26.d, p7/m, z26.d
scvtf z1.d, p7/m, z1.d
scvtf z26.d, p7/m, z26.d
incb x3, all, mul #2
fmla z29.d, p7/m, z25.d, z1.d
incw z30.s
fmla z31.d, p7/m, z2.d, z26.d
cmp w4, w2
bcs .L4
Note that the incw is advancing the vectorized IV of i (z30, initialized by
the index instruction), and z28 is filled with the VL.
So the incw z30.s and the add z1.s, z30.s, z28.s are calculating the same
thing.
There are other issues with this codegen, but this ticket is about the double
IVs.
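As a reading aid, here is a minimal scalar sketch of the loop shape in the
assembly above: one vector IV, with the second vector's indices derived from it
by an extra add in the body. It is not what the vectorizer emits. LANES is a
hypothetical fixed stand-in for the runtime value of cntd, the names f_scalar
and f_model are made up for illustration, and the scalar tail is an assumption
of the model. The comments map statements to instructions only roughly.

```c
#include <assert.h>

/* Hypothetical fixed lane count standing in for the runtime value of cntd. */
enum { LANES = 2 };

/* Reference: the scalar loop from the report. */
static double f_scalar(int n, const double *data, double b)
{
    double res = b;
    for (int i = 0; i < n; i++)
        res += data[i] * i;
    return res;
}

/* Model of a loop unrolled by two vectors that keeps a single vector IV. */
static double f_model(int n, const double *data, double b)
{
    double acc0[LANES] = {0}, acc1[LANES] = {0};
    int iv[LANES];
    for (int l = 0; l < LANES; l++)
        iv[l] = l;                        /* roughly: index z30.d, #0, #1 */

    int i = 0;
    for (; i + 2 * LANES <= n; i += 2 * LANES) {
        for (int l = 0; l < LANES; l++) {
            assert(iv[l] == i + l);       /* the vector IV tracks the scalar i */
            int iv_hi = iv[l] + LANES;    /* roughly: add z1.s, z30.s, z28.s */
            acc0[l] += data[i + l] * iv[l];
            acc1[l] += data[i + LANES + l] * iv_hi;
            iv[l] += 2 * LANES;           /* roughly: incw z30.s */
        }
    }

    double res = b;
    for (int l = 0; l < LANES; l++)
        res += acc0[l] + acc1[l];         /* combine the two partial sums */
    for (; i < n; i++)
        res += data[i] * i;               /* scalar tail (model assumption) */
    return res;
}
```

With inputs whose products are exactly representable, for example data =
{1, 2, ..., 10}, n = 10 and b = 0.5, both functions return the same value. In
general the two partial sums change the floating-point summation order, which
is why the original needs -Ofast.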
The vectorizer generates:
# vect_vec_iv_.7_45 = PHI <_49(6), { 0, 1, 2, ... }(15)>
_48 = vect_vec_iv_.7_45 + { POLY_INT_CST [2, 2], ... };
_71 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned int>(vect_vec_iv_.7_45);
_72 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned int>({ POLY_INT_CST [4, 4], ... });
_73 = _71 + _72;
_49 = VIEW_CONVERT_EXPR<vector([2,2]) int>(_73);
So it looks like _48 and _49 are the same value, except that _48 is computed
as a 32-bit IV and _49 is calculated as a 64-bit one and truncated to 32 bits?
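To help read the constants in the dump: a POLY_INT_CST [c0, c1] stands for
c0 + c1 * x, where x is only known at run time. The sketch below is a
hypothetical two-coefficient model of that notation (struct poly2 and its
helpers are illustrative names, not GCC's poly_int API). It only shows how such
constants add coefficient-wise.

```c
/* Hypothetical model of POLY_INT_CST [c0, c1], read as c0 + c1 * x
   for a runtime value x.  Not GCC's actual poly_int implementation. */
struct poly2 { long c0, c1; };

/* Addition is coefficient-wise. */
static struct poly2 poly2_add(struct poly2 a, struct poly2 b)
{
    struct poly2 r = { a.c0 + b.c0, a.c1 + b.c1 };
    return r;
}

/* Evaluate the constant once the runtime value x is known. */
static long poly2_eval(struct poly2 p, long x)
{
    return p.c0 + p.c1 * x;
}
```

In this model, adding [2, 2] to itself gives [4, 4], so for any x the step
[4, 4] is twice the step [2, 2].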
* [Bug tree-optimization/114346] vectorizer generates the same IV twice
2024-03-15 4:21 [Bug tree-optimization/114346] New: vectorizer generates the same IV twice tnfchris at gcc dot gnu.org
@ 2024-03-15 8:47 ` rguenth at gcc dot gnu.org
From: rguenth at gcc dot gnu.org @ 2024-03-15 8:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114346
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |rguenth at gcc dot gnu.org
Status|UNCONFIRMED |NEW
Last reconfirmed| |2024-03-15
Ever confirmed|0 |1
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I'll note the missing constant folding of
_72 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned int>({ POLY_INT_CST [4, 4], ... });
(it's just a sign change)
Note the vectorizer generates
<bb 15> [local count: 94607391]:
vect_cst__44 = { POLY_INT_CST [4, 4], ... };
vect_cst__47 = { POLY_INT_CST [2, 2], ... };
_68 = niters.4_23 - POLY_INT_CST [4, 4];
<bb 3> [local count: 860067200]:
# res_17 = PHI <res_13(6), b_9(D)(15)>
# i_19 = PHI <i_14(6), 0(15)>
# vect_res_17.6_42 = PHI <vect_res_13.15_61(6), { 0.0, ... }(15)>
# vect_res_17.6_43 = PHI <vect_res_13.15_62(6), { 0.0, ... }(15)>
# vect_vec_iv_.7_45 = PHI <_49(6), { 0, 1, 2, ... }(15)>
# vectp_data.8_50 = PHI <vectp_data.8_51(6), data_12(D)(15)>
# ivtmp_69 = PHI <ivtmp_70(6), 0(15)>
_46 = vect_vec_iv_.7_45 + vect_cst__44;
_48 = vect_vec_iv_.7_45 + vect_cst__47;
_49 = _48 + vect_cst__47;
_1 = (long unsigned int) i_19;
_2 = _1 * 8;
_3 = data_12(D) + _2;
vect__4.10_52 = MEM <vector([2,2]) double> [(double *)vectp_data.8_50];
vectp_data.8_53 = vectp_data.8_50 + POLY_INT_CST [16, 16];
vect__4.11_54 = MEM <vector([2,2]) double> [(double *)vectp_data.8_53];
_4 = *_3;
vect__5.13_55 = (vector([2,2]) signed long) vect_vec_iv_.7_45;
vect__5.12_56 = (vector([2,2]) double) vect__5.13_55;
vect__5.13_57 = (vector([2,2]) signed long) _48;
vect__5.12_58 = (vector([2,2]) double) vect__5.13_57;
_5 = (double) i_19;
vect__6.14_59 = vect__4.10_52 * vect__5.12_56;
vect__6.14_60 = vect__4.11_54 * vect__5.12_58;
_6 = _4 * _5;
vect_res_13.15_61 = vect__6.14_59 + vect_res_17.6_42;
vect_res_13.15_62 = vect__6.14_60 + vect_res_17.6_43;
res_13 = _6 + res_17;
i_14 = i_19 + 1;
vectp_data.8_51 = vectp_data.8_53 + POLY_INT_CST [16, 16];
ivtmp_70 = ivtmp_69 + POLY_INT_CST [4, 4];
if (ivtmp_70 <= _68)
goto <bb 6>; [89.00%]
so there's just one IV here (the reduction needs two)
_46 = vect_vec_iv_.7_45 + vect_cst__44;
_48 = vect_vec_iv_.7_45 + vect_cst__47;
_49 = _48 + vect_cst__47;
looks somewhat redundant, but the result you quote comes from applying VN
(value numbering) and match.pd patterns. In the original I can't see the
promotion to unsigned (possibly introduced by some match.pd pattern):
Value numbering stmt = _49 = _48 + vect_cst__47;
Setting value number of _49 to _49 (changed)
Matching expression match.pd:163, gimple-match-10.cc:57
Matching expression match.pd:163, gimple-match-10.cc:57
Matching expression match.pd:163, gimple-match-10.cc:57
Applying pattern match.pd:3561, gimple-match-8.cc:746
gimple_simplified to _71 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned int>(vect_vec_iv_.7_45);
_72 = VIEW_CONVERT_EXPR<vector([2,2]) unsigned int>({ POLY_INT_CST [4, 4], ... });
_73 = _71 + _72;
_49 = VIEW_CONVERT_EXPR<vector([2,2]) int>(_73);
It seems we think that (x + POLY_INT_CST) + POLY_INT_CST cannot be
reassociated in a signed type, and because of that we fail to value-number
both increments to the same value. Also, _46 is dead, so the first thing to do
is to see where we code-generate those initial stmts.
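For illustration only, here is a scalar C sketch of why reassociating
(x + c) + c into x + (c + c) may be done in an unsigned type. Folding the two
constants can overflow in the signed type even when neither original addition
does, while unsigned arithmetic wraps with defined behaviour. This is one
plausible reading of the dump above, not a statement of what the match.pd
pattern checks. The function names and LARGE_STEP are hypothetical.

```c
#include <limits.h>

/* A step for which c + c overflows int even though (x + c) + c need not. */
static const int LARGE_STEP = INT_MAX / 2 + 1;

/* Two-step signed form, shaped like _48 = iv + c; _49 = _48 + c.
   Each signed addition is assumed not to overflow. */
static int add_twice_signed(int x, int c)
{
    return (x + c) + c;
}

/* Reassociated form x + (c + c), computed in unsigned arithmetic and
   converted back, analogous to the VIEW_CONVERT_EXPR pair in the dump.
   The conversion back to int is exact whenever the true result fits. */
static int add_once_via_unsigned(int x, int c)
{
    unsigned int sum = (unsigned int)x + ((unsigned int)c + (unsigned int)c);
    return (int)sum;
}
```

With x = -INT_MAX and c = LARGE_STEP, both functions return 1, yet c + c is
not representable as an int, so the folded constant cannot simply be kept in
the signed type.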