public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/113594] New: Missing partial sum optimization in the vectorizer.
From: liuhongt at gcc dot gnu.org @ 2024-01-25 3:48 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113594
Bug ID: 113594
Summary: Missing partial sum optimization in the vectorizer.
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
Target Milestone: ---
double
foo (short* p, int n)
{
  double sum = 0;
  for (int i = 0; i != n; i++)
    sum += p[i] * (double)p[i];
  return sum;
}
With fast-math enabled, the vectorizer generates:
<bb 5> [local count: 860067200]:
# vect_sum_16.8_44 = PHI <vect_sum_12.15_61(5), { 0.0, 0.0, 0.0, 0.0 }(4)>
# ivtmp.35_152 = PHI <ivtmp.35_151(5), ivtmp.35_150(4)>
# DEBUG BEGIN_STMT
# DEBUG D#13 => D#14 * 2
# DEBUG D#12 => p_11(D) + D#13
_149 = (void *) ivtmp.35_152;
vect__4.11_47 = MEM <vector(16) short int> [(short int *)_149];
# DEBUG D#11 => *D#12
vect__5.13_48 = [vec_unpack_lo_expr] vect__4.11_47;
vect__5.13_49 = [vec_unpack_hi_expr] vect__4.11_47;
vect__5.12_50 = [vec_unpack_float_lo_expr] vect__5.13_48;
vect__5.12_51 = [vec_unpack_float_hi_expr] vect__5.13_48;
vect__5.12_52 = [vec_unpack_float_lo_expr] vect__5.13_49;
vect__5.12_53 = [vec_unpack_float_hi_expr] vect__5.13_49;
# DEBUG D#10 => (double) D#11
vect_powmult_6.14_55 = vect__5.12_51 * vect__5.12_51;
_62 = .FMA (vect__5.12_50, vect__5.12_50, vect_powmult_6.14_55);
vect_powmult_6.14_57 = vect__5.12_53 * vect__5.12_53;
_45 = .FMA (vect__5.12_52, vect__5.12_52, vect_powmult_6.14_57);
_46 = _45 + _62;
# DEBUG D#9 => D#10 * D#10
vect_sum_12.15_61 = vect_sum_16.8_44 + _46;
# DEBUG sum => D#8
# DEBUG BEGIN_STMT
# DEBUG i => NULL
# DEBUG sum => D#8
# DEBUG BEGIN_STMT
ivtmp.35_151 = ivtmp.35_152 + 32;
if (_18 != ivtmp.35_151)
goto <bb 5>; [89.00%]
else
goto <bb 8>; [11.00%]
But it could be better if the loop kept four partial sums and combined them only after the loop exits:
....
vect_powmult_6.14_55 = .FMA (vect__5.12_51, vect__5.12_51, 0);
_62 = .FMA (vect__5.12_50, vect__5.12_50, 0);
vect_powmult_6.14_57 = .FMA (vect__5.12_53, vect__5.12_53, 0);
_45 = .FMA (vect__5.12_52, vect__5.12_52, 0);
ivtmp.35_151 = ivtmp.35_152 + 32;
if (_18 != ivtmp.35_151)
goto <bb 5>; [89.00%]
else
goto <bb 8>; [11.00%]
<bb 8>
_tmp1 = vect_powmult_6.14_55 + _62;
_tmp2 = vect_powmult_6.14_57 + _45;
_tmp3 = _tmp1 + _tmp2;
_tmp4_scalar = .REDUCE_SUM (_tmp3);
* [Bug tree-optimization/113594] Missing partial sum optimization in the vectorizer.
From: liuhongt at gcc dot gnu.org @ 2024-01-25 3:50 UTC
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113594
Hongtao Liu <liuhongt at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED
--- Comment #1 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
dup.
*** This bug has been marked as a duplicate of bug 113592 ***