* [Bug tree-optimization/103941] uavgv2qi3_ceil is not used (SLP costing and patterns vs live stmts)
2022-01-07 15:31 [Bug tree-optimization/103941] New: uavgv2qi3_ceil is not used ubizjak at gmail dot com
@ 2022-01-10 10:53 ` rguenth at gcc dot gnu.org
2022-01-10 12:49 ` rguenth at gcc dot gnu.org
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-01-10 10:53 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                      |Added
----------------------------------------------------------------------------
             Target|                             |x86_64-*-* i?86-*-*
     Ever confirmed|0                            |1
             Status|UNCONFIRMED                  |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
   Last reconfirmed|                             |2022-01-10
             Blocks|                             |53947
            Summary|uavgv2qi3_ceil is not used   |uavgv2qi3_ceil is not used
                   |                             |(SLP costing and patterns
                   |                             |vs live stmts)
           Keywords|                             |missed-optimization
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
t.c:8:11: note: Costing subgraph:
t.c:8:11: note: node 0x409a000 (max_nunits=2, refcnt=1)
t.c:8:11: note: op template: ur[0] = _23;
t.c:8:11: note: stmt 0 ur[0] = _23;
t.c:8:11: note: stmt 1 ur[1] = _35;
t.c:8:11: note: children 0x409a088
t.c:8:11: note: node 0x409a088 (max_nunits=2, refcnt=1)
t.c:8:11: note: op template: patt_58 = (unsigned char) patt_56;
t.c:8:11: note: stmt 0 patt_58 = (unsigned char) patt_56;
t.c:8:11: note: stmt 1 patt_71 = (unsigned char) patt_69;
t.c:8:11: note: children 0x409a110
t.c:8:11: note: node 0x409a110 (max_nunits=2, refcnt=1)
t.c:8:11: note: op template: patt_56 = .AVG_CEIL (_16, _18);
t.c:8:11: note: stmt 0 patt_56 = .AVG_CEIL (_16, _18);
t.c:8:11: note: stmt 1 patt_69 = .AVG_CEIL (_28, _30);
t.c:8:11: note: children 0x409a220 0x409a198
t.c:8:11: note: node 0x409a220 (max_nunits=2, refcnt=1)
t.c:8:11: note: op template: _16 = ua[0];
t.c:8:11: note: stmt 0 _16 = ua[0];
t.c:8:11: note: stmt 1 _28 = ua[1];
t.c:8:11: note: node 0x409a198 (max_nunits=2, refcnt=1)
t.c:8:11: note: op template: _18 = ub[0];
t.c:8:11: note: stmt 0 _18 = ub[0];
t.c:8:11: note: stmt 1 _30 = ub[1];
t.c:8:11: note: Cost model analysis:
_23 1 times scalar_store costs 12 in body
_35 1 times scalar_store costs 12 in body
(unsigned char) _22 1 times scalar_stmt costs 4 in body
(unsigned char) _34 1 times scalar_stmt costs 4 in body
ua[0] 1 times vector_load costs 12 in body
ub[0] 1 times vector_load costs 12 in body
.AVG_CEIL (_16, _18) 1 times vector_stmt costs 4 in body
_23 1 times vector_store costs 12 in body
ua[0] 1 times vec_to_scalar costs 4 in epilogue
ua[1] 1 times vec_to_scalar costs 4 in epilogue
ub[0] 1 times vec_to_scalar costs 4 in epilogue
ub[1] 1 times vec_to_scalar costs 4 in epilogue
t.c:8:11: note: Cost model analysis for part in loop 0:
Vector cost: 56
Scalar cost: 32
t.c:8:11: missed: not vectorized: vectorization is not profitable.
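Adding up the cost lines above reproduces the two totals. A minimal sketch in C, with the numbers copied from the dump; the struct and function names are illustrative only and are not GCC's:

```c
#include <stddef.h>

/* One line of the "Cost model analysis" output: COUNT entries of COST each. */
struct cost_line { const char *what; int count; int cost; };

/* Vector side, as listed in the dump (body plus epilogue). */
static const struct cost_line vector_side[] = {
  { "vector_load ua[0]",        1, 12 },
  { "vector_load ub[0]",        1, 12 },
  { "vector_stmt .AVG_CEIL",    1,  4 },
  { "vector_store",             1, 12 },
  { "vec_to_scalar (epilogue)", 4,  4 },
};

/* Scalar side, as listed in the dump. */
static const struct cost_line scalar_side[] = {
  { "scalar_store",                  2, 12 },
  { "scalar_stmt (unsigned char) x", 2,  4 },
};

/* Sum count * cost over N cost lines. */
static int total (const struct cost_line *lines, size_t n)
{
  int sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += lines[i].count * lines[i].cost;
  return sum;
}

int vector_cost (void)
{
  return total (vector_side, sizeof vector_side / sizeof vector_side[0]);
}

int scalar_cost (void)
{
  return total (scalar_side, sizeof scalar_side / sizeof scalar_side[0]);
}
```

The scalar side has no entries for the loads from ua and ub or for the averaging itself, while the vector side pays 16 for the four vec_to_scalar extracts in the epilogue.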
It looks like the scalar costing is off: the scalar loads from ua and ub are
considered live, so they are not costed on the scalar side. This is possibly
an artifact of patterns.
It's vectorized fine with -fno-vect-cost-model.
I will have a look, though possibly not for GCC 12.
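For reference, source of roughly the following shape yields the statements in the dump above. This is a sketch reconstructed from the dump, not the reporter's testcase: the array names ur, ua and ub come from the dump, while the array size and the function name are assumptions.

```c
/* Array names follow the dump; the size 16 is an assumption. */
unsigned char ur[16], ua[16], ub[16];

/* Rounding-up average of two adjacent lanes.  The operands are promoted
   to int, so the + 1 cannot overflow; this is the shape the vectorizer
   can recognize as .AVG_CEIL followed by a narrowing conversion.  */
void avg_ceil_2 (void)
{
  ur[0] = (ua[0] + ub[0] + 1) >> 1;
  ur[1] = (ua[1] + ub[1] + 1) >> 1;
}
```

Compiling with something like `gcc -O2 -ftree-slp-vectorize -fdump-tree-slp-details` should show the costing in the dump file; adding -fno-vect-cost-model bypasses the profitability check.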
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
From: rguenth at gcc dot gnu.org @ 2022-01-10 12:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think I've seen this before: the use in the conversion is elided in the
vector path by recognizing a pattern of a pattern. That makes it not part of
the SLP tree and thus left as SLP_TYPE (..) = loop_vect, which fools the live
computation. vect_detect_hybrid_slp now does this in a more correct way, but
the original worklist seeding has to be done differently for BB SLP.
From: cvs-commit at gcc dot gnu.org @ 2022-01-12 19:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:
https://gcc.gnu.org/g:cb46559cea1d554cef1138db5bfbdd0647ffbc0d
commit r12-6535-gcb46559cea1d554cef1138db5bfbdd0647ffbc0d
Author: Uros Bizjak <ubizjak@gmail.com>
Date: Wed Jan 12 20:57:12 2022 +0100
testsuite: Compile gcc.target/i386/pr103861-3.c with -fno-vect-cost-model [PR103941]
2022-01-12 Uroš Bizjak <ubizjak@gmail.com>
gcc/testsuite/ChangeLog:
PR target/103941
* gcc.target/i386/pr103861-3.c (dg-options): Add
-fno-vect-cost-model.
From: rguenth at gcc dot gnu.org @ 2022-01-26 12:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Another testcase where this occurs:
void foo (int *c, float *x, float *y)
{
  c[0] = x[0] < y[0];
  c[1] = x[1] < y[1];
  c[2] = x[2] < y[2];
  c[3] = x[3] < y[3];
}
From: cvs-commit at gcc dot gnu.org @ 2022-04-19 14:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:353434b65ef7972172597d232ae17022d9a57244
commit r12-8195-g353434b65ef7972172597d232ae17022d9a57244
Author: Richard Biener <rguenther@suse.de>
Date: Wed Apr 13 13:49:45 2022 +0200
tree-optimization/104010 - fix SLP scalar costing with patterns
When doing BB vectorization the scalar cost compute is derailed
by patterns, causing lanes to be considered live and thus not
costed on the scalar side. For the testcase in PR104010 this
prevents vectorization which was done by GCC 11. PR103941
shows similar cases of missed optimizations that are fixed by
this patch.
2022-04-13 Richard Biener <rguenther@suse.de>
PR tree-optimization/104010
PR tree-optimization/103941
* tree-vect-slp.cc (vect_bb_slp_scalar_cost): When
we run into stmts in patterns continue walking those
for uses outside of the vectorized region instead of
marking the lane live.
* gcc.target/i386/pr103941-1.c: New testcase.
* gcc.target/i386/pr103941-2.c: Likewise.
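The change described in this ChangeLog entry can be pictured with a toy model. The sketch below is not the code in tree-vect-slp.cc: the struct, its fields and both function names are hypothetical, and it only illustrates the idea of walking through pattern stmts instead of treating them as uses outside the vectorized region.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy statement, NOT GCC's stmt_vec_info: just enough to show the idea. */
struct toy_stmt {
  bool vectorized;   /* covered by a node of the SLP tree */
  bool in_pattern;   /* scalar stmt absorbed by a recognized pattern */
  size_t n_uses;
  const struct toy_stmt *uses[4];
};

/* Before: any use that is not itself vectorized marks the lane live,
   so the defining stmt is not costed on the scalar side.  */
bool lane_live_before (const struct toy_stmt *def)
{
  for (size_t i = 0; i < def->n_uses; i++)
    if (!def->uses[i]->vectorized)
      return true;
  return false;
}

/* After: a use absorbed by a pattern is walked through, and only a use
   that really lies outside the vectorized region marks the lane live.  */
bool lane_live_after (const struct toy_stmt *def)
{
  for (size_t i = 0; i < def->n_uses; i++)
    {
      const struct toy_stmt *use = def->uses[i];
      if (use->vectorized)
        continue;
      if (use->in_pattern)
        {
          if (lane_live_after (use))
            return true;
          continue;
        }
      return true;   /* genuine use outside the vectorized region */
    }
  return false;
}
```

In this model a load whose only use is a conversion absorbed by a pattern, which in turn feeds a vectorized stmt, is live under the first function and not live under the second.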
From: rguenth at gcc dot gnu.org @ 2022-04-19 14:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
      Known to work|                            |12.0
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk via the PR104010 regression fix.