public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/102176] New: BB SLP scalar costing is off with extern promoted nodes
From: rguenth at gcc dot gnu.org @ 2021-09-02 12:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102176
Bug ID: 102176
Summary: BB SLP scalar costing is off with extern promoted nodes
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---
On aarch64 we can see
int foo(long *restrict res, long *restrict foo, long a, long b)
{
  res[0] = ((foo[0] * a) >> 1) + foo[0];
  res[1] = ((foo[1] * b) >> 1) + foo[1];
}
being vectorized as
t.c:3:10: note: Costing subgraph:
t.c:3:10: note: node 0x35f03b0 (max_nunits=2, refcnt=1)
t.c:3:10: note: op template: *res_12(D) = _4;
t.c:3:10: note: stmt 0 *res_12(D) = _4;
t.c:3:10: note: stmt 1 MEM[(long int *)res_12(D) + 8B] = _8;
t.c:3:10: note: children 0x35f0440
t.c:3:10: note: node 0x35f0440 (max_nunits=2, refcnt=1)
t.c:3:10: note: op template: _4 = _1 + _3;
t.c:3:10: note: stmt 0 _4 = _1 + _3;
t.c:3:10: note: stmt 1 _8 = _5 + _7;
t.c:3:10: note: children 0x35f04d0 0x35f0560
t.c:3:10: note: node 0x35f04d0 (max_nunits=2, refcnt=2)
t.c:3:10: note: op template: _1 = *foo_10(D);
t.c:3:10: note: stmt 0 _1 = *foo_10(D);
t.c:3:10: note: stmt 1 _5 = MEM[(long int *)foo_10(D) + 8B];
t.c:3:10: note: node 0x35f0560 (max_nunits=2, refcnt=1)
t.c:3:10: note: op template: _3 = _2 >> 1;
t.c:3:10: note: stmt 0 _3 = _2 >> 1;
t.c:3:10: note: stmt 1 _7 = _6 >> 1;
t.c:3:10: note: children 0x35f05f0 0x35f0710
t.c:3:10: note: node (external) 0x35f05f0 (max_nunits=2, refcnt=1)
t.c:3:10: note: stmt 0 _2 = _1 * a_11(D);
t.c:3:10: note: stmt 1 _6 = _5 * b_14(D);
t.c:3:10: note: children 0x35f04d0 0x35f0680
t.c:3:10: note: node (external) 0x35f0680 (max_nunits=1, refcnt=1)
t.c:3:10: note: { a_11(D), b_14(D) }
t.c:3:10: note: node (constant) 0x35f0710 (max_nunits=1, refcnt=1)
t.c:3:10: note: { 1, 1 }
so the promoted external node 0x35f05f0 should keep the load live.
vect_bb_slp_scalar_cost relies on PURE_SLP_STMT but
that's unreliable here since the per-stmt setting cannot capture the
different uses. The code shares intent (and some bugs) with
vect_bb_slp_mark_live_stmts and the problem in general is a bit
difficult given the lack of back-mapping from stmt to SLP nodes
referencing it.
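
As an illustration only, the subgraph in the dump can be approximated by hand in C with GNU vector extensions. This is a sketch, not compiler output: the names foo_scalar, foo_slp_sketch and v2l are made up for this example, and the node addresses in the comments refer to the dump above. It shows why the scalar loads of foo[0] and foo[1] survive: the multiplies of the external node stay scalar and still read them.

```c
#include <assert.h>
#include <string.h>

/* Two-lane vector of long; GNU C vector extension (GCC and Clang).  */
typedef long v2l __attribute__((vector_size(2 * sizeof(long))));

/* Scalar reference with the same arithmetic as the testcase.  */
void foo_scalar(long *restrict res, long *restrict foo, long a, long b)
{
  res[0] = ((foo[0] * a) >> 1) + foo[0];
  res[1] = ((foo[1] * b) >> 1) + foo[1];
}

/* Hand-written approximation of the SLP subgraph in the dump.  */
void foo_slp_sketch(long *restrict res, long *restrict foo, long a, long b)
{
  v2l vload;
  memcpy(&vload, foo, sizeof vload);  /* node 0x35f04d0: vector load */

  /* External node 0x35f05f0: the multiplies stay scalar, so they still
     need the scalar loads of foo[0] and foo[1].  */
  long m0 = foo[0] * a;
  long m1 = foo[1] * b;
  v2l vmul = { m0, m1 };              /* vector built from the scalars */

  v2l vone = { 1, 1 };                /* constant node { 1, 1 } */
  v2l vsum = (vmul >> vone) + vload;  /* nodes 0x35f0560 and 0x35f0440 */
  memcpy(res, &vsum, sizeof vsum);    /* node 0x35f03b0: vector store */
}
```

In this form the block performs a vector load and two scalar loads of the same data, so a scalar cost that counts the scalar loads as removed overstates the benefit of vectorizing.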
* [Bug tree-optimization/102176] BB SLP scalar costing is off with extern promoted nodes
From: rguenth at gcc dot gnu.org @ 2021-09-02 12:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102176
Richard Biener <rguenth at gcc dot gnu.org> changed:
            What|Removed                      |Added
----------------------------------------------------------------------------
        Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
  Ever confirmed|0                            |1
Last reconfirmed|                             |2021-09-02
          Status|UNCONFIRMED                  |ASSIGNED
        Keywords|                             |missed-optimization
* [Bug tree-optimization/102176] BB SLP scalar costing is off with extern promoted nodes
From: rguenth at gcc dot gnu.org @ 2021-09-02 12:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102176
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
So in this case we have _2 = _1 * a_11(D) still pure_slp even though it does
not participate in any vectorized SLP node.
Unfortunately marking of PURE_SLP_STMTs happens before analyzing operations
(the vectorizable_* functions called rely on the SLP type here for no good
reason). But that analysis can promote nodes to extern, and the SLP type is
not adjusted afterwards.
* [Bug tree-optimization/102176] BB SLP scalar costing is off with extern promoted nodes
From: rguenth at gcc dot gnu.org @ 2021-09-02 12:54 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102176
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 51404
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51404&action=edit
patch
This brute-force approach of re-computing something like PURE_SLP_STMT minus
the set of defs used in extern def SLP nodes does the trick.
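
A toy model of that re-computation, as a sketch under stated assumptions: the types and names below (toy_node, toy_ops, toy_gather, toy_count_replaced, toy_example) are hypothetical and are not GCC's data structures or the attached patch. Statement ids 0 to 9 are transcribed from the dump in the report. The model marks stmts of internal nodes as vectorized, treats stmts of external nodes as live, propagates liveness to their scalar operands, and counts only vectorized stmts that are not live. It ignores lane extraction entirely, which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and names are hypothetical, not GCC's.  */
enum { N_STMTS = 10, LANES = 2, MAX_KIDS = 2 };
enum node_kind { NODE_INTERNAL, NODE_EXTERNAL };

struct toy_node {
  enum node_kind kind;
  int stmts[LANES];                       /* scalar stmt id per lane */
  const struct toy_node *kids[MAX_KIDS];  /* NULL when unused */
};

/* Scalar operands of each stmt (ids of defining stmts, -1 if none):
   0,1 stores; 2,3 adds (_4,_8); 4,5 loads (_1,_5); 6,7 shifts (_3,_7);
   8,9 multiplies (_2,_6).  */
const int toy_ops[N_STMTS][2] = {
  {2, -1}, {3, -1}, {4, 6}, {5, 7}, {-1, -1},
  {-1, -1}, {8, -1}, {9, -1}, {4, -1}, {5, -1}
};

/* Record stmts of internal nodes as vectorized and stmts of external
   nodes as live.  Children of external nodes are not visited.  */
void toy_gather(const struct toy_node *n, bool *vectorized, bool *live)
{
  if (n == NULL)
    return;
  for (int l = 0; l < LANES; l++) {
    if (n->kind == NODE_INTERNAL)
      vectorized[n->stmts[l]] = true;
    else
      live[n->stmts[l]] = true;
  }
  if (n->kind == NODE_INTERNAL)
    for (int k = 0; k < MAX_KIDS; k++)
      toy_gather(n->kids[k], vectorized, live);
}

/* Propagate liveness to operands until nothing changes, then count the
   scalar stmts that are vectorized and not kept live.  */
int toy_count_replaced(const bool *vectorized, bool *live)
{
  for (bool changed = true; changed; ) {
    changed = false;
    for (int s = 0; s < N_STMTS; s++)
      for (int o = 0; live[s] && o < 2; o++) {
        int op = toy_ops[s][o];
        if (op >= 0 && !live[op]) {
          live[op] = true;
          changed = true;
        }
      }
  }
  int replaced = 0;
  for (int s = 0; s < N_STMTS; s++)
    replaced += vectorized[s] && !live[s];
  return replaced;
}

/* Build the subgraph from the dump and return the replaced-stmt count.
   LIVE must have room for N_STMTS flags and is filled in.  */
int toy_example(bool *live)
{
  static const struct toy_node ext   = { NODE_EXTERNAL, {8, 9}, {NULL, NULL} };
  static const struct toy_node shift = { NODE_INTERNAL, {6, 7}, {&ext, NULL} };
  static const struct toy_node load  = { NODE_INTERNAL, {4, 5}, {NULL, NULL} };
  static const struct toy_node add   = { NODE_INTERNAL, {2, 3}, {&load, &shift} };
  static const struct toy_node store = { NODE_INTERNAL, {0, 1}, {&add, NULL} };
  bool vectorized[N_STMTS] = { false };
  for (int s = 0; s < N_STMTS; s++)
    live[s] = false;
  toy_gather(&store, vectorized, live);
  return toy_count_replaced(vectorized, live);
}
```

In this model the two loads end up live because the scalar multiplies use them, so only the stores, adds and shifts count as scalar stmts replaced by vector code.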
* [Bug tree-optimization/102176] BB SLP scalar costing is off with extern promoted nodes
From: cvs-commit at gcc dot gnu.org @ 2021-09-06 6:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102176
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:a3fb781d4b341c0d50ef1b92cd3e8734e673ef18
commit r12-3362-ga3fb781d4b341c0d50ef1b92cd3e8734e673ef18
Author: Richard Biener <rguenther@suse.de>
Date: Thu Sep 2 14:48:10 2021 +0200
tree-optimization/102176 - locally compute participating SLP stmts
This performs local re-computation of participating scalar stmts
in BB vectorization subgraphs to allow precise computation of
liveness of scalar stmts after vectorization and thus precise
costing. This treats all extern defs as live but continues
to optimistically handle scalar defs that we think we can handle
by lane-extraction even though that can still fail late during
code-generation.
2021-09-02 Richard Biener <rguenther@suse.de>
PR tree-optimization/102176
* tree-vect-slp.c (vect_slp_gather_vectorized_scalar_stmts):
New function.
(vect_bb_slp_scalar_cost): Use the computed set of
vectorized scalar stmts instead of relying on the out-of-date
and not accurate PURE_SLP_STMT.
(vect_bb_vectorization_profitable_p): Compute the set
of vectorized scalar stmts.
* [Bug tree-optimization/102176] BB SLP scalar costing is off with extern promoted nodes
From: rguenth at gcc dot gnu.org @ 2021-09-06 6:56 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102176
Richard Biener <rguenth at gcc dot gnu.org> changed:
         What|Removed |Added
----------------------------------------------------------------------------
Known to work|        |12.0
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk. Since we enabled whole-function BB vectorization for GCC 11, I'm
considering backporting this.