public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/102176] New: BB SLP scalar costing is off with extern promoted nodes
From: rguenth at gcc dot gnu.org @ 2021-09-02 12:18 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102176

            Bug ID: 102176
           Summary: BB SLP scalar costing is off with extern promoted nodes
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

On aarch64 we can see

int foo(long *restrict res, long *restrict foo, long a, long b)
{
  res[0] = ((foo[0] * a) >> 1) + foo[0];
  res[1] = ((foo[1] * b) >> 1) + foo[1];
}

being vectorized as

t.c:3:10: note: Costing subgraph:
t.c:3:10: note: node 0x35f03b0 (max_nunits=2, refcnt=1)
t.c:3:10: note: op template: *res_12(D) = _4;
t.c:3:10: note: stmt 0 *res_12(D) = _4;
t.c:3:10: note: stmt 1 MEM[(long int *)res_12(D) + 8B] = _8;
t.c:3:10: note: children 0x35f0440
t.c:3:10: note: node 0x35f0440 (max_nunits=2, refcnt=1)
t.c:3:10: note: op template: _4 = _1 + _3;
t.c:3:10: note: stmt 0 _4 = _1 + _3;
t.c:3:10: note: stmt 1 _8 = _5 + _7;
t.c:3:10: note: children 0x35f04d0 0x35f0560
t.c:3:10: note: node 0x35f04d0 (max_nunits=2, refcnt=2)
t.c:3:10: note: op template: _1 = *foo_10(D);
t.c:3:10: note: stmt 0 _1 = *foo_10(D);
t.c:3:10: note: stmt 1 _5 = MEM[(long int *)foo_10(D) + 8B];
t.c:3:10: note: node 0x35f0560 (max_nunits=2, refcnt=1)
t.c:3:10: note: op template: _3 = _2 >> 1;
t.c:3:10: note: stmt 0 _3 = _2 >> 1;
t.c:3:10: note: stmt 1 _7 = _6 >> 1;
t.c:3:10: note: children 0x35f05f0 0x35f0710
t.c:3:10: note: node (external) 0x35f05f0 (max_nunits=2, refcnt=1)
t.c:3:10: note: stmt 0 _2 = _1 * a_11(D);
t.c:3:10: note: stmt 1 _6 = _5 * b_14(D);
t.c:3:10: note: children 0x35f04d0 0x35f0680
t.c:3:10: note: node (external) 0x35f0680 (max_nunits=1, refcnt=1)
t.c:3:10: note: { a_11(D), b_14(D) }
t.c:3:10: note: node (constant) 0x35f0710 (max_nunits=1, refcnt=1)
t.c:3:10: note: { 1, 1 }

so the promoted external node 0x35f05f0 should keep the load live.
vect_bb_slp_scalar_cost relies on PURE_SLP_STMT, but that is unreliable here
since the per-stmt setting cannot capture the different uses. The code shares
intent (and some bugs) with vect_bb_slp_mark_live_stmts, and the problem in
general is a bit difficult given the lack of a back-mapping from a stmt to the
SLP nodes referencing it.
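
A self-contained variant of the reproducer may help when checking a given compiler. This is only a sketch, not the testcase from the report: the name `slp_repro`, the `void` return type, and the renamed `src` parameter are assumptions made here so that the function compiles cleanly and can be called. Building it for aarch64 with vectorization enabled (for example `-O3`) and `-fdump-tree-slp-details` is expected to give a costing dump of the same shape as the one above, though node addresses and SSA names will differ.

```c
/* Hypothetical callable variant of the reproducer.  The arithmetic is the
   same as in the report; the function name, return type and parameter
   name are illustrative and not taken from the bug.  */
void slp_repro(long *restrict res, long *restrict src, long a, long b)
{
  /* Each loaded value feeds both the multiply and the final add.  */
  res[0] = ((src[0] * a) >> 1) + src[0];
  res[1] = ((src[1] * b) >> 1) + src[1];
}
```

The two multiplies use different scalar operands, which matches the pair that ends up in the external node in the dump.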

* [Bug tree-optimization/102176] BB SLP scalar costing is off with extern promoted nodes
From: rguenth at gcc dot gnu.org @ 2021-09-02 12:18 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102176

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
     Ever confirmed|0                             |1
   Last reconfirmed|                              |2021-09-02
             Status|UNCONFIRMED                   |ASSIGNED
           Keywords|                              |missed-optimization

* [Bug tree-optimization/102176] BB SLP scalar costing is off with extern promoted nodes
From: rguenth at gcc dot gnu.org @ 2021-09-02 12:26 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102176

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
So in this case we have

  _2 = _1 * a_11(D)

still marked pure_slp even though it does not participate in any vectorized
SLP node. Unfortunately, marking of PURE_SLP_STMTs happens before analyzing
operations (the vectorizable_* functions that get called rely on the SLP type
here for no good reason). But that analysis can promote nodes to extern, and
the SLP type is not adjusted afterwards.

* [Bug tree-optimization/102176] BB SLP scalar costing is off with extern promoted nodes
From: rguenth at gcc dot gnu.org @ 2021-09-02 12:54 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102176

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 51404
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51404&action=edit
patch

This brute-force approach of re-computing something like PURE_SLP_STMT minus
the set of defs used in extern def SLP nodes does the trick.

* [Bug tree-optimization/102176] BB SLP scalar costing is off with extern promoted nodes
From: cvs-commit at gcc dot gnu.org @ 2021-09-06 6:55 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102176

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:a3fb781d4b341c0d50ef1b92cd3e8734e673ef18

commit r12-3362-ga3fb781d4b341c0d50ef1b92cd3e8734e673ef18
Author: Richard Biener <rguenther@suse.de>
Date:   Thu Sep 2 14:48:10 2021 +0200

    tree-optimization/102176 - locally compute participating SLP stmts

    This performs local re-computation of participating scalar stmts
    in BB vectorization subgraphs to allow precise computation of
    liveness of scalar stmts after vectorization and thus precise
    costing. This treats all extern defs as live but continues
    to optimistically handle scalar defs that we think we can handle
    by lane-extraction even though that can still fail late during
    code-generation.

    2021-09-02  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/102176
            * tree-vect-slp.c (vect_slp_gather_vectorized_scalar_stmts): New
            function.
            (vect_bb_slp_scalar_cost): Use the computed set of
            vectorized scalar stmts instead of relying on the out-of-date
            and not accurate PURE_SLP_STMT.
            (vect_bb_vectorization_profitable_p): Compute the set
            of vectorized scalar stmts.
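
The idea in the commit message can be illustrated with a small toy model. This is a sketch under stated assumptions, not the code from the commit: `toy_node`, `gather_vectorized`, `scalar_cost` and the two `*_estimate` helpers are hypothetical names, statements are numbered 0..9 after the dump in the original report, the `{ a, b }` and `{ 1, 1 }` leaf nodes are omitted because they carry no statements, and every scalar statement is assumed to cost 1.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types are hypothetical and are not GCC's slp_tree
   or stmt_vec_info.  */
enum { NSTMTS = 10, LANES = 2, MAX_CHILDREN = 2, MAX_OPS = 2 };

enum node_kind { NODE_INTERNAL, NODE_EXTERNAL };

struct toy_node {
  enum node_kind kind;
  int stmts[LANES];
  const struct toy_node *children[MAX_CHILDREN];
  size_t nchildren;
};

/* operands[s] = statements whose results statement s reads; -1 = unused slot.
   0,1: stores   2,3: adds   4,5: loads   6,7: shifts   8,9: multiplies.  */
static const int operands[NSTMTS][MAX_OPS] = {
  {2, -1}, {3, -1}, {4, 6}, {5, 7}, {-1, -1},
  {-1, -1}, {8, -1}, {9, -1}, {4, -1}, {5, -1},
};

static const struct toy_node mults  = { NODE_EXTERNAL, {8, 9}, {NULL, NULL}, 0 };
static const struct toy_node loads  = { NODE_INTERNAL, {4, 5}, {NULL, NULL}, 0 };
static const struct toy_node shifts = { NODE_INTERNAL, {6, 7}, {&mults, NULL}, 1 };
static const struct toy_node adds   = { NODE_INTERNAL, {2, 3}, {&loads, &shifts}, 2 };
static const struct toy_node stores = { NODE_INTERNAL, {0, 1}, {&adds, NULL}, 1 };

/* Mark the scalar statements covered by internal nodes reachable from NODE.
   External nodes are skipped: their statements stay scalar.  Marking is
   idempotent, so this tiny example needs no visited set.  */
static void gather_vectorized(const struct toy_node *node, bool vec[NSTMTS])
{
  if (node == NULL || node->kind != NODE_INTERNAL)
    return;
  for (size_t i = 0; i < LANES; i++)
    vec[node->stmts[i]] = true;
  for (size_t i = 0; i < node->nchildren; i++)
    gather_vectorized(node->children[i], vec);
}

/* True if every statement reading S's result is in REPLACED, i.e. S would
   be dead after vectorization.  */
static bool all_users_replaced(int s, const bool replaced[NSTMTS])
{
  for (int u = 0; u < NSTMTS; u++)
    for (int k = 0; k < MAX_OPS; k++)
      if (operands[u][k] == s && !replaced[u])
        return false;
  return true;
}

/* Count covered statements that die, at an assumed unit cost of 1 each.  */
static int scalar_cost(const bool covered[NSTMTS], const bool replaced[NSTMTS])
{
  int cost = 0;
  for (int s = 0; s < NSTMTS; s++)
    if (covered[s] && all_users_replaced(s, replaced))
      cost++;
  return cost;
}

/* Estimate using a stale per-statement flag that still includes 8 and 9.  */
int stale_estimate(void)
{
  bool covered[NSTMTS] = { false };
  bool stale[NSTMTS];
  gather_vectorized(&stores, covered);
  for (int s = 0; s < NSTMTS; s++)
    stale[s] = true;
  return scalar_cost(covered, stale);
}

/* Estimate using the set recomputed from the final graph.  */
int recomputed_estimate(void)
{
  bool vec[NSTMTS] = { false };
  gather_vectorized(&stores, vec);
  return scalar_cost(vec, vec);
}
```

In this model `stale_estimate()` credits 8 scalar statements as removed, while `recomputed_estimate()` credits 6: the two loads still have a scalar user in the external multiply node, so they stay live and their cost is not counted as saved.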

* [Bug tree-optimization/102176] BB SLP scalar costing is off with extern promoted nodes
From: rguenth at gcc dot gnu.org @ 2021-09-06 6:56 UTC
To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102176

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |12.0

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk. Since we enabled whole-function BB vectorization for GCC 11,
I'm considering backporting this.