public inbox for gcc-bugs@sourceware.org help / color / mirror / Atom feed
From: "fxue at os dot amperecomputing.com" <gcc-bugzilla@gcc.gnu.org> To: gcc-bugs@gcc.gnu.org Subject: [Bug tree-optimization/113091] Over-estimate SLP vector-to-scalar cost for non-live pattern statement Date: Thu, 21 Dec 2023 05:25:32 +0000 [thread overview] Message-ID: <bug-113091-4-2jOhYUJeqP@http.gcc.gnu.org/bugzilla/> (raw) In-Reply-To: <bug-113091-4@http.gcc.gnu.org/bugzilla/> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113091 --- Comment #2 from Feng Xue <fxue at os dot amperecomputing.com> --- (In reply to Richard Biener from comment #1) > It's the logic > > FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt_info) > { > if (svisited.contains (stmt_info)) > continue; > stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info); > if (STMT_VINFO_IN_PATTERN_P (orig_stmt_info) > && STMT_VINFO_RELATED_STMT (orig_stmt_info) != stmt_info) > /* Only the pattern root stmt computes the original scalar value. */ > continue; > bool mark_visited = true; > gimple *orig_stmt = orig_stmt_info->stmt; > ssa_op_iter op_iter; > def_operand_p def_p; > FOR_EACH_PHI_OR_STMT_DEF (def_p, orig_stmt, op_iter, SSA_OP_DEF) > { > imm_use_iterator use_iter; > gimple *use_stmt; > stmt_vec_info use_stmt_info; > FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, DEF_FROM_PTR (def_p)) > if (!is_gimple_debug (use_stmt)) > { > use_stmt_info = bb_vinfo->lookup_stmt (use_stmt); > if (!use_stmt_info > || !PURE_SLP_STMT (vect_stmt_to_vectorize > (use_stmt_info))) > { > STMT_VINFO_LIVE_P (stmt_info) = true; > > specifically the last check. That's supposed to pick up the "main" pattern > that's now covering the scalar stmt. > > But somehow the "main" pattern, > > patt_67 = .VEC_WIDEN_MINUS (_1, _3); // _5 = _2 - _4; > patt_68 = (signed short) patt_67; // _5 = _2 - _4; > patt_69 = (int) patt_68; // _5 = _2 - _4; > > doesn't get picked up here. I wonder what's the orig_stmt and the def > picked and what original scalar use we end up in where the > vect_stmt_to_vectorize isn't the "last" pattern. Maybe we really want This problem happens at slp node: note: node 0x425bc38 (max_nunits=8, refcnt=1) vector(8) char note: op template: _1 = *a_50(D); note: stmt 0 _1 = *a_50(D); note: stmt 1 _7 = MEM[(char *)a_50(D) + 1B]; note: stmt 2 _13 = MEM[(char *)a_50(D) + 2B]; note: stmt 3 _19 = MEM[(char *)a_50(D) + 3B]; note: stmt 4 _25 = MEM[(char *)a_50(D) + 4B]; note: stmt 5 _31 = MEM[(char *)a_50(D) + 5B]; note: stmt 6 _37 = MEM[(char *)a_50(D) + 6B]; note: stmt 7 _43 = MEM[(char *)a_50(D) + 7B]; The orig_stmt is "_1 = *a_50(D)" The use stmt is "_2 = (int) _1", whose pattern statement is "patt_64 = (int) patt_63", which is not referenced by any original or other pattern statements. Or in other word, the orig_stmt could be absorbed into a vector operation, without any outlier scalar use. The fore-mentioned "last check" in vect_bb_slp_mark_live_stmts would make the orig_stmt to be STMT_VINFO_LIVE_P, which actually implies it has scalar use (though it should not have), the difference is re-generating the def somewhere, rather than retaining the original scalar statement. And the following "vectorizable_live_operation" would account the new operations into vectorization cost of the SLP instance. The function vect_bb_vectorization_profitable_p resorts to a recursive way to identify scalar use, for this case, setting STMT_VINFO_LIVE_P or not would change scalar cost computation. If we can avoid such fake-liveness adjustment on the statements we are interested in, vectorization cost could beat scalar cost, and make the former succeed. Unvectorized: mov x2, x0 stp x29, x30, [sp, -48]! mov x29, sp ldrb w3, [x1] ldrb w4, [x1, 1] add x0, sp, 16 ldrb w9, [x2] ldrb w8, [x2, 1] sub w9, w9, w3 ldrb w7, [x2, 2] ldrb w3, [x1, 2] sub w8, w8, w4 ldrb w6, [x2, 3] ldrb w4, [x1, 3] sub w7, w7, w3 ldrb w10, [x1, 5] ldrb w3, [x1, 4] sub w6, w6, w4 ldrb w5, [x2, 4] ldrb w4, [x2, 5] sub w5, w5, w3 ldrb w3, [x2, 6] sub w4, w4, w10 ldrb w2, [x2, 7] ldrb w10, [x1, 6] ldrb w1, [x1, 7] sub w3, w3, w10 stp w9, w8, [sp, 16] sub w1, w2, w1 stp w7, w6, [sp, 24] stp w5, w4, [sp, 32] stp w3, w1, [sp, 40] bl test ldp x29, x30, [sp], 48 ret Vectorized: mov x2, x0 stp x29, x30, [sp, -48]! mov x29, sp ldr d1, [x1] add x0, sp, 16 ldr d0, [x2] usubl v0.8h, v0.8b, v1.8b sxtl v1.4s, v0.4h sxtl2 v0.4s, v0.8h stp q1, q0, [sp, 16] bl test ldp x29, x30, [sp], 48 ret > these "overlapping" patterns, but IMHO having "two entries" into > a chain of scalar stmts is bad and we should link up the whole matched > sequence to the final "root" instead? > > That said, the current code doesn't see that wherever we end up isn't > dead code (aka fully covered by the vectorization). > > IMO vect_stmt_to_vectorize for each of those stmts should end up at > > patt_69 = (int) patt_68;
next prev parent reply other threads:[~2023-12-21 5:25 UTC|newest] Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top 2023-12-20 9:54 [Bug tree-optimization/113091] New: " fxue at os dot amperecomputing.com 2023-12-20 13:09 ` [Bug tree-optimization/113091] " rguenth at gcc dot gnu.org 2023-12-21 5:25 ` fxue at os dot amperecomputing.com [this message] 2023-12-21 5:27 ` fxue at os dot amperecomputing.com 2023-12-21 7:31 ` rguenth at gcc dot gnu.org 2023-12-21 11:01 ` rsandifo at gcc dot gnu.org 2023-12-22 3:55 ` fxue at os dot amperecomputing.com 2023-12-26 15:16 ` fxue at os dot amperecomputing.com 2023-12-29 10:35 ` fxue at os dot amperecomputing.com 2024-01-16 3:36 ` cvs-commit at gcc dot gnu.org 2024-01-31 3:13 ` fxue at os dot amperecomputing.com
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=bug-113091-4-2jOhYUJeqP@http.gcc.gnu.org/bugzilla/ \ --to=gcc-bugzilla@gcc.gnu.org \ --cc=gcc-bugs@gcc.gnu.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).