From: "fxue at os dot amperecomputing.com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/113091] New: Over-estimate SLP vector-to-scalar cost for non-live pattern statement
Date: Wed, 20 Dec 2023 09:54:01 +0000
Message-ID: <bug-113091-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113091

            Bug ID: 113091
           Summary: Over-estimate SLP vector-to-scalar cost for non-live
                    pattern statement
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fxue at os dot amperecomputing.com
  Target Milestone: ---

GCC fails to vectorize the testcase below on aarch64.

  int test(unsigned array[8]);

  int foo(char *a, char *b)
  {
    unsigned array[8];

    array[0] = (a[0] - b[0]);
    array[1] = (a[1] - b[1]);
    array[2] = (a[2] - b[2]);
    array[3] = (a[3] - b[3]);
    array[4] = (a[4] - b[4]);
    array[5] = (a[5] - b[5]);
    array[6] = (a[6] - b[6]);
    array[7] = (a[7] - b[7]);

    return test(array);
  }

The dump shows that the loads of a[i] and b[i] are considered to be live as
scalar references, which results in an over-estimated vector-to-scalar cost.

  *a_50(D) 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)a_50(D) + 1B] 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)a_50(D) + 2B] 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)a_50(D) + 3B] 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)a_50(D) + 4B] 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)a_50(D) + 5B] 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)a_50(D) + 6B] 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)a_50(D) + 7B] 1 times vec_to_scalar costs 2 in epilogue
  *b_51(D) 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)b_51(D) + 1B] 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)b_51(D) + 2B] 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)b_51(D) + 3B] 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)b_51(D) + 4B] 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)b_51(D) + 5B] 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)b_51(D) + 6B] 1 times vec_to_scalar costs 2 in epilogue
  MEM[(char *)b_51(D) + 7B] 1 times vec_to_scalar costs 2 in epilogue

Subtraction on char type is recognized as widen-sub, and involves two kinds
of pattern replacement.

* Original

  _1 = *a_50(D);
  _2 = (int) _1;
  _3 = *b_51(D);
  _4 = (int) _3;
  _5 = _2 - _4;

* After pattern replacement

  patt_63 = (unsigned short) _1;        // _2 = (int) _1;
  patt_64 = (int) patt_63;              // _2 = (int) _1;

  patt_65 = (unsigned short) _3;        // _4 = (int) _3;
  patt_66 = (int) patt_65;              // _4 = (int) _3;

  patt_67 = .VEC_WIDEN_MINUS (_1, _3);  // _5 = _2 - _4;
  patt_68 = (signed short) patt_67;     // _5 = _2 - _4;
  patt_69 = (int) patt_68;              // _5 = _2 - _4;

For the statement "_2 = (int) _1", its vectorization representative
"patt_64 = (int) patt_63" is not marked as PURE_SLP, so it is conservatively
considered to have a scalar use and to be live outside of the SLP bb (in the
function vect_bb_slp_mark_live_stmts). However, the pattern definition is
actually dead and should not contribute to the vector-to-scalar cost.
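As a sanity check on the widen-sub pattern sequence above, here is a minimal
scalar sketch in C. It assumes unsigned 8-bit operands (char is unsigned on
aarch64, which matches the "(unsigned short)" conversions in the pattern
statements) and models .VEC_WIDEN_MINUS as a plain 16-bit subtraction on one
lane. The function names are made up for illustration and are not GCC
internals.

```c
#include <assert.h>
#include <stdint.h>

/* What the source expression computes: both operands are converted to
   int and then subtracted (_2, _4, _5 in the original statements).  */
static int sub_direct(uint8_t a, uint8_t b)
{
    return (int)a - (int)b;
}

/* Scalar model of the pattern sequence for a single lane.  The
   uint16_t -> int16_t conversion relies on modulo-2^16 wrapping, which
   GCC implements (it is implementation-defined before C23).  */
static int sub_via_widen(uint8_t a, uint8_t b)
{
    uint16_t wide = (uint16_t)((uint16_t)a - (uint16_t)b); /* like patt_67 */
    int16_t  sgn  = (int16_t)wide;                         /* like patt_68 */
    return (int)sgn;                                       /* like patt_69 */
}

/* Exhaustively check that both forms agree on every operand pair.  */
static void check_all_pairs(void)
{
    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            assert(sub_direct((uint8_t)a, (uint8_t)b)
                   == sub_via_widen((uint8_t)a, (uint8_t)b));
}
```

The two forms agree because the true difference always lies in [-255, 255],
which fits in a signed 16-bit value, so the intermediate int conversions of
the operands (patt_64 and patt_66) are not needed to produce the result.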
Those defs from pattern statements are not part of the function body, so we
cannot track their def/use chains as we do for ordinary SSA names. As a
quick fix for one situation: if the original SSA name "_2" has a single use,
its existence should be covered only by the vectorized operation, regardless
of what it would look like without pattern replacement.
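To make the effect of the suggested quick fix concrete, here is a toy model
in C of the cost accounting for the 16 loads in the testcase. It is only a
sketch: the struct, the function names and the two liveness rules are
hypothetical simplifications and do not correspond to GCC data structures or
to vect_bb_slp_mark_live_stmts itself. In particular, it assumes that a
single use of the converted value is the vectorized operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one scalar load feeding a conversion such as
   "_2 = (int) _1".  None of these names come from GCC.  */
struct toy_load {
    bool     user_rep_pure_slp; /* user's representative marked PURE_SLP? */
    unsigned user_result_uses;  /* uses of the user's original result (_2) */
};

enum { TOY_VEC_TO_SCALAR_COST = 2 }; /* per-extract cost seen in the dump */

/* Conservative rule: a user whose representative is not PURE_SLP is
   treated as a scalar use, so the load is considered live.  */
static bool toy_live_conservative(const struct toy_load *l)
{
    return !l->user_rep_pure_slp;
}

/* Quick-fix rule as modeled here: if the user's original result has a
   single use, assume that use is the vectorized operation, so the load
   is not live.  Otherwise fall back to the conservative rule.  */
static bool toy_live_single_use(const struct toy_load *l)
{
    if (l->user_result_uses == 1)
        return false;
    return toy_live_conservative(l);
}

/* Sum the vec_to_scalar epilogue cost over all loads deemed live.  */
static unsigned toy_epilogue_cost(const struct toy_load *loads, size_t n,
                                  bool (*is_live)(const struct toy_load *))
{
    unsigned cost = 0;
    assert(loads != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (is_live(&loads[i]))
            cost += TOY_VEC_TO_SCALAR_COST;
    return cost;
}

/* Describe the testcase: a[0..7] and b[0..7], each load feeding one
   conversion whose representative is not PURE_SLP and whose result is
   used once (by the subtraction).  */
static void toy_fill_testcase(struct toy_load loads[16])
{
    for (size_t i = 0; i < 16; i++) {
        loads[i].user_rep_pure_slp = false;
        loads[i].user_result_uses = 1;
    }
}
```

Under the conservative rule the model charges 16 * 2 = 32 for the testcase,
matching the sixteen "costs 2" entries in the dump. Under the single-use
rule it charges 0.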