public inbox for gcc-bugs@sourceware.org help / color / mirror / Atom feed
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> To: gcc-bugs@gcc.gnu.org Subject: [Bug tree-optimization/113104] Suboptimal loop-based slp node splicing across iterations Date: Fri, 05 Jan 2024 16:25:35 +0000 [thread overview] Message-ID: <bug-113104-4-Zhol5nElfu@http.gcc.gnu.org/bugzilla/> (raw) In-Reply-To: <bug-113104-4@http.gcc.gnu.org/bugzilla/> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113104 --- Comment #5 from GCC Commits <cvs-commit at gcc dot gnu.org> --- The trunk branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>: https://gcc.gnu.org/g:7328faf89e9b4953baaff10e18262c70fbd3e578 commit r14-6961-g7328faf89e9b4953baaff10e18262c70fbd3e578 Author: Richard Sandiford <richard.sandiford@arm.com> Date: Fri Jan 5 16:25:16 2024 +0000 aarch64: Extend VECT_COMPARE_COSTS to !SVE [PR113104] When SVE is enabled, we try vectorising with multiple different SVE and Advanced SIMD approaches and use the cost model to pick the best one. Until now, we've not done that for Advanced SIMD, since "the first mode that works should always be the best". The testcase is a counterexample. Each iteration of the scalar loop vectorises naturally with 64-bit input vectors and 128-bit output vectors. We do try that for SVE, and choose it as the best approach. But the first approach we try is instead to use: - a vectorisation factor of 2 - 1 128-bit vector for the inputs - 2 128-bit vectors for the outputs But since the stride is variable, the cost of marshalling the input vector from two iterations outweighs the benefit of doing two iterations at once. This patch therefore generalises aarch64-sve-compare-costs to aarch64-vect-compare-costs and applies it to non-SVE compilations. gcc/ PR target/113104 * doc/invoke.texi (aarch64-sve-compare-costs): Replace with... (aarch64-vect-compare-costs): ...this. * config/aarch64/aarch64.opt (-param=aarch64-sve-compare-costs=): Replace with... (-param=aarch64-vect-compare-costs=): ...this new param. * config/aarch64/aarch64.cc (aarch64_override_options_internal): Don't disable it when vectorizing for Advanced SIMD only. (aarch64_autovectorize_vector_modes): Apply VECT_COMPARE_COSTS whenever aarch64_vect_compare_costs is true. gcc/testsuite/ PR target/113104 * gcc.target/aarch64/pr113104.c: New test. * gcc.target/aarch64/sve/cond_arith_1.c: Update for new parameter names. * gcc.target/aarch64/sve/cond_arith_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_arith_3.c: Likewise. * gcc.target/aarch64/sve/cond_arith_3_run.c: Likewise. * gcc.target/aarch64/sve/gather_load_6.c: Likewise. * gcc.target/aarch64/sve/gather_load_7.c: Likewise. * gcc.target/aarch64/sve/load_const_offset_2.c: Likewise. * gcc.target/aarch64/sve/load_const_offset_3.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_6.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_7.c: Likewise. * gcc.target/aarch64/sve/mask_load_slp_1.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_1.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_1_run.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_2_run.c: Likewise. * gcc.target/aarch64/sve/pack_1.c: Likewise. * gcc.target/aarch64/sve/reduc_4.c: Likewise. * gcc.target/aarch64/sve/scatter_store_6.c: Likewise. * gcc.target/aarch64/sve/scatter_store_7.c: Likewise. * gcc.target/aarch64/sve/strided_load_3.c: Likewise. * gcc.target/aarch64/sve/strided_store_3.c: Likewise. * gcc.target/aarch64/sve/unpack_fcvt_signed_1.c: Likewise. * gcc.target/aarch64/sve/unpack_fcvt_unsigned_1.c: Likewise. * gcc.target/aarch64/sve/unpack_signed_1.c: Likewise. * gcc.target/aarch64/sve/unpack_unsigned_1.c: Likewise. * gcc.target/aarch64/sve/unpack_unsigned_1_run.c: Likewise. * gcc.target/aarch64/sve/vcond_11.c: Likewise. * gcc.target/aarch64/sve/vcond_11_run.c: Likewise.
next prev parent reply other threads:[~2024-01-05 16:25 UTC|newest] Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top 2023-12-21 8:22 [Bug tree-optimization/113104] New: " fxue at os dot amperecomputing.com 2023-12-21 9:07 ` [Bug tree-optimization/113104] " rguenth at gcc dot gnu.org 2023-12-21 9:33 ` fxue at os dot amperecomputing.com 2023-12-21 9:41 ` rguenther at suse dot de 2023-12-30 12:35 ` rsandifo at gcc dot gnu.org 2024-01-05 16:25 ` cvs-commit at gcc dot gnu.org [this message] 2024-01-05 16:32 ` rsandifo at gcc dot gnu.org 2024-01-10 5:01 ` fxue at os dot amperecomputing.com
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=bug-113104-4-Zhol5nElfu@http.gcc.gnu.org/bugzilla/ \ --to=gcc-bugzilla@gcc.gnu.org \ --cc=gcc-bugs@gcc.gnu.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).