public inbox for gcc-bugs@sourceware.org
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/109078] Missing optimization on aarch64 for types like `float32x4x2_t`
Date: Thu, 07 Dec 2023 19:41:32 +0000
Message-ID: <bug-109078-4-93p5zlBsQn@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-109078-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109078

--- Comment #2 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:

https://gcc.gnu.org/g:9f0f7d802482a8958d6cdc72f1fe0c8549db2182

commit r14-6290-g9f0f7d802482a8958d6cdc72f1fe0c8549db2182
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Thu Dec 7 19:41:19 2023 +0000

    aarch64: Add an early RA for strided registers

    This pass adds a simple register allocator for FP & SIMD registers.
    Its main purpose is to make use of SME2's strided LD1, ST1 and LUTI2/4
    instructions, which require a very specific grouping structure,
    and so would be difficult to exploit with general allocation.

    The allocator is very simple.  It gives up on anything that would
    require spilling, or that it might not handle well for other reasons.

    The allocator needs to track liveness at the level of individual FPRs.
    Doing that fixes a lot of the PRs relating to redundant moves caused by
    structure loads and stores.  That particular problem is going to be
    fixed more generally for GCC 15 by Lehua's RA patches.

    However, the early-RA pass runs before scheduling, so it has a chance
    to bag a spill-free allocation of vector code before the scheduler moves
    things around.  It could therefore still be useful for non-SME code
    (e.g. for hand-scheduled ACLE code) even after Lehua's patches are in.

    The pass is controlled by a tristate switch:

    - -mearly-ra=all: run on all functions
    - -mearly-ra=strided: run on functions that have access to strided registers
    - -mearly-ra=none: don't run on any function

    The patch makes -mearly-ra=all the default at -O2 and above for now.
    We can revisit this for GCC 15 once Lehua's patches are in;
    -mearly-ra=strided might then be more appropriate.

    As said previously, the pass is very naive.  There's much more that we
    could do, such as handling invariants better.  The main focus is on not
    committing to a bad allocation, rather than on handling as much as
    possible.

    gcc/
            PR rtl-optimization/106694
            PR rtl-optimization/109078
            PR rtl-optimization/109391
            * config.gcc: Add aarch64-early-ra.o for AArch64 targets.
            * config/aarch64/t-aarch64 (aarch64-early-ra.o): New rule.
            * config/aarch64/aarch64-opts.h (aarch64_early_ra_scope): New enum.
            * config/aarch64/aarch64.opt (mearly_ra): New option.
            * doc/invoke.texi: Document it.
            * common/config/aarch64/aarch64-common.cc
            (aarch_option_optimization_table): Use -mearly-ra=strided by
            default for -O2 and above.
            * config/aarch64/aarch64-passes.def (pass_aarch64_early_ra): New pass.
            * config/aarch64/aarch64-protos.h (aarch64_strided_registers_p)
            (make_pass_aarch64_early_ra): Declare.
            * config/aarch64/aarch64-sme.md (@aarch64_sme_lut<LUTI_BITS><mode>):
            Add a stride_type attribute.
            (@aarch64_sme_lut<LUTI_BITS><mode>_strided2): New pattern.
            (@aarch64_sme_lut<LUTI_BITS><mode>_strided4): Likewise.
            * config/aarch64/aarch64-sve-builtins-base.cc (svld1_impl::expand)
            (svldnt1_impl::expand, svst1_impl::expand, svstn1_impl::expand): Handle
            new way of defining multi-register loads and stores.
            * config/aarch64/aarch64-sve.md (@aarch64_ld1<SVE_FULLx24:mode>)
            (@aarch64_ldnt1<SVE_FULLx24:mode>, @aarch64_st1<SVE_FULLx24:mode>)
            (@aarch64_stnt1<SVE_FULLx24:mode>): Delete.
            * config/aarch64/aarch64-sve2.md (@aarch64_<LD1_COUNT:optab><mode>)
            (@aarch64_<LD1_COUNT:optab><mode>_strided2): New patterns.
            (@aarch64_<LD1_COUNT:optab><mode>_strided4): Likewise.
            (@aarch64_<ST1_COUNT:optab><mode>): Likewise.
            (@aarch64_<ST1_COUNT:optab><mode>_strided2): Likewise.
            (@aarch64_<ST1_COUNT:optab><mode>_strided4): Likewise.
            * config/aarch64/aarch64.cc (aarch64_strided_registers_p): New
            function.
            * config/aarch64/aarch64.md (UNSPEC_LD1_SVE_COUNT): Delete.
            (UNSPEC_ST1_SVE_COUNT, UNSPEC_LDNT1_SVE_COUNT): Likewise.
            (UNSPEC_STNT1_SVE_COUNT): Likewise.
            (stride_type): New attribute.
            * config/aarch64/constraints.md (Uwd, Uwt): New constraints.
            * config/aarch64/iterators.md (UNSPEC_LD1_COUNT, UNSPEC_LDNT1_COUNT)
            (UNSPEC_ST1_COUNT, UNSPEC_STNT1_COUNT): New unspecs.
            (optab): Handle them.
            (LD1_COUNT, ST1_COUNT): New iterators.
            * config/aarch64/aarch64-early-ra.cc: New file.

    gcc/testsuite/
            PR rtl-optimization/106694
            PR rtl-optimization/109078
            PR rtl-optimization/109391
            * gcc.target/aarch64/ldp_stp_16.c (cons4_4_float): Tighten expected
            output test.
            * gcc.target/aarch64/sve/shift_1.c: Allow reversed shifts for .s
            as well as .d.
            * gcc.target/aarch64/sme/strided_1.c: New test.
            * gcc.target/aarch64/pr109078.c: Likewise.
            * gcc.target/aarch64/pr109391.c: Likewise.
            * gcc.target/aarch64/sve/pr106694.c: Likewise.
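
The commit message above describes the -mearly-ra switch and the redundant moves around structure loads and stores, but shows no source code. The following is a minimal sketch of the kind of function involved, a float32x4x2_t load/store pair written with ACLE intrinsics. The function name scale_pairs is hypothetical, and this is not the testcase added as gcc.target/aarch64/pr109078.c. The scalar fallback is an assumption added only so the file also builds on non-AArch64 hosts.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_A64_NEON 1
#else
#define HAVE_A64_NEON 0
#endif

/* Hypothetical kernel, not taken from the PR: multiply eight floats by k.
   On AArch64 the float32x4x2_t value is a two-register tuple, which is
   the kind of multi-register value whose per-FPR liveness the commit
   message says the early-RA pass tracks.  */
void scale_pairs(float *dst, const float *src, float k)
{
    assert(dst != NULL && src != NULL);
#if HAVE_A64_NEON
    /* LD2: load eight floats and de-interleave them into two vectors.  */
    float32x4x2_t v = vld2q_f32(src);
    v.val[0] = vmulq_n_f32(v.val[0], k);
    v.val[1] = vmulq_n_f32(v.val[1], k);
    /* ST2: store the two vectors back in interleaved order.  */
    vst2q_f32(dst, v);
#else
    /* Scalar fallback (assumption) so the sketch builds on other targets.  */
    for (size_t i = 0; i < 8; ++i)
        dst[i] = src[i] * k;
#endif
}
```

With an AArch64 GCC that contains this commit, one way to look at the effect of the pass is to compile such a file twice at -O2 -S, once with -mearly-ra=none and once with -mearly-ra=all, and compare the register moves around the LD2/ST2 pair. Whether the output differs for this particular function is not something the commit message states.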
Thread overview: 5+ messages
2023-03-09 10:21 [Bug target/109078] New: " dorazzsoft at gmail dot com
2023-04-02 17:04 ` [Bug target/109078] " pinskia at gcc dot gnu.org
2023-11-07 21:49 ` rsandifo at gcc dot gnu.org
2023-12-07 19:41 ` cvs-commit at gcc dot gnu.org [this message]
2023-12-07 19:52 ` rsandifo at gcc dot gnu.org