From: Craig Blackmore
To: Jim Wilson
Cc: GCC Patches, Ofer Shinaar, Nidal Faour, Kito Cheng, Jeff Law
Subject: Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass
Date: Mon, 27 Apr 2020 18:08:52 +0100

On 08/04/2020 17:04, Jim Wilson wrote:
> On Wed, Feb 19, 2020 at 3:40 AM Craig Blackmore wrote:
>> On 10/12/2019 18:28, Craig Blackmore wrote:
>> Thank you for your review. I have posted an updated patch below which I think
>> addresses your comments.
>>
>> Ping
>>
>> https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00712.html
> This looks OK. There are some minor issues.
>
> (riscv_new_address_profitable_p): New function.
> (TARGET_NEW_ADDRESS_PROFITABLE_P): Define.
>
> These are actually in part2, and part2 already has changelog entries
> for them, so these can just be dropped.
>
> +  /* When optimizing for size, make uncompressible 32-bit addresses more
> +   * expensive so that compressible 32-bit addresses are preferred. */
> +  if (!speed && riscv_mshorten_memrefs && mode == SImode
> +      && !riscv_compressed_lw_address_p (addr))
> +    return riscv_address_insns (addr, mode, false) + 1;
>
> I think that there should be a TARGET_RVC check here, just like in the gate
> function for the new pass. But I also suspect that this probably
> doesn't matter much.

Hi Jim,

Thanks for the review. I have updated the following patch with those changes.

Craig

---

gcc/ChangeLog:

	* config.gcc: Add riscv-shorten-memrefs.o to extra_objs for riscv.
	* config/riscv/riscv-passes.def: New file.
	* config/riscv/riscv-protos.h (make_pass_shorten_memrefs): Declare.
	* config/riscv/riscv-shorten-memrefs.c: New file.
	* config/riscv/riscv.c (tree-pass.h): New include.
	(riscv_compressed_reg_p): New function.
	(riscv_compressed_lw_offset_p): Likewise.
	(riscv_compressed_lw_address_p): Likewise.
	(riscv_shorten_lw_offset): Likewise.
	(riscv_legitimize_address): Attempt to convert base + large_offset
	to compressible new_base + small_offset.
	(riscv_address_cost): Make anticipated compressed load/stores
	cheaper for code size than uncompressed load/stores.
	(riscv_register_priority): Move compressed register check to
	riscv_compressed_reg_p.
	* config/riscv/riscv.h (C_S_BITS): Define.
	(CSW_MAX_OFFSET): Define.
	* config/riscv/riscv.opt (mshorten-memrefs): New option.
	* config/riscv/t-riscv (riscv-shorten-memrefs.o): New rule.
	(PASSES_EXTRA): Add riscv-passes.def.
	* doc/invoke.texi: Document -mshorten-memrefs.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/shorten-memrefs-1.c: New test.
	* gcc.target/riscv/shorten-memrefs-2.c: New test.
	* gcc.target/riscv/shorten-memrefs-3.c: New test.
	* gcc.target/riscv/shorten-memrefs-4.c: New test.
	* gcc.target/riscv/shorten-memrefs-5.c: New test.
	* gcc.target/riscv/shorten-memrefs-6.c: New test.
	* gcc.target/riscv/shorten-memrefs-7.c: New test.
---
 gcc/config.gcc                                |   2 +-
 gcc/config/riscv/riscv-passes.def             |  20 ++
 gcc/config/riscv/riscv-protos.h               |   2 +
 gcc/config/riscv/riscv-shorten-memrefs.c      | 200 ++++++++++++++++++
 gcc/config/riscv/riscv.c                      |  88 +++++++-
 gcc/config/riscv/riscv.h                      |   5 +
 gcc/config/riscv/riscv.opt                    |   6 +
 gcc/config/riscv/t-riscv                      |   5 +
 gcc/doc/invoke.texi                           |  10 +
 .../gcc.target/riscv/shorten-memrefs-1.c      |  26 +++
 .../gcc.target/riscv/shorten-memrefs-2.c      |  51 +++++
 .../gcc.target/riscv/shorten-memrefs-3.c      |  39 ++++
 .../gcc.target/riscv/shorten-memrefs-4.c      |  26 +++
 .../gcc.target/riscv/shorten-memrefs-5.c      |  53 +++++
 .../gcc.target/riscv/shorten-memrefs-6.c      |  39 ++++
 .../gcc.target/riscv/shorten-memrefs-7.c      |  46 ++++
 16 files changed, 612 insertions(+), 6 deletions(-)
 create mode 100644 gcc/config/riscv/riscv-passes.def
 create mode 100644 gcc/config/riscv/riscv-shorten-memrefs.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/shorten-memrefs-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/shorten-memrefs-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/shorten-memrefs-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/shorten-memrefs-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/shorten-memrefs-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/shorten-memrefs-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/shorten-memrefs-7.c

diff --git a/gcc/config.gcc b/gcc/config.gcc
index cf1a87e2efd..3c2a0389b98 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -525,7 +525,7 @@ pru-*-*)
 	;;
 riscv*)
 	cpu_type=riscv
-	extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o"
+	extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o"
 	d_target_objs="riscv-d.o"
 	;;
 rs6000*-*-*)
diff --git a/gcc/config/riscv/riscv-passes.def b/gcc/config/riscv/riscv-passes.def
new file mode 100644
index 00000000000..8a4ea0918db
--- /dev/null
+++ b/gcc/config/riscv/riscv-passes.def
@@ -0,0 +1,20 @@
+/* Declaration of target-specific passes for RISC-V.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3. If not see
+   . */
+
+INSERT_PASS_AFTER (pass_rtl_store_motion, 1, pass_shorten_memrefs);
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 8cf9137b5e7..72280ec1c76 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -91,4 +91,6 @@ extern std::string riscv_arch_str ();
 
 extern bool riscv_hard_regno_rename_ok (unsigned, unsigned);
 
+rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt);
+
 #endif /* ! GCC_RISCV_PROTOS_H */
diff --git a/gcc/config/riscv/riscv-shorten-memrefs.c b/gcc/config/riscv/riscv-shorten-memrefs.c
new file mode 100644
index 00000000000..3686005fe2e
--- /dev/null
+++ b/gcc/config/riscv/riscv-shorten-memrefs.c
@@ -0,0 +1,200 @@
+/* Shorten memrefs pass for RISC-V.
+   Copyright (C) 2018-2019 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+. */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "rtl.h"
+#include "backend.h"
+#include "regs.h"
+#include "target.h"
+#include "memmodel.h"
+#include "emit-rtl.h"
+#include "df.h"
+#include "predict.h"
+#include "tree-pass.h"
+
+/* Try to make more use of compressed load and store instructions by replacing
+   a load/store at address BASE + LARGE_OFFSET with a new load/store at address
+   NEW BASE + SMALL OFFSET. If NEW BASE is stored in a compressed register, the
+   load/store can be compressed. Since creating NEW BASE incurs an overhead,
+   the change is only attempted when BASE is referenced by at least four
+   load/stores in the same basic block. */
+
+namespace {
+
+const pass_data pass_data_shorten_memrefs =
+{
+  RTL_PASS, /* type */
+  "shorten_memrefs", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  0, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_shorten_memrefs : public rtl_opt_pass
+{
+public:
+  pass_shorten_memrefs (gcc::context *ctxt)
+    : rtl_opt_pass (pass_data_shorten_memrefs, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *)
+    {
+      return TARGET_RVC && riscv_mshorten_memrefs && optimize > 0;
+    }
+  virtual unsigned int execute (function *);
+
+private:
+  typedef int_hash regno_hash;
+  typedef hash_map regno_map;
+
+  regno_map * analyze (basic_block bb);
+  void transform (regno_map *m, basic_block bb);
+  bool get_si_mem_base_reg (rtx mem, rtx *addr);
+}; // class pass_shorten_memrefs
+
+bool
+pass_shorten_memrefs::get_si_mem_base_reg (rtx mem, rtx *addr)
+{
+  if (!MEM_P (mem) || GET_MODE (mem) != SImode)
+    return false;
+  *addr = XEXP (mem, 0);
+  return GET_CODE (*addr) == PLUS && REG_P (XEXP (*addr, 0));
+}
+
+/* Count how many times each regno is referenced as base address for a memory
+   access. */
+
+pass_shorten_memrefs::regno_map *
+pass_shorten_memrefs::analyze (basic_block bb)
+{
+  regno_map *m = hash_map::create_ggc (10);
+  rtx_insn *insn;
+
+  regstat_init_n_sets_and_refs ();
+
+  FOR_BB_INSNS (bb, insn)
+    {
+      if (!NONJUMP_INSN_P (insn))
+        continue;
+      rtx pat = PATTERN (insn);
+      if (GET_CODE (pat) != SET)
+        continue;
+      /* Analyze stores first then loads. */
+      for (int i = 0; i < 2; i++)
+        {
+          rtx mem = XEXP (pat, i);
+          rtx addr;
+          if (get_si_mem_base_reg (mem, &addr))
+            {
+              HOST_WIDE_INT regno = REGNO (XEXP (addr, 0));
+              /* Do not count store zero as these cannot be compressed. */
+              if (i == 0)
+                {
+                  if (XEXP (pat, 1) == CONST0_RTX (GET_MODE (XEXP (pat, 1))))
+                    continue;
+                }
+              if (REG_N_REFS (regno) < 4)
+                continue;
+              m->get_or_insert (regno)++;
+            }
+        }
+    }
+  regstat_free_n_sets_and_refs ();
+
+  return m;
+}
+
+/* Convert BASE + LARGE_OFFSET to NEW_BASE + SMALL_OFFSET for each load/store
+   with a base reg referenced at least 4 times. */
+
+void
+pass_shorten_memrefs::transform (regno_map *m, basic_block bb)
+{
+  rtx_insn *insn;
+  FOR_BB_INSNS (bb, insn)
+    {
+      if (!NONJUMP_INSN_P (insn))
+        continue;
+      rtx pat = PATTERN (insn);
+      if (GET_CODE (pat) != SET)
+        continue;
+      start_sequence ();
+      /* Transform stores first then loads. */
+      for (int i = 0; i < 2; i++)
+        {
+          rtx mem = XEXP (pat, i);
+          rtx addr;
+          if (get_si_mem_base_reg (mem, &addr))
+            {
+              HOST_WIDE_INT regno = REGNO (XEXP (addr, 0));
+              /* Do not transform store zero as these cannot be compressed. */
+              if (i == 0)
+                {
+                  if (XEXP (pat, 1) == CONST0_RTX (GET_MODE (XEXP (pat, 1))))
+                    continue;
+                }
+              if (m->get_or_insert (regno) > 3)
+                {
+                  addr
+                    = targetm.legitimize_address (addr, addr, GET_MODE (mem));
+                  XEXP (pat, i) = replace_equiv_address (mem, addr);
+                  df_insn_rescan (insn);
+                }
+            }
+        }
+      rtx_insn *seq = get_insns ();
+      end_sequence ();
+      emit_insn_before (seq, insn);
+    }
+}
+
+unsigned int
+pass_shorten_memrefs::execute (function *fn)
+{
+  basic_block bb;
+
+  FOR_ALL_BB_FN (bb, fn)
+  {
+    regno_map *m;
+    if (optimize_bb_for_speed_p (bb))
+      continue;
+    m = analyze (bb);
+    transform (m, bb);
+  }
+
+  return 0;
+}
+
+} // anon namespace
+
+rtl_opt_pass *
+make_pass_shorten_memrefs (gcc::context *ctxt)
+{
+  return new pass_shorten_memrefs (ctxt);
+}
diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 94b5ac01762..0c0879c8aee 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -55,6 +55,7 @@ along with GCC; see the file COPYING3. If not see
 #include "diagnostic.h"
 #include "builtins.h"
 #include "predict.h"
+#include "tree-pass.h"
 
 /* True if X is an UNSPEC wrapper around a SYMBOL_REF or LABEL_REF. */
 #define UNSPEC_ADDRESS_P(X) \
@@ -848,6 +849,52 @@ riscv_legitimate_address_p (machine_mode mode, rtx x, bool strict_p)
   return riscv_classify_address (&addr, x, mode, strict_p);
 }
 
+/* Return true if hard reg REGNO can be used in compressed instructions. */
+
+static bool
+riscv_compressed_reg_p (int regno)
+{
+  /* x8-x15/f8-f15 are compressible registers. */
+  return (TARGET_RVC && (IN_RANGE (regno, GP_REG_FIRST + 8, GP_REG_FIRST + 15)
+          || IN_RANGE (regno, FP_REG_FIRST + 8, FP_REG_FIRST + 15)));
+}
+
+/* Return true if x is an unsigned 5-bit immediate scaled by 4. */
+
+static bool
+riscv_compressed_lw_offset_p (rtx x)
+{
+  return (CONST_INT_P (x)
+          && (INTVAL (x) & 3) == 0
+          && IN_RANGE (INTVAL (x), 0, CSW_MAX_OFFSET));
+}
+
+/* Return true if load/store from/to address x can be compressed. */
+
+static bool
+riscv_compressed_lw_address_p (rtx x)
+{
+  struct riscv_address_info addr;
+  bool result = riscv_classify_address (&addr, x, GET_MODE (x),
+                                        reload_completed);
+
+  /* Before reload, assuming all load/stores of valid addresses get compressed
+     gives better code size than checking if the address is reg + small_offset
+     early on. */
+  if (result && !reload_completed)
+    return true;
+
+  /* Return false if address is not compressed_reg + small_offset. */
+  if (!result
+      || addr.type != ADDRESS_REG
+      || (!riscv_compressed_reg_p (REGNO (addr.reg))
+          && addr.reg != stack_pointer_rtx)
+      || !riscv_compressed_lw_offset_p (addr.offset))
+    return false;
+
+  return result;
+}
+
 /* Return the number of instructions needed to load or store a value
    of mode MODE at address X. Return 0 if X isn't valid for MODE.
    Assume that multiword moves may need to be split into word moves
@@ -1308,6 +1355,24 @@ riscv_force_address (rtx x, machine_mode mode)
   return x;
 }
 
+/* Modify base + offset so that offset fits within a compressed load/store insn
+   and the excess is added to base. */
+
+static rtx
+riscv_shorten_lw_offset (rtx base, HOST_WIDE_INT offset)
+{
+  rtx addr, high;
+  /* Leave OFFSET as an unsigned 5-bit offset scaled by 4 and put the excess
+     into HIGH. */
+  high = GEN_INT (offset & ~CSW_MAX_OFFSET);
+  offset &= CSW_MAX_OFFSET;
+  if (!SMALL_OPERAND (INTVAL (high)))
+    high = force_reg (Pmode, high);
+  base = force_reg (Pmode, gen_rtx_PLUS (Pmode, high, base));
+  addr = plus_constant (Pmode, base, offset);
+  return addr;
+}
+
 /* This function is used to implement LEGITIMIZE_ADDRESS. If X can
    be legitimized in a way that the generic machinery might not expect,
    return a new address, otherwise return NULL. MODE is the mode of
@@ -1326,7 +1391,7 @@ riscv_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
   if (riscv_split_symbol (NULL, x, mode, &addr, FALSE))
     return riscv_force_address (addr, mode);
 
-  /* Handle BASE + OFFSET using riscv_add_offset. */
+  /* Handle BASE + OFFSET. */
   if (GET_CODE (x) == PLUS && CONST_INT_P (XEXP (x, 1))
       && INTVAL (XEXP (x, 1)) != 0)
     {
@@ -1335,7 +1400,14 @@ riscv_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
 
       if (!riscv_valid_base_register_p (base, mode, false))
         base = copy_to_mode_reg (Pmode, base);
-      addr = riscv_add_offset (NULL, base, offset);
+      if (optimize_function_for_size_p (cfun)
+          && (strcmp (current_pass->name, "shorten_memrefs") == 0)
+          && mode == SImode)
+        /* Convert BASE + LARGE_OFFSET into NEW_BASE + SMALL_OFFSET to allow
+           possible compressed load/store. */
+        addr = riscv_shorten_lw_offset (base, offset);
+      else
+        addr = riscv_add_offset (NULL, base, offset);
       return riscv_force_address (addr, mode);
     }
 
@@ -1833,6 +1905,11 @@ riscv_address_cost (rtx addr, machine_mode mode,
                     addr_space_t as ATTRIBUTE_UNUSED,
                     bool speed ATTRIBUTE_UNUSED)
 {
+  /* When optimizing for size, make uncompressible 32-bit addresses more
+   * expensive so that compressible 32-bit addresses are preferred. */
+  if (TARGET_RVC && !speed && riscv_mshorten_memrefs && mode == SImode
+      && !riscv_compressed_lw_address_p (addr))
+    return riscv_address_insns (addr, mode, false) + 1;
   return riscv_address_insns (addr, mode, false);
 }
 
@@ -4666,6 +4743,7 @@ riscv_option_override (void)
     error ("%<-mriscv-attribute%> RISC-V ELF attribute requires GNU as 2.32"
            " [%<-mriscv-attribute%>]");
 #endif
+
 }
 
 /* Implement TARGET_CONDITIONAL_REGISTER_USAGE. */
@@ -4705,9 +4783,9 @@ riscv_conditional_register_usage (void)
 static int
 riscv_register_priority (int regno)
 {
-  /* Favor x8-x15/f8-f15 to improve the odds of RVC instruction selection. */
-  if (TARGET_RVC && (IN_RANGE (regno, GP_REG_FIRST + 8, GP_REG_FIRST + 15)
-                     || IN_RANGE (regno, FP_REG_FIRST + 8, FP_REG_FIRST + 15)))
+  /* Favor compressed registers to improve the odds of RVC instruction
+     selection. */
+  if (riscv_compressed_reg_p (regno))
     return 1;
 
   return 0;
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 567c23380fe..e6209ede9d6 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -920,6 +920,7 @@ extern unsigned riscv_stack_boundary;
 #define SHIFT_RS1 15
 #define SHIFT_IMM 20
 #define IMM_BITS 12
+#define C_S_BITS 5
 #define C_SxSP_BITS 6
 
 #define IMM_REACH (1LL << IMM_BITS)
@@ -929,6 +930,10 @@ extern unsigned riscv_stack_boundary;
 #define SWSP_REACH (4LL << C_SxSP_BITS)
 #define SDSP_REACH (8LL << C_SxSP_BITS)
 
+/* This is the maximum value that can be represented in a compressed load/store
+   offset (an unsigned 5-bit value scaled by 4). */
+#define CSW_MAX_OFFSET (((4LL << C_S_BITS) - 1) & ~3)
+
 /* Called from RISCV_REORG, this is defined in riscv-sr.c. */
 
 extern void riscv_remove_unneeded_save_restore_calls (void);
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 29de246759e..e4bfcb86f51 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -87,6 +87,12 @@ msave-restore
 Target Report Mask(SAVE_RESTORE)
 Use smaller but slower prologue and epilogue code.
 
+mshorten-memrefs
+Target Bool Var(riscv_mshorten_memrefs) Init(1)
+Convert BASE + LARGE_OFFSET addresses to NEW_BASE + SMALL_OFFSET to allow more
+memory accesses to be generated as compressed instructions. Currently targets
+32-bit integer load/stores.
+
 mcmodel=
 Target Report RejectNegative Joined Enum(code_model) Var(riscv_cmodel) Init(TARGET_DEFAULT_CMODEL)
 Specify the code model.
diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv
index 5ecb3c160a6..4820fb35d31 100644
--- a/gcc/config/riscv/t-riscv
+++ b/gcc/config/riscv/t-riscv
@@ -19,3 +19,8 @@ riscv-d.o: $(srcdir)/config/riscv/riscv-d.c
 	$(COMPILE) $<
 	$(POSTCOMPILE)
 
+riscv-shorten-memrefs.o: $(srcdir)/config/riscv/riscv-shorten-memrefs.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
+
+PASSES_EXTRA += $(srcdir)/config/riscv/riscv-passes.def
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a37a2ee9c19..ad4c6d94f82 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1129,6 +1129,7 @@ See RS/6000 and PowerPC Options.
 -mpreferred-stack-boundary=@var{num} @gol
 -msmall-data-limit=@var{N-bytes} @gol
 -msave-restore -mno-save-restore @gol
+-mshorten-memrefs -mno-shorten-memrefs @gol
 -mstrict-align -mno-strict-align @gol
 -mcmodel=medlow -mcmodel=medany @gol
 -mexplicit-relocs -mno-explicit-relocs @gol
@@ -25321,6 +25322,15 @@ Do or don't use smaller but slower prologue and epilogue code that uses
 library function calls. The default is to use fast inline prologues and
 epilogues.
 
+@item -mshorten-memrefs
+@itemx -mno-shorten-memrefs
+@opindex mshorten-memrefs
+Do or do not attempt to make more use of compressed load/store instructions by
+replacing a load/store of 'base register + large offset' with a new load/store
+of 'new base + small offset'. If the new base gets stored in a compressed
+register, then the new load/store can be compressed. Currently targets 32-bit
+integer load/stores only.
+
 @item -mstrict-align
 @itemx -mno-strict-align
 @opindex mstrict-align
diff --git a/gcc/testsuite/gcc.target/riscv/shorten-memrefs-1.c b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-1.c
new file mode 100644
index 00000000000..958942a6f7f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-1.c
@@ -0,0 +1,26 @@
+/* { dg-options "-Os -march=rv32imc -mabi=ilp32" } */
+
+/* These stores cannot be compressed because x0 is not a compressed reg.
+   Therefore the shorten_memrefs pass should not attempt to rewrite them into a
+   compressible format. */
+
+void
+store1z (int *array)
+{
+  array[200] = 0;
+  array[201] = 0;
+  array[202] = 0;
+  array[203] = 0;
+}
+
+void
+store2z (long long *array)
+{
+  array[200] = 0;
+  array[201] = 0;
+  array[202] = 0;
+  array[203] = 0;
+}
+
+/* { dg-final { scan-assembler-not "store1z:\n\taddi" } } */
+/* { dg-final { scan-assembler-not "store2z:\n\taddi" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/shorten-memrefs-2.c b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-2.c
new file mode 100644
index 00000000000..2c2f41548c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-2.c
@@ -0,0 +1,51 @@
+/* { dg-options "-Os -march=rv32imc -mabi=ilp32" } */
+
+/* shorten_memrefs should rewrite these load/stores into a compressible
+   format. */
+
+void
+store1a (int *array, int a)
+{
+  array[200] = a;
+  array[201] = a;
+  array[202] = a;
+  array[203] = a;
+}
+
+void
+store2a (long long *array, long long a)
+{
+  array[200] = a;
+  array[201] = a;
+  array[202] = a;
+  array[203] = a;
+}
+
+int
+load1r (int *array)
+{
+  int a = 0;
+  a += array[200];
+  a += array[201];
+  a += array[202];
+  a += array[203];
+  return a;
+}
+
+long long
+load2r (long long *array)
+{
+  int a = 0;
+  a += array[200];
+  a += array[201];
+  a += array[202];
+  a += array[203];
+  return a;
+}
+
+/* { dg-final { scan-assembler "store1a:\n\taddi" } } */
+/* The sd insns in store2a are not rewritten because shorten_memrefs currently
+   only optimizes lw and sw.
+/* { dg-final { scan-assembler "store2a:\n\taddi" { xfail riscv*-*-* } } } */
+/* { dg-final { scan-assembler "load1r:\n\taddi" } } */
+/* { dg-final { scan-assembler "load2r:\n\taddi" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/shorten-memrefs-3.c b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-3.c
new file mode 100644
index 00000000000..2001fe871ee
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-3.c
@@ -0,0 +1,39 @@
+/* { dg-options "-Os -march=rv32imc -mabi=ilp32" } */
+
+/* These loads cannot be compressed because only one compressed reg is
+   available (since args are passed in a0-a4, that leaves a5-a7 available, of
+   which only a5 is a compressed reg). Therefore the shorten_memrefs pass should
+   not attempt to rewrite these loads into a compressible format. It may not
+   be possible to avoid this because shorten_memrefs happens before reg alloc.
+*/
+
+extern int sub1 (int, int, int, int, int, int, int);
+
+int
+load1a (int a0, int a1, int a2, int a3, int a4, int *array)
+{
+  int a = 0;
+  a += array[200];
+  a += array[201];
+  a += array[202];
+  a += array[203];
+  return sub1 (a0, a1, a2, a3, a4, 0, a);
+}
+
+extern long long sub2 (long long, long long, long long, long long, long long,
+                       long long, long long);
+
+long long
+load2a (long long a0, long long a1, long long a2, long long a3, long long a4,
+        long long *array)
+{
+  int a = 0;
+  a += array[200];
+  a += array[201];
+  a += array[202];
+  a += array[203];
+  return sub2 (a0, a1, a2, a3, a4, 0, a);
+}
+
+/* { dg-final { scan-assembler-not "load1a:\n\taddi" { xfail riscv*-*-* } } } */
+/* { dg-final { scan-assembler-not "load2a:\n.*addi\[ \t\]*\[at\]\[0-9\],\[at\]\[0-9\],\[0-9\]*" { xfail riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/shorten-memrefs-4.c b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-4.c
new file mode 100644
index 00000000000..cd4784913e4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-4.c
@@ -0,0 +1,26 @@
+/* { dg-options "-Os -march=rv64imc -mabi=lp64" } */
+
+/* These stores cannot be compressed because x0 is not a compressed reg.
+   Therefore the shorten_memrefs pass should not attempt to rewrite them into a
+   compressible format. */
+
+void
+store1z (int *array)
+{
+  array[200] = 0;
+  array[201] = 0;
+  array[202] = 0;
+  array[203] = 0;
+}
+
+void
+store2z (long long *array)
+{
+  array[200] = 0;
+  array[201] = 0;
+  array[202] = 0;
+  array[203] = 0;
+}
+
+/* { dg-final { scan-assembler-not "store1z:\n\taddi" } } */
+/* { dg-final { scan-assembler-not "store2z:\n\taddi" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/shorten-memrefs-5.c b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-5.c
new file mode 100644
index 00000000000..80b3897e4da
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-5.c
@@ -0,0 +1,53 @@
+/* { dg-options "-Os -march=rv64imc -mabi=lp64" } */
+
+/* shorten_memrefs should rewrite these load/stores into a compressible
+   format. */
+
+void
+store1a (int *array, int a)
+{
+  array[200] = a;
+  array[201] = a;
+  array[202] = a;
+  array[203] = a;
+}
+
+void
+store2a (long long *array, long long a)
+{
+  array[200] = a;
+  array[201] = a;
+  array[202] = a;
+  array[203] = a;
+}
+
+int
+load1r (int *array)
+{
+  int a = 0;
+  a += array[200];
+  a += array[201];
+  a += array[202];
+  a += array[203];
+  return a;
+}
+
+long long
+load2r (long long *array)
+{
+  int a = 0;
+  a += array[200];
+  a += array[201];
+  a += array[202];
+  a += array[203];
+  return a;
+}
+
+/* { dg-final { scan-assembler "store1a:\n\taddi" } } */
+/* The sd insns in store2a are not rewritten because shorten_memrefs currently
+   only optimizes lw and sw.
+/* { dg-final { scan-assembler "store2a:\n\taddi" { xfail riscv*-*-* } } } */
+/* { dg-final { scan-assembler "load1r:\n\taddi" } } */
+/* The ld insns in load2r are not rewritten because shorten_memrefs currently
+   only optimizes lw and sw.
+/* { dg-final { scan-assembler "load2r:\n\taddi" { xfail riscv*-*-* } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/shorten-memrefs-6.c b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-6.c
new file mode 100644
index 00000000000..3403c7044df
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-6.c
@@ -0,0 +1,39 @@
+/* { dg-options "-Os -march=rv64imc -mabi=lp64" } */
+
+/* These loads cannot be compressed because only one compressed reg is
+   available (since args are passed in a0-a4, that leaves a5-a7 available, of
+   which only a5 is a compressed reg). Therefore the shorten_memrefs pass should
+   not attempt to rewrite these loads into a compressible format. It may not
+   be possible to avoid this because shorten_memrefs happens before reg alloc.
+*/
+
+extern int sub1 (int, int, int, int, int, int, int);
+
+int
+load1a (int a0, int a1, int a2, int a3, int a4, int *array)
+{
+  int a = 0;
+  a += array[200];
+  a += array[201];
+  a += array[202];
+  a += array[203];
+  return sub1 (a0, a1, a2, a3, a4, 0, a);
+}
+
+extern long long sub2 (long long, long long, long long, long long, long long,
+                       long long, long long);
+
+long long
+load2a (long long a0, long long a1, long long a2, long long a3, long long a4,
+        long long *array)
+{
+  int a = 0;
+  a += array[200];
+  a += array[201];
+  a += array[202];
+  a += array[203];
+  return sub2 (a0, a1, a2, a3, a4, 0, a);
+}
+
+/* { dg-final { scan-assembler-not "load1a:\n\taddi" { xfail riscv*-*-* } } } */
+/* { dg-final { scan-assembler-not "load2a:\n.*addi\[ \t\]*\[at\]\[0-9\],\[at\]\[0-9\],\[0-9\]*" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/shorten-memrefs-7.c b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-7.c
new file mode 100644
index 00000000000..a5833fd356d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/shorten-memrefs-7.c
@@ -0,0 +1,46 @@
+/* { dg-options "-Os -march=rv32imc -mabi=ilp32 -mno-shorten-memrefs" } */
+
+/* Check that these load/stores do not get rewritten into a compressible format
+   when shorten_memrefs is disabled. */
+
+void
+store1a (int *array, int a)
+{
+  array[200] = a;
+  array[201] = a;
+  array[202] = a;
+  array[203] = a;
+}
+
+void
+store2a (long long *array, long long a)
+{
+  array[200] = a;
+  array[201] = a;
+  array[202] = a;
+  array[203] = a;
+}
+
+int
+load1r (int *array)
+{
+  int a = 0;
+  a += array[200];
+  a += array[201];
+  a += array[202];
+  a += array[203];
+  return a;
+}
+
+long long
+load2r (long long *array)
+{
+  int a = 0;
+  a += array[200];
+  a += array[201];
+  a += array[202];
+  a += array[203];
+  return a;
+}
+
+/* { dg-final { scan-assembler-not "addi" } } */
-- 
2.17.1
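
For readers following the offset arithmetic, the split performed by riscv_shorten_lw_offset and the range check in riscv_compressed_lw_offset_p can be sketched in standalone C. This is only an illustration under stated assumptions, not code from the patch: struct lw_split, split_lw_offset and compressed_lw_offset_p are hypothetical names, GCC's rtx and HOST_WIDE_INT are replaced by plain int64_t, and CSW_MAX_OFFSET is written fully parenthesized so that ~CSW_MAX_OFFSET complements the whole value.

```c
#include <assert.h>
#include <stdint.h>

/* Constants mirroring the riscv.h additions.  CSW_MAX_OFFSET is fully
   parenthesized so that ~CSW_MAX_OFFSET complements the whole value.  */
#define C_S_BITS 5
#define CSW_MAX_OFFSET (((4LL << C_S_BITS) - 1) & ~3) /* 124 */

/* Hypothetical helper type for this sketch; not part of the patch.  */
struct lw_split
{
  int64_t high; /* excess that gets added to the base register */
  int64_t low;  /* residual offset kept in the load/store itself */
};

/* Split a word-aligned OFFSET into HIGH + LOW, using the same masks as
   riscv_shorten_lw_offset.  Hypothetical stand-in for the rtx code.  */
static struct lw_split
split_lw_offset (int64_t offset)
{
  struct lw_split s;
  assert ((offset & 3) == 0); /* SImode accesses are assumed aligned */
  s.high = offset & ~CSW_MAX_OFFSET;
  s.low = offset & CSW_MAX_OFFSET;
  return s;
}

/* Nonzero if OFFSET is a multiple of 4 in the range 0..CSW_MAX_OFFSET,
   i.e. an unsigned 5-bit immediate scaled by 4.  */
static int
compressed_lw_offset_p (int64_t offset)
{
  return (offset & 3) == 0 && offset >= 0 && offset <= CSW_MAX_OFFSET;
}
```

For the int accesses in the tests above, array[200] through array[203] sit at byte offsets 800 to 812. Under this sketch each of them splits into a shared excess of 768, which is added to the base once, plus residual offsets 32, 36, 40 and 44, all of which pass the range check.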