From: Jim Wilson
Date: Thu, 31 Oct 2019 00:00:00 -0000
Subject: Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass
To: Craig Blackmore
Cc: GCC Patches, Ofer Shinaar, Nidal Faour, Kito Cheng, Jeff Law
In-Reply-To: <1572025151-22783-2-git-send-email-craig.blackmore@embecosm.com>
On Fri, Oct 25, 2019 at 10:40 AM Craig Blackmore wrote:
> This patch aims to allow more load/store instructions to be compressed by
> replacing a load/store of 'base register + large offset' with a new load/store
> of 'new base + small offset'. If the new base gets stored in a compressed
> register, then the new load/store can be compressed. Since there is an overhead
> in creating the new base, this change is only attempted when 'base register' is
> referenced in at least 4 load/stores in a basic block.
>
> The optimization is implemented in a new RISC-V specific pass called
> shorten_memrefs which is enabled for RVC targets. It has been developed for the
> 32-bit lw/sw instructions but could also be extended to 64-bit ld/sd in future.

The fact that this needs 4 load/stores in a block with the same base
address means it won't trigger very often.  But it does seem to work.  I
see about a 0.05% code size reduction for an rv32gc newlib nano libc.a.
There might be other programs that benefit more.

I'm concerned about the number of RISC-V specific optimization passes
people are writing.  I've seen at least 3 so far.  This one is at least
small and simple enough to understand that it doesn't look like it will
cause maintenance problems.

The config.gcc change conflicts with the save-restore optimization pass
that Andrew Burgess added, but that is a trivial fix.

The code only works for 32-bit load/stores.  rv64 has compressed 64-bit
load/stores, and the F and D extensions have compressed float and double
loads and stores.  The patch would be more useful if it handled all of
these.

The patch doesn't check the other operand; it only looks at the memory
operand.  This results in some cases where the code rewrites
instructions that can never be compressed.
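For reference, the rewrite the pass performs can be sketched at the C
level as follows.  The function names here are illustrative, not from
the patch; the relevant constraint is that c.lw/c.sw can only encode
word-aligned offsets in the range 0..124:

```c
#include <assert.h>

/* Before the pass: each array[200+i] access needs a byte offset of
   800+4*i, far outside the 0..124 range that c.lw/c.sw can encode,
   so every load stays a full-size 4-byte instruction.  */
int
sum_before (int *array)
{
  return array[200] + array[201] + array[202] + array[203];
}

/* After the pass: one extra add materializes a new base, and each
   access then uses a small offset (0, 4, 8, 12) that fits the
   compressed encoding, provided the new base and the destination
   both land in RVC-addressable registers (x8-x15).  */
int
sum_after (int *array)
{
  int *base = array + 200;  /* the new base */
  return base[0] + base[1] + base[2] + base[3];
}
```

The 4-access threshold in the patch reflects that the rewrite always
pays for that one extra add up front.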
For instance, given

  void
  store1z (int *array)
  {
    array[200] = 0;
    array[201] = 0;
    array[202] = 0;
    array[203] = 0;
  }

the patch increases code size by 4 bytes, because it rewrites the
stores, but we don't have a compressed store of zero, and x0 is not a
compressed reg, so there is no point in rewriting the stores.

I can also create examples that show a size increase with loads, but it
is a little trickier.

  extern int sub1 (int, int, int, int, int, int, int);

  int
  load1a (int a0, int a1, int a2, int a3, int a4, int *array)
  {
    int a = 0;
    a += array[200];
    a += array[201];
    a += array[202];
    a += array[203];
    return sub1 (a0, a1, a2, a3, a4, 0, a);
  }

The register allocator will pass a0-a4 directly through from the args to
the call, leaving only a5-a7 for the load base address and dest.  Only
one of those regs is a compressed reg, but we need two, so these loads
can't be compressed.  The code still gets rewritten, resulting in a size
increase for the extra add.  I'm not sure you can do anything about
that, though, since you are doing this before register allocation.

It isn't clear what the change to riscv_address_cost is for.  That
should be commented.

I'd suggest parens in the riscv_compressed_lw_offset_p return statement;
that makes it easier to align the code correctly (in emacs at least).

The RISCV_MAX_COMPRESSED_LW_OFFSET define should be moved down to the
"ISA constants" section of riscv.h, and maybe renamed similarly to the
other constants.

There is a REG_P check in get_si_mem_reg.  That should probably handle
SUBREGs too.

A testcase to verify the patch would be nice.  I have one I wrote for
testing that shows some tests that work, some tests that don't work, and
some that work but maybe shouldn't.

I vaguely remember Michael Meissner talking about doing something
similar for PPC at a GNU Cauldron maybe 2 years ago.  But I don't know
if he tried, or how far he got if he did try.  It might be useful to ask
him about that work.

Otherwise this looks OK to me.

Jim