From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 27654 invoked by alias); 31 Oct 2019 10:18:50 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 27645 invoked by uid 89); 31 Oct 2019 10:18:50 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-10.9 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_2,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=money, HX-Received:c515, H*i:sk:MN2PR04, H*f:sk:MN2PR04 X-HELO: mail-ed1-f66.google.com Received: from mail-ed1-f66.google.com (HELO mail-ed1-f66.google.com) (209.85.208.66) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 31 Oct 2019 10:18:48 +0000 Received: by mail-ed1-f66.google.com with SMTP id y7so4299910edo.12 for ; Thu, 31 Oct 2019 03:18:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=s1W57tL6GI2vIsW2i5LH7hfHwJ8PU+YlSBTx+MO+XR4=; b=PyZBL4Cb8yg7s0JUI2UZcwqSq6KsRMMxGUlfHOCzWLNjDt8kyBdVXHQ4ULx2C5wYXz AEi/TFDatt2WoSy5oEhWb3MIkjozAdgZJk8yB+XlLyhf7qW9UIu2z3ArdVU5h1UUu3q3 fOdRweL2aAYh5o2X+2U04smx0qUmM9toDjqQnXVnZMB4ULD7njBUKoFGtDlox4i1kifW +Yt+of2D8GhZw0GSpbrOSMeI17bNiYQ8cqnLJA/IXoQENxRgPlTflyviprG9saSWD5WD 9joUgUWNIGaEZKbIfvjMKJsfBSfWbF0kwIY5NQfCrhaDPwRmujn6617Zj7ZMycksrn9r IAUA== Return-Path: Received: from mail-wm1-f49.google.com (mail-wm1-f49.google.com. [209.85.128.49]) by smtp.gmail.com with ESMTPSA id d26sm71551edu.37.2019.10.31.03.18.44 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 31 Oct 2019 03:18:45 -0700 (PDT) Received: by mail-wm1-f49.google.com with SMTP id w9so5267278wmm.5 for ; Thu, 31 Oct 2019 03:18:44 -0700 (PDT) MIME-Version: 1.0 References: <1572025151-22783-1-git-send-email-craig.blackmore@embecosm.com> <1572025151-22783-2-git-send-email-craig.blackmore@embecosm.com> In-Reply-To: From: Andrew Waterman Date: Thu, 31 Oct 2019 10:42:00 -0000 Message-ID: Subject: Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass To: Nidal Faour Cc: Jim Wilson , Craig Blackmore , GCC Patches , Ofer Shinaar , Kito Cheng , Jeff Law Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-SW-Source: 2019-10/txt/msg02217.txt.bz2 Nevertheless, as Jim observed, it's a great burden on the RISC-V backend maintainers to support all these passes. Are you saying WD is willing to put its money where its mouth is and will provide active support for these passes? On Thu, Oct 31, 2019 at 2:42 AM Nidal Faour wrote: > > A tests case for this patch has been written and pushed to WD github repo= sitory at the following link: > https://github.com/westerndigitalcorporation/riscv32-Code-density-test-be= nch/tree/master/common_load_store_base_address_offset > > the test case itself has been written based on a real product scenario. > We got a saving of about 10% in code size of the test itself. > > Best Regards, > > Nidal Faour > Staff Engineer, R&D Engineering =E2=80=93 Firmware & Toolchain, CTO Group > > Western Digital=C2=AE > Migdal Tefen 24959, P.O Box 3 > Email: nidal.faour@wdc.com > Office: +972-4-9078756 > Mobile: +972-50-8867756 > > >-----Original Message----- > >From: Jim Wilson > >Sent: Thursday, 31 October 2019 1:57 > >To: Craig Blackmore > >Cc: GCC Patches ; Ofer Shinaar > >; Nidal Faour ; Kito Cheng > >; Jeff Law > >Subject: Re: [PATCH v2 1/2] RISC-V: Add shorten_memrefs pass > > > >CAUTION: This email originated from outside of Western Digital. Do not c= lick > >on links or open attachments unless you recognize the sender and know th= at > >the content is safe. > > > > > >On Fri, Oct 25, 2019 at 10:40 AM Craig Blackmore > > wrote: > >> This patch aims to allow more load/store instructions to be compressed > >> by replacing a load/store of 'base register + large offset' with a new > >> load/store of 'new base + small offset'. If the new base gets stored > >> in a compressed register, then the new load/store can be compressed. > >> Since there is an overhead in creating the new base, this change is > >> only attempted when 'base register' is referenced in at least 4 load/s= tores in > >a basic block. > >> > >> The optimization is implemented in a new RISC-V specific pass called > >> shorten_memrefs which is enabled for RVC targets. It has been > >> developed for the 32-bit lw/sw instructions but could also be extended= to > >64-bit ld/sd in future. > > > >The fact that this needs 4 load/stores in a block with the same base add= ress > >means it won't trigger very often. But it does seem to work. > >I see about a 0.05% code size reduction for a rv32gc newlib nano libc.a.= There > >might be other codes that benefit more. > > > >I'm concerned about the number of RISC-V specific optimization passes > >people are writing. I've seen at least 3 so far. This one is at least = small and > >simple enough to understand that it doesn't look like it will cause main= tenance > >problems. > > > >The config.gcc change conflicts with the save-restore optimization pass = that > >Andrew Burgess added, but that is a trivial fix. > > > >The code only works for 32-bit load/stores. rv64 has compressed 64-bit > >load/stores, and the F and D extensions have float and double compressed > >loads and stores. The patch would be more useful if it handled all of t= hese. > > > >The patch doesn't check the other operand, it only looks at the memory > >operand. This results in some cases where the code rewrites instruction= s that > >can never be compressed. For instance, given void store1z (int *array) { > > array[200] =3D 0; > > array[201] =3D 0; > > array[202] =3D 0; > > array[203] =3D 0; > >} > >the patch increases code size by 4 bytes because it rewrites the stores,= but we > >don't have compressed store 0, and x0 is not a compressed reg, so there = is no > >point in rewriting the stores. I can also create examples that show a s= ize > >increase with loads but it is a little trickier. > >extern int sub1 (int, int, int, int, int, int, int); int load1a (int a0,= int a1, int a2, int > >a3, int a4, int *array) { > > int a =3D 0; > > a +=3D array[200]; > > a +=3D array[201]; > > a +=3D array[202]; > > a +=3D array[203]; > > return sub1 (a0, a1, a2, a3, a4, 0, a); } The register allocator will = pass a0-a4 > >directly through from args to the call, leaving only a5-a7 for the load = base > >address and dest, and only one of those regs is a compressed reg, but we > >need two, so these loads can't be compressed. The code still gets rewri= tten, > >resulting in a size increase for the extra add. Not sure if you can do = anything > >about that though, since you are doing this before register allocation. > > > >It isn't clear that the change riscv_address_cost is for. That should be > >commented. > > > >I'd suggest parens in the riscv_compressed_lw_offset_p return statement, > >that makes it easier to align the code correctly (in emacs at least). > > > >The RISCV_MAX_COMPRESSED_LW_OFFSET should be moved down to the > >"ISA constants" section of riscv.h, and maybe renamed similar to the oth= er > >constants. > > > >There is a REG_P check in get_si_mem_reg. That should probably handle > >SUBREGs too. > > > >A testcase to verify the patch would be nice. I have one I wrote for te= sting > >that shows some tests that work, some tests that don't work, and some th= at > >work but maybe shouldn't. > > > >I vaguely remember Micheal Meissner talking about doing something similar > >for PPC in a GNU Cauldron maybe 2 years ago. But I don't know if he tri= ed, or > >how far he got if he did try. It might be useful to ask him about that = work. > > > >Otherwise this looks OK to me. > > > >Jim