Message-ID: <87be2392-09b4-ca67-5f93-3bac3efa858c@gmail.com>
Date: Sat, 17 Jun 2023 12:31:19 -0600
Subject: Re: [PATCH v2] Implement new RTL optimizations pass: fold-mem-offsets.
To: Manolis Tsamis, gcc-patches@gcc.gnu.org
Cc: Richard Biener, Philipp Tomsich
From: Jeff Law
In-Reply-To: <20230615172817.3587006-1-manolis.tsamis@vrull.eu>

On 6/15/23 11:28, Manolis Tsamis wrote:
> This is a new RTL pass that tries to optimize memory offset calculations
> by moving them from add immediate instructions to the memory loads/stores.
> For example it can transform this:
>
>   addi t4,sp,16
>   add  t2,a6,t4
>   shl  t3,t2,1
>   ld   a2,0(t3)
>   addi a2,1
>   sd   a2,8(t2)
>
> into the following (one instruction less):
>
>   add  t2,a6,sp
>   shl  t3,t2,1
>   ld   a2,32(t3)
>   addi a2,1
>   sd   a2,24(t2)
>
> Although there are places where this is done already, this pass is more
> powerful and can handle the more difficult cases that are currently not
> optimized. Also, it runs late enough and can optimize away unnecessary
> stack pointer calculations.
>
> gcc/ChangeLog:
>
>         * Makefile.in: Add fold-mem-offsets.o.
>         * passes.def: Schedule a new pass.
>         * tree-pass.h (make_pass_fold_mem_offsets): Declare.
>         * common.opt: New options.
>         * doc/invoke.texi: Document new option.
>         * fold-mem-offsets.cc: New file.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/fold-mem-offsets-1.c: New test.
>         * gcc.target/riscv/fold-mem-offsets-2.c: New test.
>         * gcc.target/riscv/fold-mem-offsets-3.c: New test.
>
> Signed-off-by: Manolis Tsamis
> ---
>
> Changes in v2:
>         - Made the pass target-independent instead of RISC-V specific.
>         - Fixed a number of bugs.
>         - Added code to handle more ADD patterns as found
>           in other targets (x86, aarch64).
>         - Improved naming and comments.
>         - Fixed bitmap memory leak.
>
> diff --git a/gcc/fold-mem-offsets.cc b/gcc/fold-mem-offsets.cc
> new file mode 100644
> index 00000000000..8ef0f438191
> --- /dev/null
> +++ b/gcc/fold-mem-offsets.cc
> @@ -0,0 +1,630 @@
> +/* Late RTL pass to fold memory offsets.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#define IN_TARGET_CODE 1

Do we still need this #define?

> +/* Tracks which instructions can be reached through instructions that can
> +   propagate offsets for folding.  */
> +static bitmap_head can_fold_insns;

Is there any reason why you're using "bitmap_head" rather than just the
generic "bitmap" type?

Also note that since you've got a class you could just put these objects
into the class and make the routines that use them member functions.
It's marginally cleaner than using static variables.

> +
> +/* Helper function that performs the actions defined by PHASE for INSN.  */
> +static void
> +fold_mem_offsets_driver (rtx_insn* insn, int phase)
> +{
> +  if (phase == FM_PHASE_COMMIT_INSNS)

So FM_PHASE_COMMIT_INSNS doesn't share any code with the other phases.
Would it be better to just factor this into a distinct function?

> +    {
> +      rtx mem = get_foldable_mem (insn);
> +
> +      if (!mem)
> +        return;
> +
> +      rtx mem_addr = XEXP (mem, 0);
> +      rtx reg;
> +      HOST_WIDE_INT cur_offset;
> +
> +      if (REG_P (mem_addr))
> +        {
> +          reg = mem_addr;
> +          cur_offset = 0;
> +        }
> +      else if (GET_CODE (mem_addr) == PLUS
> +               && REG_P (XEXP (mem_addr, 0))
> +               && CONST_INT_P (XEXP (mem_addr, 1)))
> +        {
> +          reg = XEXP (mem_addr, 0);
> +          cur_offset = INTVAL (XEXP (mem_addr, 1));
> +        }
> +      else
> +        return;

So this is common to the non-commit phases.  Would it be cleaner to
factor it into its own function, then factor each of the non-commit
phases into their own function which calls this common routine?

> +  else if (phase == FM_PHASE_VALIDITY)
> +    {
> +      bitmap_head fold_insns;
> +      bitmap_initialize (&fold_insns, NULL);

Note that we have auto-bitmap types which will clean up after themselves
so that you don't have to manage allocation/deallocation.

Overall it looks really good.
I could make an argument to include it now, but I think one more cycle
would be best.  In the meantime, I've updated my tester to use the V2
version.

Thanks!

jeff