Message-ID: <87be2392-09b4-ca67-5f93-3bac3efa858c@gmail.com>
Date: Sat, 17 Jun 2023 12:31:19 -0600
Subject: Re: [PATCH v2] Implement new RTL optimizations pass: fold-mem-offsets.
To: Manolis Tsamis, gcc-patches@gcc.gnu.org
Cc: Richard Biener, Philipp Tomsich
From: Jeff Law
In-Reply-To: <20230615172817.3587006-1-manolis.tsamis@vrull.eu>

On 6/15/23 11:28, Manolis Tsamis wrote:
> This is a new RTL pass that tries to optimize memory offset calculations
> by moving them from add immediate instructions to the memory loads/stores.
> For example it can transform this:
>
>   addi t4,sp,16
>   add  t2,a6,t4
>   shl  t3,t2,1
>   ld   a2,0(t3)
>   addi a2,1
>   sd   a2,8(t2)
>
> into the following (one instruction less):
>
>   add  t2,a6,sp
>   shl  t3,t2,1
>   ld   a2,32(t3)
>   addi a2,1
>   sd   a2,24(t2)
>
> Although there are places where this is done already, this pass is more
> powerful and can handle the more difficult cases that are currently not
> optimized. Also, it runs late enough and can optimize away unnecessary
> stack pointer calculations.
>
> gcc/ChangeLog:
>
>         * Makefile.in: Add fold-mem-offsets.o.
>         * passes.def: Schedule a new pass.
>         * tree-pass.h (make_pass_fold_mem_offsets): Declare.
>         * common.opt: New options.
>         * doc/invoke.texi: Document new option.
>         * fold-mem-offsets.cc: New file.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/fold-mem-offsets-1.c: New test.
>         * gcc.target/riscv/fold-mem-offsets-2.c: New test.
>         * gcc.target/riscv/fold-mem-offsets-3.c: New test.
>
> Signed-off-by: Manolis Tsamis
> ---
>
> Changes in v2:
>         - Made the pass target-independent instead of RISC-V specific.
>         - Fixed a number of bugs.
>         - Added code to handle more ADD patterns as found
>           in other targets (x86, aarch64).
>         - Improved naming and comments.
>         - Fixed bitmap memory leak.
>
> diff --git a/gcc/fold-mem-offsets.cc b/gcc/fold-mem-offsets.cc
> new file mode 100644
> index 00000000000..8ef0f438191
> --- /dev/null
> +++ b/gcc/fold-mem-offsets.cc
> @@ -0,0 +1,630 @@
> +/* Late RTL pass to fold memory offsets.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +GNU General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#define IN_TARGET_CODE 1

Do we still need this #define?

> +/* Tracks which instructions can be reached through instructions that can
> +   propagate offsets for folding.  */
> +static bitmap_head can_fold_insns;

Is there any reason why you're using "bitmap_head" rather than just the
generic "bitmap" type?

Also note that since you've got a class you could just put these objects
into the class and make the routines that use them member functions.
It's marginally cleaner than using static variables.

> +
> +/* Helper function that performs the actions defined by PHASE for INSN.  */
> +static void
> +fold_mem_offsets_driver (rtx_insn* insn, int phase)
> +{
> +  if (phase == FM_PHASE_COMMIT_INSNS)

So FM_PHASE_COMMIT_INSNS doesn't share any code with the other phases.
Would it be better to just factor this into a distinct function?

> +    {
> +      rtx mem = get_foldable_mem (insn);
> +
> +      if (!mem)
> +        return;
> +
> +      rtx mem_addr = XEXP (mem, 0);
> +      rtx reg;
> +      HOST_WIDE_INT cur_offset;
> +
> +      if (REG_P (mem_addr))
> +        {
> +          reg = mem_addr;
> +          cur_offset = 0;
> +        }
> +      else if (GET_CODE (mem_addr) == PLUS
> +               && REG_P (XEXP (mem_addr, 0))
> +               && CONST_INT_P (XEXP (mem_addr, 1)))
> +        {
> +          reg = XEXP (mem_addr, 0);
> +          cur_offset = INTVAL (XEXP (mem_addr, 1));
> +        }
> +      else
> +        return;

So this is common to the non-commit phases.  Would it be cleaner to
factor it into its own function, then factor each of the non-commit
phases into their own function which calls this common routine?

> +  else if (phase == FM_PHASE_VALIDITY)
> +    {
> +      bitmap_head fold_insns;
> +      bitmap_initialize (&fold_insns, NULL);

Note that we have auto-bitmap types which will clean up after themselves
so that you don't have to manage allocation/deallocation.

Overall it looks really good.
I could make an argument to include it now, but I think one more cycle
would be best.  In the meantime, I've updated my tester to use the V2
version.

Thanks!

jeff