From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pj1-x1031.google.com (mail-pj1-x1031.google.com [IPv6:2607:f8b0:4864:20::1031]) by sourceware.org (Postfix) with ESMTPS id CF9833858D33 for ; Tue, 3 Oct 2023 11:50:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org CF9833858D33 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-pj1-x1031.google.com with SMTP id 98e67ed59e1d1-27758be8a07so617794a91.0 for ; Tue, 03 Oct 2023 04:50:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; t=1696333834; x=1696938634; darn=gcc.gnu.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=+88Kg0E9sBjET8UpM8oLL5fR/MHaUGaguQFbTXkh8XY=; b=DwhmG7TAlOeRgqrZeHtoRwW/XhsYzXxAp5+wOOAfsU43fqpF+LjfeVK9YL5z65UoaU ItATD5nAQuXS/362pyqN3bh0OPnLO5ww2fcZUsi5Ycowq0QLsFxqe+6DtLigNHTKx/h9 zSyxu2nUmD21FJyhKmAQCnB+EvvWvb36jRqu2F2fUqaeJE6Piq0zIlggidLbmnCVp2tx A0LzNrCpOSXIpJ67PAMz6PszFNagR87u7+69lTAetJsJn22CR7xr2sC7II6Yi9Tgctt+ LqRFj5Em6BnE/zCIkDaiI74EQmdgihZlycLNhzsgZA6RrtslimBQNAPgD8a/RsG51z7H WFMQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1696333834; x=1696938634; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+88Kg0E9sBjET8UpM8oLL5fR/MHaUGaguQFbTXkh8XY=; b=IneXXMYXUaYPUw6AKJKz8WYiGhNdXUVOmmAESoA12/kN0nFdgoXTSZDsmnC1i4qscK 8mLj+DLtxX+AdpWf0PHjskmKn+VLM3bE3SH98ctUSnjFArZYegqtlSFABzjdOffr7T6O rcNKyMzjScSP/fFXcnuUMeCWUJ8X2xftMClToKZnwjWU28nrjPZijOegwVcx4tp+yZSs na8N4WJ0PJvuOcwzWHHPDerLO0ZVBg+aK/jdwKjQ4DV08ebT9uc5NcUw/KG9XMAIFkEm HaYmdTRX7j/RXUAgFeTENoj++PSDaIMa3gGXDe+672yNVEGzI20znOBk74uDZEEOfYv3 CRpw== X-Gm-Message-State: AOJu0YyAas8vsl4xCOtQ1aFz6JNlSuHF/o2aQ54Qm2xmwyRJqmHNMsOD 3nHA5ApNQ6EcSHujVm/PEmyADnVQcqhHMXNDIP9AXQ== X-Google-Smtp-Source: AGHT+IGRqzbpslchg+QwQTsR/oy2+ReEFQMDr1f8iLf2cYjU3lxesyenluB66ACt90BnGVUhoHGum1HDUr/jHIYCQU4= X-Received: by 2002:a17:90b:1497:b0:277:4b68:b93c with SMTP id js23-20020a17090b149700b002774b68b93cmr13296252pjb.4.1696333833625; Tue, 03 Oct 2023 04:50:33 -0700 (PDT) MIME-Version: 1.0 References: <20230909084652.2655745-1-manolis.tsamis@vrull.eu> In-Reply-To: From: Manolis Tsamis Date: Tue, 3 Oct 2023 14:49:57 +0300 Message-ID: Subject: Re: [PATCH v5] Implement new RTL optimizations pass: fold-mem-offsets. To: Jeff Law Cc: gcc-patches@gcc.gnu.org, Philipp Tomsich , Vineet Gupta , Richard Biener Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,JMQ_SPF_NEUTRAL,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, Sep 29, 2023 at 10:22=E2=80=AFPM Jeff Law w= rote: > > > > On 9/12/23 04:13, Manolis Tsamis wrote: > > >>> + > >>> +/* Get the single reaching definition of an instruction inside a BB. > >>> + The definition is desired for REG used in INSN. > >>> + Return the definition insn or NULL if there's no definition with > >>> + the desired criteria. */ > >>> +static rtx_insn* > >>> +get_single_def_in_bb (rtx_insn *insn, rtx reg) > >>> +{ > >>> + df_ref use; > >>> + struct df_link *ref_chain, *ref_link; > >>> + > >>> + FOR_EACH_INSN_USE (use, insn) > >>> + { > >>> + if (GET_CODE (DF_REF_REG (use)) =3D=3D SUBREG) > >>> + return NULL; > >>> + if (REGNO (DF_REF_REG (use)) =3D=3D REGNO (reg)) > >>> + break; > >>> + } > >>> + > >>> + if (!use) > >>> + return NULL; > >>> + > >>> + ref_chain =3D DF_REF_CHAIN (use); > >> So what if there's two uses of REG in INSN? I don't think it's be > >> common at all, but probably better safe and reject than sorry, right? = Or > >> is that case filtered out earlier? > >> > > If the REG is the same won't the definitions be the same even if that > > REG appears multiple times in INSN? > Yes. Good, so no issues here. > > > fold_offsets_1 should be able to handle the folding with multiple uses > > of REG just fine, for example add R1, R1 or add (ashift R1, 1), R1. > > If there's no other issue here I assume we want to keep that as-is in > > order to not reduce the propagation power (Which I assume is similar > > to ree which uses the same logic). > OK. I was primarily concerned about the folding and rewriting aspects. > It probably can only show up on targets with LEA like instructions, and > even on such targets it's probably rate. > OK. > > > > >>> +/* Test if INSN is a memory load / store that can have an offset fol= ded to it. > >>> + Return true iff INSN is such an instruction and return through ME= M_OUT, > >>> + REG_OUT and OFFSET_OUT the RTX that has a MEM code, the register = that is > >>> + used as a base address and the offset accordingly. > >>> + All of the out pointers may be NULL in which case they will be ig= nored. */ > >>> +bool > >>> +get_fold_mem_root (rtx_insn* insn, rtx *mem_out, rtx *reg_out, > >>> + HOST_WIDE_INT *offset_out) > >>> +{ > >>> + rtx set =3D single_set (insn); > >>> + rtx mem =3D NULL_RTX; > >>> + > >>> + if (set !=3D NULL_RTX) > >>> + { > >>> + rtx src =3D SET_SRC (set); > >>> + rtx dest =3D SET_DEST (set); > >>> + > >>> + /* Don't fold when we have unspec / volatile. */ > >>> + if (GET_CODE (src) =3D=3D UNSPEC > >>> + || GET_CODE (src) =3D=3D UNSPEC_VOLATILE > >>> + || GET_CODE (dest) =3D=3D UNSPEC > >>> + || GET_CODE (dest) =3D=3D UNSPEC_VOLATILE) > >>> + return false; > >>> + > >>> + if (MEM_P (src)) > >>> + mem =3D src; > >>> + else if (MEM_P (dest)) > >>> + mem =3D dest; > >>> + else if ((GET_CODE (src) =3D=3D SIGN_EXTEND > >>> + || GET_CODE (src) =3D=3D ZERO_EXTEND) > >>> + && MEM_P (XEXP (src, 0))) > >> Note some architectures allow both a source and destination memory. I= t > >> looks like your code will prefer the source operand in that case. > >> That's fine, just pointing it out. > >> > > Thanks for pointing that possibility out. I thought for a moment that > > this would be a bug with multiple mentions of the address register. > > but it should be fine due to: > > /* Special case: A foldable memory store is not foldable if it > > mentions DEST outside of the address calculation. */ > ACK. > > The other thing I keep pondering is autoincrement style addressing. > Though I think at some point I convinced myself they weren't a problem. > I think your checks only allow specific kinds of expressions for the > memory address and I don't think {PRE,POST}_{INC,DEC.MODIFY} were in the > list of valid ops. > Yes, although I haven't considered pre/post-inc/dec, they're indeed not allowed due to what is allowed to be a root memory operation (get_fold_mem_root). I believe these shouldn't be an issue as anything that transitively affects even one of these will be rejected. > > >>> + > >>> + int max_iters =3D 5; > >>> + for (int i =3D 0; i < max_iters; i++) > >>> + { > >>> + bool made_changes =3D false; > >>> + for (fold_info_map::iterator iter =3D fold_info->begin (); > >>> + iter !=3D fold_info->end (); ++iter) > >>> + { > >>> + fold_mem_info *info =3D (*iter).second; > >>> + if (bitmap_intersect_p (&cannot_fold_insns, info->fold_insns)= ) > >>> + made_changes |=3D bitmap_ior_into (&cannot_fold_insns, > >>> + info->fold_insns); > >>> + } > >>> + > >>> + if (!made_changes) > >>> + return true; > >>> + } > >>> + > >>> + return false; > >> So how was the magic value of "5" determined here? In general we try > >> not to have magic #s like that and instead find a better way to contro= l > >> iterations, falling back to a PARAM when all else fails. > >> > > It's indeed just a magic value :) > > In general even two iterations of this should be quite rare. One > > iteration will handle the majority of interesting cases. I chose 5 to > > limit the worst case execution time for degenerate cases. > > > > I'll be happy to change that to something better but I couldn't > > figure out how "falling back to a PARAM when all else fails" should > > work here. Is there a similar example somewhere else in the code that > > I can look at? > If we expect two iterations to be quite rare, then I doubt a param is > really worth the effort. I might suggest using a loop like > > for (pass =3D 0; pass < flag_expensive_optimizations + 1; pass++) > > No magic numbers :-) We use similar loops elsewhere for this kind of > scenario. > Oh I see, thanks for noting that. I initially added a param, but this is much better as the param wasn't useful in a meaningful way. Done. > > If you ever need to create a param, the procedure is to add the option > to params.opt. If you look in that file you'll see that you define a > variable name in the specification that you can then reference. > > So for a simple example look for param_avg_loop_niter. > So, V6 with the formatting fixes and the iteration count change is here: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631839.html Thanks for all the feedback! Manolis > Jeff