From: "H.J. Lu"
Date: Mon, 10 May 2021 07:11:35 -0700
Subject: Re: [PATCH 02/12] Allow generating pseudo register with specific alignment
To: Richard Biener
Cc: Richard Biener via Gcc-patches, Richard Sandiford
References: <20210429125415.1634118-1-hjl.tools@gmail.com> <20210429125415.1634118-3-hjl.tools@gmail.com>
List-Id: Gcc-patches mailing list

On Mon, May 10, 2021 at 6:59 AM Richard Biener wrote:
>
> On Mon, May 10, 2021 at 3:29 PM H.J. Lu wrote:
> >
> > On Mon, May 10, 2021 at 2:39 AM Richard Sandiford
> > wrote:
> > >
> > > Richard Biener via Gcc-patches writes:
> > > > On Fri, Apr 30, 2021 at 8:30 PM Richard Sandiford via Gcc-patches
> > > > wrote:
> > > >>
> > > >> "H.J. Lu via Gcc-patches" writes:
> > > >> > On Fri, Apr 30, 2021 at 5:49 AM H.J. Lu wrote:
> > > >> >>
> > > >> >> On Fri, Apr 30, 2021 at 5:42 AM Richard Sandiford
> > > >> >> wrote:
> > > >> >> >
> > > >> >> > "H.J. Lu via Gcc-patches" writes:
> > > >> >> > > On Fri, Apr 30, 2021 at 2:06 AM Richard Sandiford
> > > >> >> > > wrote:
> > > >> >> > >>
> > > >> >> > >> "H.J. Lu via Gcc-patches" writes:
> > > >> >> > >> > gen_reg_rtx tracks stack alignment needed for pseudo registers so that
> > > >> >> > >> > associated hard registers can be properly spilled onto stack. But there
> > > >> >> > >> > are cases where associated hard registers will never be spilled onto
> > > >> >> > >> > stack.
> > > >> >> > >> > gen_reg_rtx is changed to take an argument for register alignment
> > > >> >> > >> > so that stack realignment can be avoided when not needed.
> > > >> >> > >>
> > > >> >> > >> How is it guaranteed that they will never be spilled though?
> > > >> >> > >> I don't think that that guarantee exists for any kind of pseudo,
> > > >> >> > >> except perhaps for the temporary pseudos that the RA creates to
> > > >> >> > >> replace (match_scratch …)es.
> > > >> >> > >>
> > > >> >> > >
> > > >> >> > > The caller of creating pseudo registers with specific alignment must
> > > >> >> > > guarantee that they will never be spilled. I am only using it in
> > > >> >> > >
> > > >> >> > >   /* Make operand1 a register if it isn't already. */
> > > >> >> > >   if (can_create_pseudo_p ()
> > > >> >> > >       && !register_operand (op0, mode)
> > > >> >> > >       && !register_operand (op1, mode))
> > > >> >> > >     {
> > > >> >> > >       /* NB: Don't increase stack alignment requirement when forcing
> > > >> >> > >          operand1 into a pseudo register to copy data from one memory
> > > >> >> > >          location to another since it doesn't require a spill. */
> > > >> >> > >       emit_move_insn (op0,
> > > >> >> > >                       force_reg (GET_MODE (op0), op1,
> > > >> >> > >                                  (UNITS_PER_WORD * BITS_PER_UNIT)));
> > > >> >> > >       return;
> > > >> >> > >     }
> > > >> >> > >
> > > >> >> > > for vector moves. RA shouldn't spill it.
> > > >> >> >
> > > >> >> > But this is the point: it's a case of hoping that the RA won't spill it,
> > > >> >> > rather than having a guarantee that it won't.
> > > >> >> >
> > > >> >> > Even if the moves start out adjacent, they could be separated by later
> > > >> >> > RTL optimisations, particularly scheduling. (I realise pre-RA scheduling
> > > >> >> > isn't enabled by default for x86, but it can still be enabled explicitly.)
> > > >> >> > Or if the same data is being copied to two locations, we might reuse
> > > >> >> > values loaded by the first copy for the second copy as well.
> > > >> >
> > > >> > There are cases where pseudo vector registers are created as pure
> > > >> > temporary registers in the backend and they shouldn't ever be spilled
> > > >> > to stack. They will be spilled to stack only if there are other non-temporary
> > > >> > vector register usage in which case stack will be properly re-aligned.
> > > >> > Caller of creating pseudo registers with specific alignment guarantees
> > > >> > that they are used only as pure temporary registers.
> > > >>
> > > >> I don't think there's really a distinct category of pure temporary
> > > >> registers though. The things I mentioned above can happen for any
> > > >> kind of pseudo register.
> > > >
> > > > I wonder if for the cases HJ thinks of it is appropriate to use hardregs?
> > > > Do we generally handle those well? That is, are they again subject
> > > > to be allocated by RA when no longer live?
> > >
> > > Yeah, using hard registers should work. Of course, any given fixed choice
> > > of hard register has the potential to be suboptimal in some situation,
> > > but it should be safe.
> >
> > I tried hard registers. The generated code isn't as good as pseudo registers.
> > But I want to avoid aligning the stack when YMM registers are only used to
> > inline memcpy/memset. Any suggestions?
>
> I wonder if we can mark pseudos with a new reg flag, like 'nospill' and
> enforce this in LRA or ICE if we can't? That said, we should be able
> to verify our assumption holds. Now, we then of course need to avoid
> CSE re-using such pseudo in ways that could lead to spilling
> (not sure how that could happen, but ...).

Spills should be rare. It is up to the backend to decide whether an
unaligned spill should be used when a spill does happen.

> Did you investigate closer what made the hardreg case generate worse
> code?
> Can we hide the copies behind UNSPECs and split them late

I chose XMM7 for memcpy/memset. Only XMM7 is used for memcpy, vs.
XMM0/XMM1/... with pseudo registers.

> after reload? Or is that too awkward to support when generating the
> sequence from the middle-end (I suppose it's not going via the optabs?)

That is correct.

> Richard.
>
> > Thanks.
> >
> > --
> > H.J.

-- 
H.J.
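
For readers following the thread, the two ideas discussed above (an
explicit alignment argument when creating a pseudo, and a 'nospill'
marker that the allocator would have to honour) can be sketched as a toy
model in C. This is not GCC code and not the patch itself: toy_gen_reg,
toy_can_spill, the struct layouts, the 64/256-bit constants and the
meaning of ALIGN == 0 are all assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in GCC.  */
enum { TOY_WORD_BITS = 64, TOY_YMM_BITS = 256 };

struct toy_function
{
  unsigned stack_alignment_needed;  /* In bits.  */
};

struct toy_pseudo
{
  unsigned align;  /* Alignment a spill slot would get, in bits.  */
  bool nospill;    /* Hypothetical flag: this pseudo must not be spilled.  */
};

/* Create a pseudo for a mode whose natural alignment is MODE_ALIGN bits.
   ALIGN == 0 means "use the mode alignment" (an assumption of this
   sketch).  A nonzero ALIGN below MODE_ALIGN keeps the function's stack
   alignment low, so the pseudo is marked nospill: an aligned spill slot
   for it could not be provided.  */
static struct toy_pseudo
toy_gen_reg (struct toy_function *fn, unsigned mode_align, unsigned align)
{
  struct toy_pseudo reg;

  reg.align = align ? align : mode_align;
  reg.nospill = align != 0 && align < mode_align;

  /* Alignments are nonzero powers of two.  */
  assert (reg.align != 0 && (reg.align & (reg.align - 1)) == 0);

  /* Track the largest alignment any pseudo may need on the stack.  */
  if (fn->stack_alignment_needed < reg.align)
    fn->stack_alignment_needed = reg.align;
  return reg;
}

/* Stand-in for the allocator deciding to spill REG.  Returns false in
   the case where a real compiler would have to ICE or fall back to an
   unaligned spill.  */
static bool
toy_can_spill (const struct toy_pseudo *reg)
{
  return !reg->nospill;
}
```

In this model, a 256-bit temporary created with a word-sized alignment
leaves stack_alignment_needed at 64 bits but can no longer be spilled
safely, which is the gap Richard Sandiford points out: something has to
either guarantee that no spill occurs or handle it when one does.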