From: "H.J. Lu"
Date: Mon, 10 May 2021 06:29:03 -0700
Subject: Re: [PATCH 02/12] Allow generating pseudo register with specific alignment
To: Richard Biener via Gcc-patches, "H.J. Lu", Richard Biener, Richard Sandiford
References: <20210429125415.1634118-1-hjl.tools@gmail.com> <20210429125415.1634118-3-hjl.tools@gmail.com>
List-Id: Gcc-patches mailing list

On Mon, May 10, 2021 at 2:39 AM Richard Sandiford wrote:
>
> Richard Biener via Gcc-patches writes:
> > On Fri, Apr 30, 2021 at 8:30 PM Richard Sandiford via Gcc-patches wrote:
> >>
> >> "H.J. Lu via Gcc-patches" writes:
> >> > On Fri, Apr 30, 2021 at 5:49 AM H.J. Lu wrote:
> >> >>
> >> >> On Fri, Apr 30, 2021 at 5:42 AM Richard Sandiford wrote:
> >> >> >
> >> >> > "H.J. Lu via Gcc-patches" writes:
> >> >> > > On Fri, Apr 30, 2021 at 2:06 AM Richard Sandiford wrote:
> >> >> > >>
> >> >> > >> "H.J. Lu via Gcc-patches" writes:
> >> >> > >> > gen_reg_rtx tracks stack alignment needed for pseudo registers so that
> >> >> > >> > associated hard registers can be properly spilled onto stack. But there
> >> >> > >> > are cases where associated hard registers will never be spilled onto
> >> >> > >> > stack. gen_reg_rtx is changed to take an argument for register alignment
> >> >> > >> > so that stack realignment can be avoided when not needed.
> >> >> > >>
> >> >> > >> How is it guaranteed that they will never be spilled though?
> >> >> > >> I don't think that that guarantee exists for any kind of pseudo,
> >> >> > >> except perhaps for the temporary pseudos that the RA creates to
> >> >> > >> replace (match_scratch …)es.
> >> >> > >>
> >> >> > >
> >> >> > > The caller of creating pseudo registers with specific alignment must
> >> >> > > guarantee that they will never be spilled. I am only using it in
> >> >> > >
> >> >> > >   /* Make operand1 a register if it isn't already. */
> >> >> > >   if (can_create_pseudo_p ()
> >> >> > >       && !register_operand (op0, mode)
> >> >> > >       && !register_operand (op1, mode))
> >> >> > >     {
> >> >> > >       /* NB: Don't increase stack alignment requirement when forcing
> >> >> > >          operand1 into a pseudo register to copy data from one memory
> >> >> > >          location to another since it doesn't require a spill. */
> >> >> > >       emit_move_insn (op0,
> >> >> > >                       force_reg (GET_MODE (op0), op1,
> >> >> > >                                  (UNITS_PER_WORD * BITS_PER_UNIT)));
> >> >> > >       return;
> >> >> > >     }
> >> >> > >
> >> >> > > for vector moves. RA shouldn't spill it.
> >> >> >
> >> >> > But this is the point: it's a case of hoping that the RA won't spill it,
> >> >> > rather than having a guarantee that it won't.
> >> >> >
> >> >> > Even if the moves start out adjacent, they could be separated by later
> >> >> > RTL optimisations, particularly scheduling. (I realise pre-RA scheduling
> >> >> > isn't enabled by default for x86, but it can still be enabled explicitly.)
> >> >> > Or if the same data is being copied to two locations, we might reuse
> >> >> > values loaded by the first copy for the second copy as well.
> >> >
> >> > There are cases where pseudo vector registers are created as pure
> >> > temporary registers in the backend and they shouldn't ever be spilled
> >> > to stack. They will be spilled to stack only if there are other non-temporary
> >> > vector register usage in which case stack will be properly re-aligned.
> >> > Caller of creating pseudo registers with specific alignment guarantees
> >> > that they are used only as pure temporary registers.
> >>
> >> I don't think there's really a distinct category of pure temporary
> >> registers though.
> >> The things I mentioned above can happen for any
> >> kind of pseudo register.
> >
> > I wonder if for the cases HJ thinks of it is appropriate to use hardregs?
> > Do we generally handle those well? That is, are they again subject
> > to be allocated by RA when no longer live?
>
> Yeah, using hard registers should work. Of course, any given fixed choice
> of hard register has the potential to be suboptimal in some situation,
> but it should be safe.

I tried hard registers. The generated code isn't as good as with pseudo
registers. But I want to avoid aligning the stack when YMM registers are
only used to inline memcpy/memset. Any suggestions?

Thanks.

--
H.J.