From: Richard Biener
Date: Mon, 10 May 2021 15:59:24 +0200
Subject: Re: [PATCH 02/12] Allow generating pseudo register with specific alignment
To: "H.J. Lu"
Cc: Richard Biener via Gcc-patches, Richard Sandiford
References: <20210429125415.1634118-1-hjl.tools@gmail.com> <20210429125415.1634118-3-hjl.tools@gmail.com>

On Mon, May 10, 2021 at 3:29 PM H.J. Lu wrote:
>
> On Mon, May 10, 2021 at 2:39 AM Richard Sandiford wrote:
> >
> > Richard Biener via Gcc-patches writes:
> > > On Fri, Apr 30, 2021 at 8:30 PM Richard Sandiford via Gcc-patches wrote:
> > >>
> > >> "H.J. Lu via Gcc-patches" writes:
> > >> > On Fri, Apr 30, 2021 at 5:49 AM H.J. Lu wrote:
> > >> >>
> > >> >> On Fri, Apr 30, 2021 at 5:42 AM Richard Sandiford wrote:
> > >> >> >
> > >> >> > "H.J. Lu via Gcc-patches" writes:
> > >> >> > > On Fri, Apr 30, 2021 at 2:06 AM Richard Sandiford wrote:
> > >> >> > >>
> > >> >> > >> "H.J. Lu via Gcc-patches" writes:
> > >> >> > >> > gen_reg_rtx tracks stack alignment needed for pseudo registers so that
> > >> >> > >> > associated hard registers can be properly spilled onto stack. But there
> > >> >> > >> > are cases where associated hard registers will never be spilled onto
> > >> >> > >> > stack. gen_reg_rtx is changed to take an argument for register alignment
> > >> >> > >> > so that stack realignment can be avoided when not needed.
> > >> >> > >>
> > >> >> > >> How is it guaranteed that they will never be spilled though?
> > >> >> > >> I don't think that that guarantee exists for any kind of pseudo,
> > >> >> > >> except perhaps for the temporary pseudos that the RA creates to
> > >> >> > >> replace (match_scratch …)es.
> > >> >> > >>
> > >> >> > >
> > >> >> > > The caller of creating pseudo registers with specific alignment must
> > >> >> > > guarantee that they will never be spilled. I am only using it in
> > >> >> > >
> > >> >> > >   /* Make operand1 a register if it isn't already. */
> > >> >> > >   if (can_create_pseudo_p ()
> > >> >> > >       && !register_operand (op0, mode)
> > >> >> > >       && !register_operand (op1, mode))
> > >> >> > >     {
> > >> >> > >       /* NB: Don't increase stack alignment requirement when forcing
> > >> >> > >          operand1 into a pseudo register to copy data from one memory
> > >> >> > >          location to another since it doesn't require a spill. */
> > >> >> > >       emit_move_insn (op0,
> > >> >> > >                       force_reg (GET_MODE (op0), op1,
> > >> >> > >                                  (UNITS_PER_WORD * BITS_PER_UNIT)));
> > >> >> > >       return;
> > >> >> > >     }
> > >> >> > >
> > >> >> > > for vector moves. RA shouldn't spill it.
> > >> >> >
> > >> >> > But this is the point: it's a case of hoping that the RA won't spill it,
> > >> >> > rather than having a guarantee that it won't.
> > >> >> >
> > >> >> > Even if the moves start out adjacent, they could be separated by later
> > >> >> > RTL optimisations, particularly scheduling. (I realise pre-RA scheduling
> > >> >> > isn't enabled by default for x86, but it can still be enabled explicitly.)
> > >> >> > Or if the same data is being copied to two locations, we might reuse
> > >> >> > values loaded by the first copy for the second copy as well.
> > >> >
> > >> > There are cases where pseudo vector registers are created as pure
> > >> > temporary registers in the backend and they shouldn't ever be spilled
> > >> > to stack.
> > >> > They will be spilled to stack only if there are other non-temporary
> > >> > vector register usage in which case stack will be properly re-aligned.
> > >> > Caller of creating pseudo registers with specific alignment guarantees
> > >> > that they are used only as pure temporary registers.
> > >>
> > >> I don't think there's really a distinct category of pure temporary
> > >> registers though. The things I mentioned above can happen for any
> > >> kind of pseudo register.
> > >
> > > I wonder if for the cases HJ thinks of it is appropriate to use hardregs?
> > > Do we generally handle those well? That is, are they again subject
> > > to be allocated by RA when no longer live?
> >
> > Yeah, using hard registers should work. Of course, any given fixed choice
> > of hard register has the potential to be suboptimal in some situation,
> > but it should be safe.
>
> I tried hard registers. The generated code isn't as good as pseudo registers.
> But I want to avoid aligning the stack when YMM registers are only used to
> inline memcpy/memset. Any suggestions?

I wonder if we can mark pseudos with a new reg flag, like 'nospill', and
enforce this in LRA, or ICE if we can't? That said, we should be able to
verify that our assumption holds. We would then of course need to keep
CSE from re-using such a pseudo in ways that could lead to spilling
(not sure how that could happen, but ...).

Did you investigate more closely what made the hardreg case generate
worse code? Can we hide the copies behind UNSPECs and split
them late, after reload? Or is that too awkward to support
when generating the sequence from the middle-end (I suppose
it's not going via the optabs)?

Richard.

> Thanks.
>
> --
> H.J.
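
For illustration only, here is a toy model of the 'nospill' idea floated above. It is a minimal sketch, not GCC code: the struct, the flag and both functions are hypothetical names invented for this example, and nothing here reflects how LRA actually tracks pseudos. It assumes that a pseudo marked nospill contributes nothing to the required stack alignment, and that the allocator can ask whether spilling a given pseudo would break that contract (the point where a real compiler would ICE).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a pseudo register; not a GCC data structure.  */
struct toy_pseudo
{
  unsigned regno;      /* pseudo register number */
  unsigned align_bits; /* alignment a spill slot for it would need */
  bool nospill;        /* hypothetical flag: creator promises no spill */
};

/* Return the stack alignment (in bits) needed to spill any of the N
   pseudos in REGS.  Pseudos marked nospill are skipped, since by
   assumption they never get a stack slot.  */
unsigned
toy_required_stack_align (const struct toy_pseudo *regs, size_t n,
                          unsigned base_align_bits)
{
  unsigned align = base_align_bits;
  for (size_t i = 0; i < n; i++)
    if (!regs[i].nospill && regs[i].align_bits > align)
      align = regs[i].align_bits;
  return align;
}

/* Return true if the allocator may spill REG.  A false result marks the
   point where the promise is broken and a real compiler would have to
   ICE rather than emit a misaligned spill.  */
bool
toy_can_spill (const struct toy_pseudo *reg)
{
  return !reg->nospill;
}
```

With a 256-bit nospill pseudo and a 64-bit ordinary pseudo, toy_required_stack_align stays at the 64-bit base, while toy_can_spill rejects the first pseudo. The check turns "hoping the RA won't spill it" into something that can be verified, which is the property the flag is meant to buy.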