From: "Roger Sayle"
To: "'GCC Patches'"
Subject: [PATCH v2] x86_64: Some SUBREG related optimization tweaks to i386 backend.
Date: Wed, 13 Oct 2021 09:23:14 +0100
Message-ID: <001001d7c00b$92174210$b645c630$@nextmovesoftware.com>
List-Id: Gcc-patches mailing list

Good catch.  I agree with Hongtao that although my testing revealed no
problems with the previous version of this patch, it makes sense to call
gen_reg_rtx to generate a pseudo intermediate instead of attempting to
reuse the existing logic that uses ix86_gen_scratch_sse_rtx as an
intermediate.  I've left the existing behaviour the same, so that
memory-to-memory moves (continue to) use ix86_gen_scratch_sse_rtx.

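To make the intended control flow concrete, here is a toy C model of the decision described above. It is only a sketch and not GCC code: the enums and the toy_* predicates are hypothetical stand-ins for the rtx-based checks (ix86_hardreg_mov_ok, register_operand), and the rule used for toy_hardreg_mov_ok is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand kinds; the real expander inspects rtx values.  */
enum operand_kind { OP_PSEUDO_REG, OP_HARD_REG, OP_SUBREG, OP_MEM };

/* Which intermediate, if any, the move would go through.  */
enum move_strategy { MOVE_DIRECT, MOVE_VIA_PSEUDO, MOVE_VIA_SCRATCH_SSE };

/* Simplified stand-in for ix86_hardreg_mov_ok (assumption): only a
   SUBREG source moved into a hard register is rejected.  */
bool
toy_hardreg_mov_ok (enum operand_kind dst, enum operand_kind src)
{
  return !(dst == OP_HARD_REG && src == OP_SUBREG);
}

/* Simplified stand-in for register_operand (assumption): everything
   except memory counts as a register operand.  */
bool
toy_register_operand (enum operand_kind op)
{
  return op != OP_MEM;
}

enum move_strategy
choose_move_strategy (bool can_create_pseudo,
                      enum operand_kind dst, enum operand_kind src)
{
  /* New in v2: go through a fresh pseudo (gen_reg_rtx in the patch).  */
  if (can_create_pseudo && !toy_hardreg_mov_ok (dst, src))
    return MOVE_VIA_PSEUDO;

  /* Unchanged: neither operand is a register, so the existing
     scratch SSE register path is used.  */
  if (can_create_pseudo
      && !toy_register_operand (dst)
      && !toy_register_operand (src))
    return MOVE_VIA_SCRATCH_SSE;

  return MOVE_DIRECT;
}

/* Usage: the cases discussed in this message.  */
void
check_move_strategy_examples (void)
{
  assert (choose_move_strategy (true, OP_HARD_REG, OP_SUBREG) == MOVE_VIA_PSEUDO);
  assert (choose_move_strategy (true, OP_MEM, OP_MEM) == MOVE_VIA_SCRATCH_SSE);
  assert (choose_move_strategy (true, OP_PSEUDO_REG, OP_MEM) == MOVE_DIRECT);
  assert (choose_move_strategy (false, OP_HARD_REG, OP_SUBREG) == MOVE_DIRECT);
}
```

The point of the ordering is that the new check runs first and returns early, so the older scratch-register path only ever sees the cases it handled before.
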
This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
and "make -k check" with no new failures.  Ok for mainline?


2021-10-13  Roger Sayle

gcc/ChangeLog
	* config/i386/i386-expand.c (ix86_expand_vector_move): Use a
	pseudo intermediate when moving a SUBREG into a hard register,
	by checking ix86_hardreg_mov_ok.
	(ix86_expand_vector_extract): Store zero-extended SImode
	intermediate in a pseudo, then set target using a
	SUBREG_PROMOTED annotated subreg.
	* config/i386/sse.md (mov_internal): Prevent CSE creating
	complex (SUBREG) sets of (vector) hard registers before reload, by
	checking ix86_hardreg_mov_ok.

Thanks,
Roger

-----Original Message-----
From: Hongtao Liu
Sent: 11 October 2021 12:29
To: Roger Sayle
Cc: GCC Patches
Subject: Re: [PATCH] x86_64: Some SUBREG related optimization tweaks to i386 backend.

On Mon, Oct 11, 2021 at 4:55 PM Roger Sayle wrote:
> gcc/ChangeLog
>         * config/i386/i386-expand.c (ix86_expand_vector_move): Use a
>         pseudo intermediate when moving a SUBREG into a hard register,
>         by checking ix86_hardreg_mov_ok.

   /* Make operand1 a register if it isn't already.  */
   if (can_create_pseudo_p ()
-      && !register_operand (op0, mode)
-      && !register_operand (op1, mode))
+      && (!ix86_hardreg_mov_ok (op0, op1)
+          || (!register_operand (op0, mode)
+              && !register_operand (op1, mode))))
     {
       rtx tmp = ix86_gen_scratch_sse_rtx (GET_MODE (op0));

ix86_gen_scratch_sse_rtx probably returns a hard register, but here you
want a pseudo register.

--
BR,
Hongtao

Attachment: patchv2b.txt

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 3e6f7d8e..4a8fa2f 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -615,6 +615,16 @@ ix86_expand_vector_move (machine_mode mode, rtx operands[])
       return;
     }
 
+  /* If operand0 is a hard register, make operand1 a pseudo.  */
+  if (can_create_pseudo_p ()
+      && !ix86_hardreg_mov_ok (op0, op1))
+    {
+      rtx tmp = gen_reg_rtx (GET_MODE (op0));
+      emit_move_insn (tmp, op1);
+      emit_move_insn (op0, tmp);
+      return;
+    }
+
   /* Make operand1 a register if it isn't already.  */
   if (can_create_pseudo_p ()
       && !register_operand (op0, mode)
@@ -16005,11 +16015,15 @@ ix86_expand_vector_extract (bool mmx_ok, rtx target, rtx vec, int elt)
       /* Let the rtl optimizers know about the zero extension performed.  */
       if (inner_mode == QImode || inner_mode == HImode)
         {
+          rtx reg = gen_reg_rtx (SImode);
           tmp = gen_rtx_ZERO_EXTEND (SImode, tmp);
-          target = gen_lowpart (SImode, target);
+          emit_move_insn (reg, tmp);
+          tmp = gen_lowpart (inner_mode, reg);
+          SUBREG_PROMOTED_VAR_P (tmp) = 1;
+          SUBREG_PROMOTED_SET (tmp, 1);
         }
 
-      emit_insn (gen_rtx_SET (target, tmp));
+      emit_move_insn (target, tmp);
     }
   else
     {
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 4559b0c..e43f597 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1270,7 +1270,8 @@
          " C,,vm,v"))]
   "TARGET_SSE
    && (register_operand (operands[0], mode)
-       || register_operand (operands[1], mode))"
+       || register_operand (operands[1], mode))
+   && ix86_hardreg_mov_ok (operands[0], operands[1])"
 {
   switch (get_attr_type (insn))
     {
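
As a source-level illustration of the ix86_expand_vector_extract change, the following sketch shows the kind of C code where a narrow (8-bit or 16-bit) vector element is extracted and then widened to 32 bits. It uses the GCC/Clang vector_size extension; the type and function names are made up for this example, and whether a given compiler version actually routes these functions through ix86_expand_vector_extract is an assumption, not something the patch states.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit vectors of unsigned 16-bit and 8-bit lanes
   (GCC/Clang vector extension; names are illustrative only).  */
typedef uint16_t v8u16 __attribute__ ((vector_size (16)));
typedef uint8_t v16u8 __attribute__ ((vector_size (16)));

/* Extract a 16-bit lane and widen it: at the C level this is a
   zero extension from uint16_t to uint32_t.  */
uint32_t
extract_u16_lane3 (v8u16 v)
{
  return v[3];
}

/* Same idea for an 8-bit lane.  */
uint32_t
extract_u8_lane5 (v16u8 v)
{
  return v[5];
}

/* Usage: high bits set in the narrow lane must not leak into the
   upper bits of the 32-bit result.  */
void
check_extract_examples (void)
{
  v8u16 h = { 1, 2, 3, 0xFFFF, 5, 6, 7, 8 };
  v16u8 b = { 0, 1, 2, 3, 4, 0x80, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };

  assert (extract_u16_lane3 (h) == 0xFFFFu);
  assert (extract_u8_lane5 (b) == 0x80u);
}
```

Compiling something like this with "gcc -O2 -S" before and after the patch is one way to look for a redundant zero extension following the extract instruction, which is the sort of thing the SUBREG_PROMOTED annotation is intended to let the RTL optimizers remove.
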