From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 5A4B03857B93 for ; Wed, 27 Sep 2023 10:41:50 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5A4B03857B93 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1C8911FB; Wed, 27 Sep 2023 03:42:28 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 2D0793F59C; Wed, 27 Sep 2023 03:41:49 -0700 (PDT) From: Richard Sandiford To: Tamar Christina Mail-Followup-To: Tamar Christina ,gcc-patches@gcc.gnu.org, nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com, richard.sandiford@arm.com Cc: gcc-patches@gcc.gnu.org, nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com Subject: Re: [PATCH]AArch64 Rewrite simd move immediate patterns to new syntax References: Date: Wed, 27 Sep 2023 11:41:47 +0100 In-Reply-To: (Tamar Christina's message of "Wed, 27 Sep 2023 01:52:52 +0100") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-24.6 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Tamar Christina writes: > Hi All, > > This rewrites the simd MOV patterns to use the new compact syntax. > No change in semantics is expected. This will be needed in follow on patches. > > This also merges the splits into the define_insn which will also be needed soon. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? > > Thanks, > Tamar > > gcc/ChangeLog: > > PR tree-optimization/109154 > * config/aarch64/aarch64-simd.md (*aarch64_simd_mov): > Rewrite to new syntax. > (*aarch64_simd_mov splits. > > --- inline copy of patch -- > diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md > index e955691f1be8830efacc237465119764ce2a4942..7b4d5a37a9795fefda785aaacc246918826ed0a2 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarch64-simd.md > @@ -143,54 +143,57 @@ (define_insn "aarch64_dup_lane_" > ) > > (define_insn "*aarch64_simd_mov" > - [(set (match_operand:VDMOV 0 "nonimmediate_operand" > - "=w, r, m, m, m, w, ?r, ?w, ?r, w, w") > - (match_operand:VDMOV 1 "general_operand" > - "m, m, Dz, w, r, w, w, r, r, Dn, Dz"))] > + [(set (match_operand:VDMOV 0 "nonimmediate_operand") > + (match_operand:VDMOV 1 "general_operand"))] > "TARGET_FLOAT > && (register_operand (operands[0], mode) > || aarch64_simd_reg_or_zero (operands[1], mode))" > - "@ > - ldr\t%d0, %1 > - ldr\t%x0, %1 > - str\txzr, %0 > - str\t%d1, %0 > - str\t%x1, %0 > - * return TARGET_SIMD ? \"mov\t%0., %1.\" : \"fmov\t%d0, %d1\"; > - * return TARGET_SIMD ? \"umov\t%0, %1.d[0]\" : \"fmov\t%x0, %d1\"; > - fmov\t%d0, %1 > - mov\t%0, %1 > - * return aarch64_output_simd_mov_immediate (operands[1], 64); > - fmov\t%d0, xzr" > - [(set_attr "type" "neon_load1_1reg, load_8, store_8, neon_store1_1reg,\ > - store_8, neon_logic, neon_to_gp, f_mcr,\ > - mov_reg, neon_move, f_mcr") > - (set_attr "arch" "*,*,*,*,*,*,*,*,*,simd,*")] > -) > - > -(define_insn "*aarch64_simd_mov" > - [(set (match_operand:VQMOV 0 "nonimmediate_operand" > - "=w, Umn, m, w, ?r, ?w, ?r, w, w") > - (match_operand:VQMOV 1 "general_operand" > - "m, Dz, w, w, w, r, r, Dn, Dz"))] > + {@ [cons: =0, 1; attrs: type, arch] > + [w , m ; neon_load1_1reg , * ] ldr\t%d0, %1 > + [r , m ; load_8 , * ] ldr\t%x0, %1 > + [m , Dz; store_8 , * ] str\txzr, %0 > + [m , w ; neon_store1_1reg, * ] str\t%d1, %0 > + [m , r ; store_8 , * ] str\t%x1, %0 > + [w , w ; neon_logic , simd] mov\t%0., %1. > + [w , w ; neon_logic , * ] fmov\t%d0, %d1 > + [?r, w ; neon_to_gp , simd] umov\t%0, %1.d[0] > + [?r, w ; neon_to_gp , * ] fmov\t%x0, %d1 > + [?w, r ; f_mcr , * ] fmov\t%d0, %1 > + [?r, r ; mov_reg , * ] mov\t%0, %1 > + [w , Dn; neon_move , simd] << aarch64_output_simd_mov_immediate (operands[1], 64); > + [w , Dz; f_mcr , * ] fmov\t%d0, xzr > + } > +) > + > +(define_insn_and_split "*aarch64_simd_mov" > + [(set (match_operand:VQMOV 0 "nonimmediate_operand") > + (match_operand:VQMOV 1 "general_operand"))] > "TARGET_FLOAT > && (register_operand (operands[0], mode) > || aarch64_simd_reg_or_zero (operands[1], mode))" > - "@ > - ldr\t%q0, %1 > - stp\txzr, xzr, %0 > - str\t%q1, %0 > - mov\t%0., %1. > - # > - # > - # > - * return aarch64_output_simd_mov_immediate (operands[1], 128); > - fmov\t%d0, xzr" > - [(set_attr "type" "neon_load1_1reg, store_16, neon_store1_1reg,\ > - neon_logic, multiple, multiple,\ > - multiple, neon_move, fmov") > - (set_attr "length" "4,4,4,4,8,8,8,4,4") > - (set_attr "arch" "*,*,*,simd,*,*,*,simd,*")] > + {@ [cons: =0, 1; attrs: type, arch, length] > + [w , m ; neon_load1_1reg , * , 4] ldr\t%q0, %1 > + [Umn, Dz; store_16 , * , 4] stp\txzr, xzr, %0 > + [m , w ; neon_store1_1reg, * , 4] str\t%q1, %0 > + [w , w ; neon_logic , simd, 4] mov\t%0., %1. > + [?r , w ; multiple , * , 8] # > + [?w , r ; multiple , * , 8] # > + [?r , r ; multiple , * , 8] # > + [w , Dn; neon_move , simd, 4] << aarch64_output_simd_mov_immediate (operands[1], 128); > + [w , Dz; fmov , * , 4] fmov\t%d0, xzr > + } > + "&& reload_completed > + && !(FP_REGNUM_P (REGNO (operands[0])) > + && FP_REGNUM_P (REGNO (operands[1])))" Won't this also trigger for the load, store, and Dn alternatives? Looks OK otherwise. Thanks, Richard > + [(const_int 0)] > + { > + if (GP_REGNUM_P (REGNO (operands[0])) > + && GP_REGNUM_P (REGNO (operands[1]))) > + aarch64_simd_emit_reg_reg_move (operands, DImode, 2); > + else > + aarch64_split_simd_move (operands[0], operands[1]); > + DONE; > + } > ) > > ;; When storing lane zero we can use the normal STR and its more permissive > @@ -276,33 +279,6 @@ (define_insn "vec_store_pair" > [(set_attr "type" "neon_stp_q")] > ) > > - > -(define_split > - [(set (match_operand:VQMOV 0 "register_operand" "") > - (match_operand:VQMOV 1 "register_operand" ""))] > - "TARGET_FLOAT > - && reload_completed > - && GP_REGNUM_P (REGNO (operands[0])) > - && GP_REGNUM_P (REGNO (operands[1]))" > - [(const_int 0)] > -{ > - aarch64_simd_emit_reg_reg_move (operands, DImode, 2); > - DONE; > -}) > - > -(define_split > - [(set (match_operand:VQMOV 0 "register_operand" "") > - (match_operand:VQMOV 1 "register_operand" ""))] > - "TARGET_FLOAT > - && reload_completed > - && ((FP_REGNUM_P (REGNO (operands[0])) && GP_REGNUM_P (REGNO (operands[1]))) > - || (GP_REGNUM_P (REGNO (operands[0])) && FP_REGNUM_P (REGNO (operands[1]))))" > - [(const_int 0)] > -{ > - aarch64_split_simd_move (operands[0], operands[1]); > - DONE; > -}) > - > (define_expand "@aarch64_split_simd_mov" > [(set (match_operand:VQMOV 0) > (match_operand:VQMOV 1))]