From: Richard Sandiford
To: Tamar Christina
Cc: gcc-patches@gcc.gnu.org, nd, Richard Earnshaw, Marcus Shawcroft, Kyrylo Tkachov
Subject: Re: [PATCH 5/8]AArch64 aarch64: Make existing V2HF be usable.
Date: Tue, 06 Dec 2022 11:05:18 +0000
In-Reply-To: (Tamar Christina's message of "Tue, 6 Dec 2022 10:58:24 +0000")

Tamar Christina writes:
>> -----Original Message-----
>> From: Richard Sandiford
>> Sent: Tuesday, December 6, 2022 10:28 AM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd; Richard Earnshaw;
>> Marcus Shawcroft; Kyrylo Tkachov
>> Subject: Re: [PATCH 5/8]AArch64 aarch64: Make existing V2HF be usable.
>>
>> Tamar Christina writes:
>> > Hi,
>> >
>> >
>> >> This name might cause confusion with the SVE iterators, where FULL
>> >> means "every bit of the register is used".  How about something like
>> >> VMOVE instead?
>> >>
>> >> With this change, I guess VALL_F16 represents "The set of all modes
>> >> for which the vld1 intrinsics are provided" and VMOVE or whatever is
>> >> "All Advanced SIMD modes suitable for moving, loading, and storing".
>> >> That is, VMOVE extends VALL_F16 with modes that are not manifested
>> >> via intrinsics.
>> >>
>> >
>> > Done.
>> >
>> >> Where is the 2h used, and is it valid syntax in that context?
>> >>
>> >> Same for later instances of 2h.
>> >
>> > They are, but they weren't meant to be in this patch.  They belong in
>> > a separate FP16 series that I won't get to finish for GCC 13 due to not
>> > being able to finish writing all the tests.  I have moved them to that
>> > patch series though.
>> >
>> > While the addp patch series has been killed, this patch is still good
>> > standalone and improves codegen as shown in the updated testcase.
>> >
>> > Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
>> >
>> > Ok for master?
>> >
>> > Thanks,
>> > Tamar
>> >
>> > gcc/ChangeLog:
>> >
>> >     * config/aarch64/aarch64-simd.md (*aarch64_simd_movv2hf): New.
>> >     (mov, movmisalign, aarch64_dup_lane,
>> >     aarch64_store_lane0, aarch64_simd_vec_set,
>> >     @aarch64_simd_vec_copy_lane, vec_set,
>> >     reduc__scal_, reduc__scal_,
>> >     aarch64_reduc__internal, aarch64_get_lane,
>> >     vec_init, vec_extract): Support V2HF.
>> >     (aarch64_simd_dupv2hf): New.
>> >     * config/aarch64/aarch64.cc (aarch64_classify_vector_mode):
>> >     Add E_V2HFmode.
>> >     * config/aarch64/iterators.md (VHSDF_P): New.
>> >     (V2F, VMOVE, nunits, Vtype, Vmtype, Vetype, stype, VEL,
>> >     Vel, q, vp): Add V2HF.
>> >     * config/arm/types.md (neon_fp_reduc_add_h): New.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >     * gcc.target/aarch64/sve/slp_1.c: Update testcase.
>> >
>> > --- inline copy of patch ---
>> >
>> > diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
>> > index f4152160084d6b6f34bd69f0ba6386c1ab50f77e..487a31010245accec28e779661e6c2d578fca4b7 100644
>> > --- a/gcc/config/aarch64/aarch64-simd.md
>> > +++ b/gcc/config/aarch64/aarch64-simd.md
>> > @@ -19,10 +19,10 @@
>> >  ;; .
>> >
>> >  (define_expand "mov"
>> > -  [(set (match_operand:VALL_F16 0 "nonimmediate_operand")
>> > -        (match_operand:VALL_F16 1 "general_operand"))]
>> > +  [(set (match_operand:VMOVE 0 "nonimmediate_operand")
>> > +        (match_operand:VMOVE 1 "general_operand"))]
>> >    "TARGET_SIMD"
>> > -  "
>> > +{
>> >    /* Force the operand into a register if it is not an
>> >       immediate whose use can be replaced with xzr.
>> >       If the mode is 16 bytes wide, then we will be doing
>> > @@ -46,12 +46,11 @@ (define_expand "mov"
>> >        aarch64_expand_vector_init (operands[0], operands[1]);
>> >        DONE;
>> >      }
>> > -  "
>> > -)
>> > +})
>> >
>> >  (define_expand "movmisalign"
>> > -  [(set (match_operand:VALL_F16 0 "nonimmediate_operand")
>> > -        (match_operand:VALL_F16 1 "general_operand"))]
>> > +  [(set (match_operand:VMOVE 0 "nonimmediate_operand")
>> > +        (match_operand:VMOVE 1 "general_operand"))]
>> >    "TARGET_SIMD && !STRICT_ALIGNMENT"
>> >  {
>> >    /* This pattern is not permitted to fail during expansion: if both
>> >       arguments
>> > @@ -73,6 +72,16 @@ (define_insn "aarch64_simd_dup"
>> >    [(set_attr "type" "neon_dup, neon_from_gp")]
>> >  )
>> >
>> > +(define_insn "aarch64_simd_dupv2hf"
>> > +  [(set (match_operand:V2HF 0 "register_operand" "=w")
>> > +        (vec_duplicate:V2HF
>> > +          (match_operand:HF 1 "register_operand" "0")))]
>>
>> Seems like this should be "w" rather than "0", since SLI is a two-register
>> instruction.
>
> Yes, but for a dup it's only valid when the same register is used. i.e. it has to
> write into the original src register.

Ah, right.  In that case it might be better to use %d0 for the source
operand:

    For operands to match in a particular case usually means that they
    are identical-looking RTL expressions.  But in a few special cases
    specific kinds of dissimilarity are allowed.  For example, @code{*x}
    as an input operand will match @code{*x++} as an output operand.
    For proper results in such cases, the output template should always
    use the output-operand's number when printing the operand.

Thanks,
Richard
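
For illustration, here is a rough sketch of how the pattern might look with
%d0 used for the source operand.  The insn condition, the SLI output
template with a shift of 16, and the "type" attribute value are not quoted
in the thread above, so all three are assumptions; only the operands and
constraints come from the patch.

```
;; Sketch only: the condition, output template and type attribute are
;; assumed, not taken from the patch.
(define_insn "aarch64_simd_dupv2hf"
  [(set (match_operand:V2HF 0 "register_operand" "=w")
        (vec_duplicate:V2HF
          (match_operand:HF 1 "register_operand" "0")))]
  "TARGET_SIMD"
  ;; Operand 1 is tied to operand 0 by the "0" constraint, so print the
  ;; output operand's number for both the destination and the source.
  "sli\\t%d0, %d0, 16"
  [(set_attr "type" "neon_shift_imm")]
)
```

With both operands in the same D register, SLI shifts the register left by
16 bits and inserts the result into itself while preserving the low 16
bits, which leaves the HF value in each of the two low lanes.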