From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 603123858D1E for ; Wed, 21 Dec 2022 11:09:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 603123858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id EDB852F4; Wed, 21 Dec 2022 03:09:54 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.99.50]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 0F1163F71E; Wed, 21 Dec 2022 03:09:12 -0800 (PST) From: Richard Sandiford To: Tamar Christina Mail-Followup-To: Tamar Christina ,"gcc-patches\@gcc.gnu.org" , nd , Richard Earnshaw , Marcus Shawcroft , Kyrylo Tkachov , richard.sandiford@arm.com Cc: "gcc-patches\@gcc.gnu.org" , nd , Richard Earnshaw , Marcus Shawcroft , Kyrylo Tkachov Subject: Re: [PATCH]AArch64 relax constraints on FP16 insn PR108172 References: Date: Wed, 21 Dec 2022 11:09:11 +0000 In-Reply-To: (Tamar Christina's message of "Wed, 21 Dec 2022 10:49:49 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-37.8 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,KAM_LOTSOFHASH,KAM_NUMSUBJECT,KAM_SHORT,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Tamar Christina writes: >> -----Original Message----- >> From: Richard Sandiford >> Sent: Wednesday, December 21, 2022 10:31 AM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; Kyrylo Tkachov >> Subject: Re: [PATCH]AArch64 relax constraints on FP16 insn PR108172 >> >> Tamar Christina writes: >> > Hi All, >> > >> > The move, load, load, store, dup, basically all the non arithmetic >> > FP16 instructions use baseline armv8-a HF support, and so do not >> > require the Armv8.2-a extensions. This relaxes the instructions. >> > >> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. >> > >> > Ok for master? >> > >> > Thanks, >> > Tamar >> > >> > gcc/ChangeLog: >> > >> > PR target/108172 >> > * config/aarch64/aarch64-simd.md (*aarch64_simd_movv2hf): Relax >> > TARGET_SIMD_F16INST to TARGET_SIMD. >> > * config/aarch64/aarch64.cc (aarch64_classify_vector_mode): Re- >> order. >> > * config/aarch64/iterators.md (VMOVE): Drop >> TARGET_SIMD_F16INST. >> > >> > gcc/testsuite/ChangeLog: >> > >> > PR target/108172 >> > * gcc.target/aarch64/pr108172.c: New test. >> >> OK, thanks. >> >> I think we need better tests for this area though. The VMOVE uses include >> aarch64_dup_lane, which uses , which has no definition for >> V2HF, so we get: >> >> return "dup\t%0., %1.h[%2]"; >> >> The original patch defined Vtype to "2h", but using that here would have >> generated an invalid instruction. >> >> We also have unexpanded s in the pairwise operations: >> >> "faddp\t%h0, %1.", >> "fmaxp\t%h0, %1.", >> "fminp\t%h0, %1.", >> "fmaxnmp\t%h0, %1.", >> "fminnmp\t%h0, %1.", >> > > Right, the pairwise bit should have been reverted, the tests were all In that patch. > >> Would it be easy (using a combination of this patch and a follow-on patch) to >> wind the V2HF support back to a state that makes sense on its own, without >> the postponed pairwise support? Or would it be simpler to revert >> 2cba118e538ba0b7582af7f9fb5ba2dfbb772f8e for GCC 13 and revisit this in >> GCC 14, alongside the original motivating use case? > > The pairwise and the dup should be undone, as evidence that they don't trigger > currently. The mov make sense on their own already and improve various codegen > around shorts on their own. OK, sounds good, thanks. > I won't be revisiting pairwise operations again, so I'll remove the remnants from this > Patch. Ack. Sorry for causing frustration in the review for that series. Richard >> Richard >> >> > --- inline copy of patch -- >> > diff --git a/gcc/config/aarch64/aarch64-simd.md >> > b/gcc/config/aarch64/aarch64-simd.md >> > index >> > >> a62b6deaf4a57e570074d7d894e6fac13779f6fb..8a9f655d547285ec7bdc173086 >> 30 >> > 8d7d44a8d482 100644 >> > --- a/gcc/config/aarch64/aarch64-simd.md >> > +++ b/gcc/config/aarch64/aarch64-simd.md >> > @@ -164,7 +164,7 @@ (define_insn "*aarch64_simd_movv2hf" >> > "=w, m, m, w, ?r, ?w, ?r, w, w") >> > (match_operand:V2HF 1 "general_operand" >> > "m, Dz, w, w, w, r, r, Dz, Dn"))] >> > - "TARGET_SIMD_F16INST >> > + "TARGET_SIMD >> > && (register_operand (operands[0], V2HFmode) >> > || aarch64_simd_reg_or_zero (operands[1], V2HFmode))" >> > "@ >> > diff --git a/gcc/config/aarch64/aarch64.cc >> > b/gcc/config/aarch64/aarch64.cc index >> > >> 73515c174fa4fe7830527e7eabd91c4648130ff4..d1c0476321b79bc6aded350d24 >> ea >> > 5d556c796519 100644 >> > --- a/gcc/config/aarch64/aarch64.cc >> > +++ b/gcc/config/aarch64/aarch64.cc >> > @@ -3616,6 +3616,8 @@ aarch64_classify_vector_mode (machine_mode >> mode) >> > case E_V4x2DFmode: >> > return TARGET_FLOAT ? VEC_ADVSIMD | VEC_STRUCT : 0; >> > >> > + /* 32-bit Advanced SIMD vectors. */ >> > + case E_V2HFmode: >> > /* 64-bit Advanced SIMD vectors. */ >> > case E_V8QImode: >> > case E_V4HImode: >> > @@ -3634,7 +3636,6 @@ aarch64_classify_vector_mode (machine_mode >> mode) >> > case E_V8BFmode: >> > case E_V4SFmode: >> > case E_V2DFmode: >> > - case E_V2HFmode: >> > return TARGET_FLOAT ? VEC_ADVSIMD : 0; >> > >> > default: >> > diff --git a/gcc/config/aarch64/iterators.md >> > b/gcc/config/aarch64/iterators.md index >> > >> a521dbde1ec42c0c442a9ca3dd52c9727d116399..70742520984d30158e62a38c9 >> 2ab >> > ea82b2dac059 100644 >> > --- a/gcc/config/aarch64/iterators.md >> > +++ b/gcc/config/aarch64/iterators.md >> > @@ -204,8 +204,7 @@ (define_mode_iterator VALL_F16 [V8QI V16QI V4HI >> > V8HI V2SI V4SI V2DI ;; All Advanced SIMD modes suitable for moving, >> > loading, and storing ;; including V2HF (define_mode_iterator VMOVE >> > [V8QI V16QI V4HI V8HI V2SI V4SI V2DI >> > - V4HF V8HF V4BF V8BF V2SF V4SF V2DF >> > - (V2HF "TARGET_SIMD_F16INST")]) >> > + V2HF V4HF V8HF V4BF V8BF V2SF V4SF V2DF]) >> > >> > >> > ;; The VALL_F16 modes except the 128-bit 2-element ones. >> > diff --git a/gcc/testsuite/gcc.target/aarch64/pr108172.c >> > b/gcc/testsuite/gcc.target/aarch64/pr108172.c >> > new file mode 100644 >> > index >> > >> 0000000000000000000000000000000000000000..b29054fdb1d6e602755bc9308 >> 9f1 >> > edec4eb53b8e >> > --- /dev/null >> > +++ b/gcc/testsuite/gcc.target/aarch64/pr108172.c >> > @@ -0,0 +1,30 @@ >> > +/* { dg-do compile } */ >> > +/* { dg-options "-O0" } */ >> > + >> > +typedef _Float16 v4hf __attribute__ ((vector_size (8))); typedef >> > +_Float16 v2hf __attribute__ ((vector_size (4))); >> > + >> > +v4hf >> > +v4hf_abi_1 (v4hf a) >> > +{ >> > + return a; >> > +} >> > + >> > +v4hf >> > +v4hf_abi_3 (v4hf a, v4hf b, v4hf c) >> > +{ >> > + return c; >> > +} >> > + >> > +v4hf >> > +v4hf_abi_4 (v4hf a, v4hf b, v4hf c, v4hf d) { >> > + return d; >> > +} >> > + >> > +v2hf >> > +v2hf_test (v2hf a, v2hf b, v2hf c, v2hf d) { >> > + return b; >> > +} >> > +