From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 1E6C03858D37 for ; Mon, 3 Apr 2023 09:31:56 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1E6C03858D37
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2970FFEC; Mon, 3 Apr 2023 02:32:40 -0700 (PDT)
Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 3B2923F840; Mon, 3 Apr 2023 02:31:55 -0700 (PDT)
From: Richard Sandiford
To: Jan Beulich
Mail-Followup-To: Jan Beulich ,binutils@sourceware.org, richard.sandiford@arm.com
Cc: binutils@sourceware.org
Subject: Re: [PATCH 00/31] aarch64: Add SME2 support
References: <20230330102646.3327818-1-richard.sandiford@arm.com> <0368da1c-bdb5-5a79-1f97-10e0e09c019a@suse.com>
Date: Mon, 03 Apr 2023 10:31:54 +0100
In-Reply-To: <0368da1c-bdb5-5a79-1f97-10e0e09c019a@suse.com> (Jan Beulich's message of "Mon, 3 Apr 2023 10:37:06 +0200")
Message-ID:
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
X-Spam-Status: No, score=-25.4 required=5.0 tests=BAYES_00,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id:

Jan Beulich writes:
> On 03.04.2023 10:27, Richard Sandiford wrote:
>> Jan Beulich writes:
>>> On 03.04.2023 10:05, Richard Sandiford wrote:
>>>> Jan Beulich writes:
>>>>> On 30.03.2023 12:26, Richard Sandiford via Binutils wrote:
>>>>>> Richard Sandiford (31):
>>>>>> aarch64: Add +sme2
>>>>>> aarch64: Add a _10 suffix to FLD_imm3
>>>>>> aarch64: Add _off4 suffix to AARCH64_OPND_SME_ZA_array
>>>>>> aarch64: Add support for vgx2 and vgx4
>>>>>> aarch64; Add support for vector offset ranges
>>>>>> aarch64: Add support for predicate-as-counter registers
>>>>>> aarch64: Add the SME2 MOVA instructions
>>>>>> aarch64: Add the SME2 multivector LD1 and ST1 instructions
>>>>>
>>>>> Less than a 3rd of the patches in this series have made it to my mailbox
>>>>> (and the list archives), so commenting on e.g. the one above is difficult.
>>>>
>>>> Yeah, they got held up in moderation due to the size.
>>>>
>>>>> Nevertheless - according to the documentation LD1x (scalar plus immediate,
>>>>> consecutive registers) and their LDNT1x, ST1x, and STNT1x counterparts
>>>>> are (unlike the strided forms) SVE2.1 insns, not SME2 ones (IOW it looks
>>>>> as if the use of SME2_INSN() there is wrong, unless the documentation is
>>>>> categorizing these incorrectly).
>>>>
>>>> They're both (but we haven't added SVE2p1 to binutils yet).
>>>> E.g. see the pseudocode in:
>>>>
>>>> https://developer.arm.com/documentation/ddi0602/2022-12/SVE-Instructions/LD1B--scalar-plus-immediate--consecutive-registers---Contiguous-load-of-bytes-to-multiple-consecutive-vectors--immediate-index--?lang=en
>>>>
>>>> where the condition is:
>>>>
>>>> if !HaveSME2() && !HaveSVE2p1() then UNDEFINED;
>>>>
>>>> Chronologically, SME2 predates SVE2p1.
>>>
>>> Yet aiui dependency-wise, like SVE2 is a prereq to SME, SVE2.1 is going
>>> to be viewed as a prereq to SVE2.1?
>>
>> Do you mean SVE2p1 being a prereq to SME2? If so, no. FEAT_SME2
>> && !FEAT_SVE2p1 is a valid combination, and in that case, these
>> instructions will only be available in streaming mode. The way the
>> pseudo expresses this is:
>>
>> if HaveSVE2p1() then CheckSVEEnabled(); else CheckStreamingSVEEnabled();
>
> That's different from the SME <-> SVE2 relationship then? Or is that
> dependency wrong in tc-aarch64.c:aarch64_features[]?

Yeah, it's a different relationship from SME <-> SVE2. For one thing,
SVE2p1 includes things that SME2 doesn't, such as:

https://developer.arm.com/documentation/ddi0602/2022-12/SVE-Instructions/ADDQV--Unsigned-add-reduction-of-quadword-vector-segments-?lang=en

FEAT_SME && !FEAT_SVE is architecturally valid, but we took the
decision not to support it for tools. The rule that +sme implies
+sve2 is therefore a software requirement rather than an ISA requirement.

Thanks,
Richard
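
For reference, the two pseudocode conditions quoted in this thread can be
combined into one small decision function. This is only a rough sketch in
Python: the names Access and multivector_ld1_access are made up for
illustration and are not part of binutils or of the Arm pseudocode. It
models just the two quoted checks, nothing else about the instructions.

```python
from enum import Enum


class Access(Enum):
    """Outcome of the two quoted pseudocode checks (hypothetical names)."""
    UNDEFINED = "UNDEFINED"
    SVE_ENABLED_CHECK = "CheckSVEEnabled"
    STREAMING_ONLY = "CheckStreamingSVEEnabled"


def multivector_ld1_access(have_sme2: bool, have_sve2p1: bool) -> Access:
    """Model the checks quoted for LD1B (scalar plus immediate,
    consecutive registers).

    have_sme2 and have_sve2p1 stand in for HaveSME2() and HaveSVE2p1().
    """
    # if !HaveSME2() && !HaveSVE2p1() then UNDEFINED;
    if not have_sme2 and not have_sve2p1:
        return Access.UNDEFINED
    # if HaveSVE2p1() then CheckSVEEnabled();
    if have_sve2p1:
        return Access.SVE_ENABLED_CHECK
    # else CheckStreamingSVEEnabled();
    # This is the FEAT_SME2 && !FEAT_SVE2p1 case: streaming mode only.
    return Access.STREAMING_ONLY


if __name__ == "__main__":
    for sme2 in (False, True):
        for sve2p1 in (False, True):
            result = multivector_ld1_access(sme2, sve2p1)
            print(f"SME2={sme2!s:5} SVE2p1={sve2p1!s:5} -> {result.value}")
```

The point of the sketch is that neither feature is a prerequisite of the
other: either one alone is enough to make the instruction defined, and
only the mode check differs.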