From: "H.J. Lu"
Date: Wed, 18 May 2022 08:07:47 -0700
Subject: Re: [PATCH 0/5] x86/Intel: AVX512 syntax enhancements
To: Jan Beulich
Cc: "Cui, Lili", Binutils

On Tue, May 17, 2022 at 11:40 PM Jan Beulich wrote:
>
> On 18.05.2022 05:15, Cui, Lili wrote:
> >> -----Original Message-----
> >> From: Jan Beulich
> >> Sent: Tuesday, May 17, 2022 8:00 PM
> >> To: Cui, Lili
> >> Cc: H.J. Lu; Binutils
> >> Subject: Re: [PATCH 0/5] x86/Intel: AVX512 syntax enhancements
> >>
> >>> 1. If we use BCST instead {1to*}, it cannot directly reflect the broadcast number. When the register size is zmm, but broadcast number is not the same.
> >>>
> >>> -[ ]*[a-f0-9]+:[ ]*62 f5 54 58 58 31[ ]*vaddph zmm6,zmm5,WORD PTR \[ecx\]\{1to32\}
> >>> +[ ]*[a-f0-9]+:[ ]*62 f5 54 58 58 31[ ]*vaddph zmm6,zmm5,WORD BCST \[ecx\]
> >>>
> >>> -[ ]*[a-f0-9]+:[ ]*62 65 7d df 5b 72 80[ ]*vcvtph2dq zmm30\{k7\}\{z\},WORD PTR \[rdx-0x100\]\{1to16\}
> >>> +[ ]*[a-f0-9]+:[ ]*62 65 7d df 5b 72 80[ ]*vcvtph2dq zmm30\{k7\}\{z\},WORD BCST \[rdx-0x100\]
> >>
> >> This case is clearly disambiguated by the destination register.
> >> What I think you're worried about are conversions where the field size
> >> shrinks (e.g. from 32 bits to 16 bits, like in vcvtdq2ph).
> >> In this case you will note that for the purpose of keeping things
> >> unambiguous the disassembler will continue to emit {1to}, and the
> >> assembler will continue to require that extra bit of information.
> >>
> >
> > The format of appending {1to} for the vcvtdq2ph special case is great.
> > There is no ambiguity for the format of vcvtph2dq zmm30{k7}{z},WORD BCST [rdx-0x100], but we cannot directly know the N ({1to}) for this BCST format, although we can confirm it with the SDM. I'm just trying to say that, as a first impression, the BCST format has this disadvantage.
>
> But that's no different for e.g. VADDPS - the element count isn't explicit
> anywhere, it's known from register kind only.
>
> I don't, btw, have insight into how MASM disambiguates VCVTDQ2PH and alike.
>
> >>> 2. Just remove the last comma, it's ok for me, I remember FP16 has an instruction with {sae} on the middle position for the ATT format. But the intel format is placed at the end, I don't know if there is any problem.
> >>>
> >>> -[ ]*[a-f0-9]+:[ ]*62 f5 54 18 58 f4[ ]*vaddph zmm6,zmm5,zmm4,\{rn-sae\}
> >>> +[ ]*[a-f0-9]+:[ ]*62 f5 54 18 58 f4[ ]*vaddph zmm6,zmm5,zmm4\{rn-sae\}
> >>>
> >>> FP16:
> >>> vcvtusi2sh %edx, {rn-sae}, %xmm29, %xmm30
> >>> vcvtusi2sh xmm6,xmm5,edx\{rn-sae\}
> >>
> >> Well, yes, this is not only not a problem, but intended. See how the SDM
> >> places the rounding/SAE modifiers. It's also not FP16-specific in any way.
> >>
> >
> > Yes, SDM put the rounding/SAE behind the last register operand, if the last operand is immediate, it will put rounding/SAE before the immediate. But I don't quite understand why ATT format put it after %edx instead of before.
>
> That's a question I raised back at the time when introducing the Intel
> syntax alternative. I don't recall having got a good answer. I guess I
> can only forward to H.J. here ...

AT&T syntax order is always different. SAE was new. I don't remember
exactly how the choice was made.

-- 
H.J.
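
For readers following the broadcast discussion, the element counts mentioned in the thread can be reproduced with a rough sketch like the one below. It is illustrative only: the helper names and the lookup table are hypothetical and are not taken from binutils. It assumes the operand forms of vcvtdq2ph as documented in the SDM, where both 128-bit and 256-bit sources write an xmm destination.

```python
# Illustrative sketch only: these helpers are hypothetical and not part of
# binutils. They just redo the arithmetic discussed in the thread.

def broadcast_count(vector_bits, element_bits):
    """Return N for {1toN}: how many elements fill the whole vector."""
    return vector_bits // element_bits


# Assumed source vector widths (bits) for a narrowing conversion such as
# vcvtdq2ph, keyed by destination register kind. 128- and 256-bit sources
# both produce an xmm destination, which is where the ambiguity comes from.
NARROWING_SOURCE_BITS = {"xmm": [128, 256], "ymm": [512]}


def narrowing_candidates(dest_kind, src_element_bits):
    """List every broadcast count N compatible with the destination kind."""
    return [broadcast_count(bits, src_element_bits)
            for bits in NARROWING_SOURCE_BITS[dest_kind]]


if __name__ == "__main__":
    # vaddph zmm6,zmm5,WORD BCST [ecx]: 512-bit vector of 16-bit elements.
    print(broadcast_count(512, 16))
    # vcvtph2dq zmm30,...: the zmm destination holds 32-bit elements, so the
    # 16-bit source is broadcast to the same number of elements.
    print(broadcast_count(512, 32))
    # vcvtdq2ph with an xmm destination and a 32-bit broadcast source:
    # more than one count fits, so {1to} still has to be spelled out.
    print(narrowing_candidates("xmm", 32))
```

When the destination register fixes a single vector width, the count follows from the register kind alone, as with VADDPS. Only the narrowing case yields more than one candidate.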