From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-x12a.google.com (mail-lf1-x12a.google.com [IPv6:2a00:1450:4864:20::12a]) by sourceware.org (Postfix) with ESMTPS id D9336385840C for ; Mon, 24 Oct 2022 19:10:29 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D9336385840C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-lf1-x12a.google.com with SMTP id r14so18325568lfm.2 for ; Mon, 24 Oct 2022 12:10:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=xOewVIyayQD87dxzaBH0QxEGvfTM7NO6sobUK83ziP4=; b=qYaOXLtcULpsjbIqF4PRl2Lf4wxHGWhYG/du5VlazYgNQHZwy4RvNsVnySQufapynp EgFclCrwL+2kyEJFDy+Ga/Gn1vHfQc/dyWSGMdsj7H0GD6U9AHMoTi4aynpF0uxf4Odt 6nXggMMffAP1d/cEnpOzhYZ3McOn05+nqjuNtt1L+vWFFLvmKUX0aaMNRf8SdFBylCKj g42VZ+die3ZsMVxFmiGbshMCS5dPpuJze1VxcYn7puXcfQRNChmttAebyTx6YChIcQsx Ta5bUkMfl+BEmouBzhKG6y9AZbNjwFh4AqMFZZmeC2jP0eokI748TSbY5FuDo5gPRu5u z3YQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=xOewVIyayQD87dxzaBH0QxEGvfTM7NO6sobUK83ziP4=; b=wxavCWwCaUCIQJeT1MDgZqT7Er1neBwYmTl46PwHiyYpD3b8JELAgiIgnL6BAkA0Eu Sn/8i34QLTs2QhDaFfk/KlOdHKEITNPyFudHo/REZvp9K+j03ibm/PFdE/rQqSi29x0A kaWQgGDqJT62dPx6XVDpGkN8zygu+9hRD8e8HAxG/1xu7h9K7Bo1pZ5aPhKndML0hila ePY8I09SidqrEqhmOsvGS6x9uYSFwtlbBBOk09ifniGHmswEtxnrrq+va1nDM9i48M56 FW2itHoEUkiZfwS7tqg6xGcDoLXS0RHv9avP+G73Vbr7pMbftIxN8TWEiiUr+vmWNm2m Up+Q== X-Gm-Message-State: ACrzQf3cnWgaTcL6tNyvfBmOdmEf3hipfOnV1POwS3Ir46NxDwYYkyOx JMMH4LjYnlfH2EBL8wdORBy3iA/B8LUqGkLq/S3JUQRn X-Google-Smtp-Source: AMsMyM7AVl2IORsNs9RkhkLvL6uE9ZO8HEb4B2ApZUB9aRGQnRk1rarlxi1tqfYpULqdv+2x0AB9EFf/87+ZH5rTKcw= X-Received: by 2002:a05:6512:3403:b0:48c:9727:50b0 with SMTP id i3-20020a056512340300b0048c972750b0mr11812584lfr.309.1666638628335; Mon, 24 Oct 2022 12:10:28 -0700 (PDT) MIME-Version: 1.0 References: <20221014091248.4920-1-haochen.jiang@intel.com> <20221014091248.4920-2-haochen.jiang@intel.com> <863655db-f202-477f-c638-00773c25886c@suse.com> In-Reply-To: From: "H.J. Lu" Date: Mon, 24 Oct 2022 12:09:51 -0700 Message-ID: Subject: Re: [PATCH 01/10] Support Intel AVX-IFMA To: "Jiang, Haochen" Cc: "Beulich, Jan" , "Wang, Hongyu" , "binutils@sourceware.org" Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-3016.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Sun, Oct 23, 2022 at 10:53 PM Jiang, Haochen wrote: > > > -----Original Message----- > > From: Jan Beulich > > Sent: Friday, October 14, 2022 5:53 PM > > To: Jiang, Haochen > > Cc: hjl.tools@gmail.com; Wang, Hongyu ; > > binutils@sourceware.org > > Subject: Re: [PATCH 01/10] Support Intel AVX-IFMA > > > > On 14.10.2022 11:12, Haochen Jiang wrote: > > > From: wwwhhhyyy > > > > > > x86: Support Intel AVX-IFMA > > > > > > Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is > > > cleared by default. Without {vex} pseudo prefix, Intel IFMA instructions > > > are encoded with EVEX prefix. {vex} pseudo prefix will turn on VEX > > > encoding for Intel IFMA instructions. > > > > I firmly object to the proliferation of this mis-feature. As expressed > > before for AVX-VNNI, as long as the user has disabled AVX512 (or > > respective sub-features thereof), there should be no need to use {vex} in > > the source code. There's also no reason at all to make the disassembler > > print {vex} prefixes - we don't do so for any other insns (apart from > > AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings > > (when none of the EVEX-specific features is used). > > > > I actually have a patch queued to undo the odd behavior for AVX-VNNI, at > > least on the assembler side (which also drops the PseudoVexPrefix > > attribute). > > Has rebased the patch to latest trunk and removed PseudoVexPrefix in table. > Also added some testcases just like how your patch did. > > > > > > --- a/opcodes/i386-dis.c > > > +++ b/opcodes/i386-dis.c > > > @@ -1526,6 +1526,8 @@ enum > > > VEX_W_0F385E_X86_64_P_3, > > > VEX_W_0F3878, > > > VEX_W_0F3879, > > > + VEX_W_0F38B4, > > > + VEX_W_0F38B5, > > > VEX_W_0F38CF, > > > VEX_W_0F3A00_L_1, > > > VEX_W_0F3A01_L_1, > > > @@ -6293,8 +6295,8 @@ static const struct dis386 vex_table[][256] = { > > > { Bad_Opcode }, > > > { Bad_Opcode }, > > > { Bad_Opcode }, > > > - { Bad_Opcode }, > > > - { Bad_Opcode }, > > > + { VEX_W_TABLE (VEX_W_0F38B4) }, > > > + { VEX_W_TABLE (VEX_W_0F38B5) }, > > > { "vfmaddsub231p%XW", { XM, Vex, EXx }, PREFIX_DATA }, > > > { "vfmsubadd231p%XW", { XM, Vex, EXx }, PREFIX_DATA }, > > > /* b8 */ > > > @@ -7599,6 +7601,16 @@ static const struct dis386 vex_w_table[][2] = { > > > /* VEX_W_0F3879 */ > > > { "vpbroadcastw", { XM, EXw }, PREFIX_DATA }, > > > }, > > > + { > > > + /* VEX_W_0F38B4 */ > > > + { Bad_Opcode }, > > > + { "%XV vpmadd52luq", { XM, Vex, EXx }, PREFIX_DATA }, > > > + }, > > > + { > > > + /* VEX_W_0F38B5 */ > > > + { Bad_Opcode }, > > > + { "%XV vpmadd52huq", { XM, Vex, EXx }, PREFIX_DATA }, > > > + }, > > > > Irrespective of the aspect mentioned at the top I think this is yet > > another case where VEX and EVEX table entries can be shared. This would > > (if the {vex} printing really needs retaining for whatever obscure > > reason) merely require the processing of %XV to do nothing for EVEX- > > encoded insns, plus of course the separating blank would then also need > > to be included in the processing of %XV. > > > > I guess I'll make a patch to fold the AVX-VNNI and AVX512-VNNI entries, > > which you could then re-base on top of. > > Folded the table of AVX512IFMA and AVX-IFMA. > > > > > > --- a/opcodes/i386-gen.c > > > +++ b/opcodes/i386-gen.c > > > @@ -245,6 +245,8 @@ static initializer cpu_flag_init[] = > > > "CPU_AVX512F_FLAGS|CpuAVX512_BF16" }, > > > { "CPU_AVX512_FP16_FLAGS", > > > "CPU_AVX512BW_FLAGS|CpuAVX512_FP16" }, > > > + { "CPU_AVX_IFMA_FLAGS", > > > + "CPU_AVX2_FLAGS|CpuAVX_IFMA" }, > > > { "CPU_IAMCU_FLAGS", > > > "Cpu186|Cpu286|Cpu386|Cpu486|Cpu586|CpuIAMCU" }, > > > { "CPU_ADX_FLAGS", > > > @@ -439,6 +441,8 @@ static initializer cpu_flag_init[] = > > > "CpuHRESET" }, > > > { "CPU_ANY_AVX512_FP16_FLAGS", > > > "CpuAVX512_FP16" }, > > > + { "CPU_ANY_AVX_IFMA_FLAGS", > > > + "CpuAVX_IFMA" }, > > > > If AVX2 is taken as a prereq feature, then CPU_ANY_AVX2_FLAGS also needs > > adjustment, such that disabling of AVX2 also results in disabling of > > AVX-IFMA. (The same issue actually exists for AVX-VNNI afaics.) > > > > Added AVX-IFMA to CPU_ANY_AVX2_FLAGS. > > > > --- a/opcodes/i386-opc.tbl > > > +++ b/opcodes/i386-opc.tbl > > > @@ -3263,3 +3263,10 @@ vrsqrtph, 0x664e, None, CpuAVX512_FP16, > > Modrm|Masking=3|EVexMap6|VexW0|Broadcast > > > vrsqrtsh, 0x664f, None, CpuAVX512_FP16, > > Modrm|EVexLIG|Masking=3|EVexMap6|VexVVVV|VexW0|Disp8MemShift=1|N > > o_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, > > { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM } > > > > > > // FP16 (HFNI) instructions end. > > > + > > > +// AVX_IFMA instructions. > > > > Nit: Perhaps better use AVX-IFMA here, but I see we're having many examples > > of the (needless) use of underscores like this. > > > > > +vpmadd52huq, 0x66B5, None, CpuAVX_IFMA, > > Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|N > > o_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, > > { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, > > RegXMM|RegYMM } > > > +vpmadd52luq, 0x66B4, None, CpuAVX_IFMA, > > Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|N > > o_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, > > { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, > > RegXMM|RegYMM } > > > > Please use plain VexVVVV (without =1) - we want to have as little clutter as > > possible on these usually already overlong lines. > > Changed to VexVVVV. > > Thx for your review and see if there is still something need to be changed. OK. Thanks. > Haochen > > > > > Jan -- H.J.