From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lf1-x135.google.com (mail-lf1-x135.google.com [IPv6:2a00:1450:4864:20::135]) by sourceware.org (Postfix) with ESMTPS id E9EC5385840C for ; Mon, 24 Oct 2022 19:26:21 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E9EC5385840C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-lf1-x135.google.com with SMTP id bp15so18269218lfb.13 for ; Mon, 24 Oct 2022 12:26:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=wx/f8C7geXZCMyfzzH/M8vc6yQtLMOTGisRUiO7X++s=; b=jihKNbBb1l5PI/0o/ZIwxj1xvZJ99xH8q7RAwMJZzDNLT8riTjw63NF+nylGWXrIF0 oRxuDip8OgHEme9lUQPSTwharIu48nY4hRu/Qy6m7Lxi3vINjfzeq8UGlfc7+f+x/LlV BYrDhFT5tPdCdJLVfeSbkEmrOO3xVhzGpTYdgZfLkM9RVsUmJ4ZuqJcth48sIzOPBZxI 6V6U1Itkc8Lj+/JTAxpaB2sD89YxEtEwmAfqyEyRVHsNNgDE/eLYEf13mvJ4ldBo98p2 o1p/uVGCUWlh042OU/Xv9k5SC1p38O2pEADTPQLrBN8S7TO1/pNu2xXLpuYgP3bFh7mA ro4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=wx/f8C7geXZCMyfzzH/M8vc6yQtLMOTGisRUiO7X++s=; b=k/RmJA+jFGNdQtJZnv+t568Iz8DSvj7B+0IrGv725gohSMvPPU8V7DkD95rVhsVabz u0TUwSgkvVz736qOrCZ0xiy9XGKDiIMrhJcD/JymmyWj/EbVNTqRvSl97AkfEyQ9F1pq LoLO1FEhSRzNfS5XGDztPKZPorhK7xzRPdcOXTHvu33o5P6k5ykdRijMdO5QzsJ06jtq 9qZlpvnzuuWYA88RXNv3uLM03FWCh4rcszipTxOmm+IghK4Jr1xYrWXw5s/kmpFrWBYT 3XDlCsqnZHBQNMQoqxvcfZltiu4Ub3MLJfT4JxVjVOREKfoVsdK1oL9w8cCTVBSensSB N0Kg== X-Gm-Message-State: ACrzQf03x2GpHVVD571SQSdFjr0QuwH7mT4VuWcJAenY0S1Plv+beyI+ VP3F6ZKgONC3z6p8Zd5wjujDMjzbXJR45JnequxpZ+Qj3So= X-Google-Smtp-Source: AMsMyM7wWJ9fJA6czJhPWYHX6Mriz04Bu+hNV4MMgL8SQoN6nkSA3vaHa7rZW3cJ01HyNZgJcJU6TRXn6Row3F8sEVY= X-Received: by 2002:a05:6512:4002:b0:4a2:6243:8384 with SMTP id br2-20020a056512400200b004a262438384mr11966791lfb.29.1666639580605; Mon, 24 Oct 2022 12:26:20 -0700 (PDT) MIME-Version: 1.0 References: <20221014091248.4920-1-haochen.jiang@intel.com> <20221014091248.4920-4-haochen.jiang@intel.com> <29239b15-58d7-c57d-caf6-a23c40814f3e@suse.com> In-Reply-To: From: "H.J. Lu" Date: Mon, 24 Oct 2022 12:25:43 -0700 Message-ID: Subject: Re: [PATCH 03/10] Support Intel AVX-NE-CONVERT To: "Kong, Lingling" Cc: "Beulich, Jan" , "Jiang, Haochen" , "binutils@sourceware.org" Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-3016.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Sun, Oct 23, 2022 at 10:59 PM Kong, Lingling wrote: > > > Hi Jan, > > Thanks you so much for reviewing this patch. I really appreciate it. For these review comments, I have made some changes. > Sorry, Forgot to delete i386-init.h and i386-tbl.h in last email attachment. > > > -----Original Message----- > > From: Jan Beulich > > Sent: Friday, October 14, 2022 8:59 PM > > To: Jiang, Haochen > > Cc: hjl.tools@gmail.com; Kong, Lingling ; > > binutils@sourceware.org > > Subject: Re: [PATCH 03/10] Support Intel AVX-NE-CONVERT > > > > On 14.10.2022 11:12, Haochen Jiang wrote: > > > --- a/opcodes/i386-dis.c > > > +++ b/opcodes/i386-dis.c > > > @@ -937,6 +937,8 @@ enum > > > MOD_VEX_0F385E_X86_64_P_3_W_0, > > > MOD_VEX_0F388C, > > > MOD_VEX_0F388E, > > > + MOD_VEX_0F38B0, > > > + MOD_VEX_0F38B1, > > > MOD_VEX_0F3A30_L_0, > > > MOD_VEX_0F3A31_L_0, > > > MOD_VEX_0F3A32_L_0, > > > @@ -1132,6 +1134,9 @@ enum > > > PREFIX_VEX_0F3851_W_0, > > > PREFIX_VEX_0F385C_X86_64, > > > PREFIX_VEX_0F385E_X86_64, > > > + PREFIX_VEX_0F3872, > > > + PREFIX_VEX_0F38B0, > > > + PREFIX_VEX_0F38B1, > > > > PREFIX_VEX_0F38B0_M_1_W_0 and PREFIX_VEX_0F38B1_M_1_W_0 > > respectively, please. These enumerators are supposed to show the already > > decoded components. Plus they need to be unambiguous if insns appear which > > vary in (here) permitted ModR/M forms or the VEX.W bit. > > Yes, I changed it to PREFIX_VEX_0F38B0_M_1_W_0 and PREFIX_VEX_0F38B1_M_1_W_0. > > > > @@ -1526,8 +1531,11 @@ enum > > > VEX_W_0F385E_X86_64_P_1, > > > VEX_W_0F385E_X86_64_P_2, > > > VEX_W_0F385E_X86_64_P_3, > > > + VEX_W_0F3872_P_1, > > > VEX_W_0F3878, > > > VEX_W_0F3879, > > > + VEX_W_0F38B0, > > > + VEX_W_0F38B1, > > > > VEX_W_0F38B0_M_1 and VEX_W_0F38B1_M_1 respectively, please. > > > > > @@ -7618,6 +7654,14 @@ static const struct dis386 vex_w_table[][2] = { > > > /* VEX_W_0F3879 */ > > > { "vpbroadcastw", { XM, EXw }, PREFIX_DATA }, > > > }, > > > + { > > > + /* VEX_W_0F38B0 */ > > > > Nit: Indentation. > > Fixed. > > > > @@ -8428,6 +8472,14 @@ static const struct dis386 mod_table[][2] = { > > > /* MOD_VEX_0F388E */ > > > { "vpmaskmov%DQ", { Mx, Vex, XM }, PREFIX_DATA }, > > > }, > > > + { > > > + /* MOD_VEX_0F38B0*/ > > > + { VEX_W_TABLE (VEX_W_0F38B0) }, > > > + }, > > > + { > > > + /* MOD_VEX_0F38B1*/ > > > + { VEX_W_TABLE (VEX_W_0F38B1) }, > > > + }, > > > > Nit: Blanks missing in both of the comments. > > Fixed. > > > > --- a/opcodes/i386-gen.c > > > +++ b/opcodes/i386-gen.c > > > @@ -249,6 +249,8 @@ static initializer cpu_flag_init[] = > > > "CPU_AVX2_FLAGS|CpuAVX_IFMA" }, > > > { "CPU_AVX_VNNI_INT8_FLAGS", > > > "CPU_AVX2_FLAGS|CpuAVX_VNNI_INT8" }, > > > + { "CPU_AVX_NE_CONVERT_FLAGS", > > > + "CPU_AVX2_FLAGS|CpuAVX_NE_CONVERT" }, > > > { "CPU_IAMCU_FLAGS", > > > "Cpu186|Cpu286|Cpu386|Cpu486|Cpu586|CpuIAMCU" }, > > > { "CPU_ADX_FLAGS", > > > @@ -447,6 +449,8 @@ static initializer cpu_flag_init[] = > > > "CpuAVX_IFMA" }, > > > { "CPU_ANY_AVX_VNNI_INT8_FLAGS", > > > "CpuAVX_VNNI_INT8" }, > > > + { "CPU_ANY_AVX_NE_CONVERT_FLAGS", > > > + "CpuAVX_NE_CONVERT" }, > > > }; > > > > Like for patches 1 and 2 - CPU_ANY_AVX2_FLAGS also need adjusting. > > Fixed. > > > > --- a/opcodes/i386-opc.tbl > > > +++ b/opcodes/i386-opc.tbl > > > @@ -3027,6 +3027,21 @@ movdir64b, 0x660f38f8, None, CpuMOVDIR64B, > > > Modrm|AddrPrefixOpReg, { Unspecified| > > > > > > // MOVEDIR instructions end. > > > > > > +// AVX_NE_CONVERT instructions. > > > + > > > +vbcstnebf162ps, 0xf3b1, None, CpuAVX_NE_CONVERT, > > Modrm|Vex|Space0F38|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No > > _sSuf|No_qSuf|No_ldSuf, {Word|Unspecified|BaseIndex, RegXMM|RegYMM} > > > +vbcstnesh2ps, 0x66b1, None, CpuAVX_NE_CONVERT, > > Modrm|Vex|Space0F38|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No > > _sSuf|No_qSuf|No_ldSuf, {Word|Unspecified|BaseIndex, RegXMM|RegYMM} > > > > Why CheckRegSize? > > Removed the CheckRegSize. > > > > +vcvtneoph2ps, 0xb0, None, CpuAVX_NE_CONVERT, > > Modrm|Vex|Space0F38|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No > > _sSuf|No_qSuf|No_ldSuf, {Xmmword|Ymmword|Unspecified|BaseIndex, > > RegXMM|RegYMM} > > > +vcvtneebf162ps, 0xf3b0, None, CpuAVX_NE_CONVERT, > > Modrm|Vex|Space0F38|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No > > _sSuf|No_qSuf|No_ldSuf, {Xmmword|Ymmword|Unspecified|BaseIndex, > > RegXMM|RegYMM} > > > +vcvtneeph2ps, 0x66b0, None, CpuAVX_NE_CONVERT, > > Modrm|Vex|Space0F38|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No > > _sSuf|No_qSuf|No_ldSuf, {Xmmword|Ymmword|Unspecified|BaseIndex, > > RegXMM|RegYMM} > > > +vcvtneobf162ps, 0xf2b0, None, CpuAVX_NE_CONVERT, > > Modrm|Vex|Space0F38|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No > > _sSuf|No_qSuf|No_ldSuf, {Xmmword|Ymmword|Unspecified|BaseIndex, > > RegXMM|RegYMM} > > > +vcvtneps2bf16, 0xf372, None, CpuAVX_NE_CONVERT, > > Modrm|PseudoVexPrefix|Vex128|Space0F38|VexW0|CheckRegSize|No_bSuf| > > No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, > > {Xmmword|RegXMM|Unspecified|BaseIndex, RegXMM} > > > +vcvtneps2bf16, 0xf372, None, CpuAVX_NE_CONVERT, > > Modrm|PseudoVexPrefix|Vex256|Space0F38|VexW0|CheckRegSize|No_bSuf| > > No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, > > {Ymmword|RegYMM|Unspecified|BaseIndex, RegXMM} > > > +vcvtneps2bf16x, 0xf372, None, CpuAVX_NE_CONVERT, > > Modrm|PseudoVexPrefix|Vex128|Space0F38|VexW0|CheckRegSize|No_bSuf| > > No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, > > {Xmmword|RegXMM|Unspecified|BaseIndex, RegXMM} > > > +vcvtneps2bf16y, 0xf372, None, CpuAVX_NE_CONVERT, > > Modrm|PseudoVexPrefix|Vex256|Space0F38|VexW0|CheckRegSize|No_bSuf| > > No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, > > {Ymmword|RegYMM|Unspecified|BaseIndex, RegXMM} > > > > Nit: Excess blank after the first comma on the last of the above lines, and excess > > blanks on all vcvt* lines before CpuAVX_NE_CONVERT. > > Fixed. > > > Unnecessary Xmmword / Ymmword on the last four lines. And there again - why > > CheckRegSize? But these would anyway benefit from using the AVX instance of > > the template - then AT&T syntax handling for the suffix-less variant would > > automatically be correct (it isn't in your code). Problem is that it wasn't clear > > that there'll be a need to re-use xy's AVX instance this late in the file, so the > > overloading of the name (commit e07ae9a3efee) will need undoing. To keep > > names short (see commit 390ddd6f6812) I'd recommend using Vxy and Exy (for > > the VEX and EVEX variants respectively). > > > > Like for the earlier patches I disagree with the uses of PseudoVexPrefix here (as > > said - the attribute should really go away again, and I have a patch doing so), but > > then ... > > Removed the CheckRegSize, Xmmword / Ymmword and PseudoVexPrefix. > And added and . > > > > + > > > +// AVX_NE_CONVERT instructions end. > > > + > > > // AVX512_BF16 instructions. > > > > > > vcvtne2ps2bf16, 0xf272, None, CpuAVX512_BF16, > > > > > Modrm|Space0F38|VexVVVV|Masking=3|VexW0|Broadcast|Disp8ShiftVL|Chec > > kRe > > > gSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { > > > RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, > > > RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } > > > > ... the new insns (at least the ones having AVX512 counterparts) need to be > > placed after the AVX512 ones (which you'll note use the template. > > > > Jan > > Yes, Now placed after the AVX512 ones. > > Thanks, > Lingling Please add some comments to explain how and are used, something like is used on VEX instructions with x/y suffixes and is used on EVEX instructions with x/y suffixes. Thanks. -- H.J.