References: <20221031030507.35588-1-haochen.jiang@intel.com> <20221031030507.35588-3-haochen.jiang@intel.com>
In-Reply-To: <20221031030507.35588-3-haochen.jiang@intel.com>
From: "H.J. Lu"
Date: Mon, 31 Oct 2022 09:53:08 -0700
Subject: Re: [PATCH 2/6] Support Intel AVX-VNNI-INT8
To: Haochen Jiang
Cc: binutils@sourceware.org, jbeulich@suse.com, "Cui,Lili"

On Sun, Oct 30, 2022 at 8:07 PM Haochen Jiang wrote:
>
> From: "Cui,Lili"
>
> gas/
>         * NEWS: Support Intel AVX-VNNI-INT8.
>         * config/tc-i386.c: Add avx_vnni_int8.
>         * doc/c-i386.texi: Document avx_vnni_int8.
>         * testsuite/gas/i386/avx-vnni-int8-intel.d: New file.
>         * testsuite/gas/i386/avx-vnni-int8.d: Likewise.
>         * testsuite/gas/i386/avx-vnni-int8.s: Likewise.
>         * testsuite/gas/i386/x86-64-avx-vnni-int8-intel.d: Likewise.
>         * testsuite/gas/i386/x86-64-avx-vnni-int8.d: Likewise.
>         * testsuite/gas/i386/x86-64-avx-vnni-int8.s: Likewise.
>         * testsuite/gas/i386/i386.exp: Run AVX VNNI INT8 tests.
>
> opcodes/
>         * i386-dis.c: (PREFIX_VEX_0F3850) New.
>         (PREFIX_VEX_0F3851): Likewise.
>         (VEX_W_0F3850_P_0): Likewise.
>         (VEX_W_0F3850_P_1): Likewise.
>         (VEX_W_0F3850_P_2): Likewise.
>         (VEX_W_0F3850_P_3): Likewise.
>         (VEX_W_0F3851_P_0): Likewise.
>         (VEX_W_0F3851_P_1): Likewise.
>         (VEX_W_0F3851_P_2): Likewise.
>         (VEX_W_0F3851_P_3): Likewise.
>         (VEX_W_0F3850): Delete.
>         (VEX_W_0F3851): Likewise.
>         (prefix_table): Add PREFIX_VEX_0F3850 and PREFIX_VEX_0F3851.
>         (vex_table): Add PREFIX_VEX_0F3850 and PREFIX_VEX_0F3851,
>         delete VEX_W_0F3850 and VEX_W_0F3851.
>         (vex_w_table): Add VEX_W_0F3850_P_0, VEX_W_0F3850_P_1, VEX_W_0F3850_P_2
>         VEX_W_0F3850_P_3, VEX_W_0F3851_P_0, VEX_W_0F3851_P_1, VEX_W_0F3851_P_2
>         and VEX_W_0F3851_P_3, delete VEX_W_0F3850 and VEX_W_0F3851.
>         * i386-gen.c: (cpu_flag_init): Add CPU_AVX_VNNI_INT8_FLAGS
>         and CPU_ANY_AVX_VNNI_INT8_FLAGS.
>         (cpu_flags): Add CpuAVX_VNNI_INT8.
>         * i386-opc.h (CpuAVX_VNNI_INT8): New.
>         * i386-opc.tbl: Add Intel AVX_VNNI_INT8 instructions.
>         * i386-init.h: Regenerated.
>         * i386-tbl.h: Likewise.
> ---
>  gas/NEWS                                      |   2 +
>  gas/config/tc-i386.c                          |   1 +
>  gas/doc/c-i386.texi                           |   3 +-
>  gas/testsuite/gas/i386/avx-vnni-int8-intel.d  |  71 ++
>  gas/testsuite/gas/i386/avx-vnni-int8.d        |  71 ++
>  gas/testsuite/gas/i386/avx-vnni-int8.s        | 127 +++
>  gas/testsuite/gas/i386/i386.exp               |   4 +
>  .../gas/i386/x86-64-avx-vnni-int8-intel.d     |  71 ++
>  gas/testsuite/gas/i386/x86-64-avx-vnni-int8.d |  71 ++
>  gas/testsuite/gas/i386/x86-64-avx-vnni-int8.s | 127 +++
>  opcodes/i386-dis.c                            |  23 +-
>  opcodes/i386-gen.c                            |   7 +-
>  opcodes/i386-init.h                           | 140 +--
>  opcodes/i386-opc.h                            |   5 +-
>  opcodes/i386-opc.tbl                          |  11 +
>  opcodes/i386-tbl.h                            | 882 ++++++++++--------
>  16 files changed, 1159 insertions(+), 457 deletions(-)
>  create mode 100644 gas/testsuite/gas/i386/avx-vnni-int8-intel.d
>  create mode 100644 gas/testsuite/gas/i386/avx-vnni-int8.d
>  create mode 100644 gas/testsuite/gas/i386/avx-vnni-int8.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-avx-vnni-int8-intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-avx-vnni-int8.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-avx-vnni-int8.s
>
> diff --git a/gas/NEWS b/gas/NEWS
> index 121aaa80c5..1547bfd469 100644
> --- a/gas/NEWS
> +++ b/gas/NEWS
> @@ -1,5 +1,7 @@
>  -*- text -*-
>
> +* Add support for Intel AVX-VNNI-INT8 instructions.
> +
>  * Add support for Intel AVX-IFMA instructions.
>
>  * Add support for Intel PREFETCHI instructions.
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index adbc22de8d..26d8efb47e 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -1097,6 +1097,7 @@ static const arch_entry cpu_arch[] =
>    SUBARCH (avx512_fp16, AVX512_FP16, ANY_AVX512_FP16, false),
>    SUBARCH (prefetchi, PREFETCHI, ANY_PREFETCHI, false),
>    SUBARCH (avx_ifma, AVX_IFMA, ANY_AVX_IFMA, false),
> +  SUBARCH (avx_vnni_int8, AVX_VNNI_INT8, ANY_AVX_VNNI_INT8, false),
>  };
>
>  #undef SUBARCH
> diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
> index 7bdbd26538..029f5f2e04 100644
> --- a/gas/doc/c-i386.texi
> +++ b/gas/doc/c-i386.texi
> @@ -196,6 +196,7 @@ accept various extension mnemonics. For example,
>  @code{avx512_fp16},
>  @code{prefetchi},
>  @code{avx_ifma},
> +@code{avx_vnni_int8},
>  @code{amx_int8},
>  @code{amx_bf16},
>  @code{amx_fp16},
> @@ -1489,7 +1490,7 @@ supported on the CPU specified. The choices for @var{cpu_type} are:
>  @item @samp{.avx512_bitalg} @tab @samp{.avx512_bf16} @tab @samp{.avx512_vp2intersect}
>  @item @samp{.tdx} @tab @samp{.avx_vnni} @tab @samp{.avx512_fp16}
>  @item @samp{.clwb} @tab @samp{.rdpid} @tab @samp{.ptwrite} @tab @samp{.ibt}
> -@item @samp{.prefetchi} @tab @samp{.avx_ifma}
> +@item @samp{.prefetchi} @tab @samp{.avx_ifma} @tab @samp{.avx_vnni_int8}
>  @item @samp{.wbnoinvd} @tab @samp{.pconfig} @tab @samp{.waitpkg} @tab @samp{.cldemote}
>  @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq}
>  @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk}
> diff --git a/gas/testsuite/gas/i386/avx-vnni-int8-intel.d b/gas/testsuite/gas/i386/avx-vnni-int8-intel.d
> new file mode 100644
> index 0000000000..1d7d162f20
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/avx-vnni-int8-intel.d
> @@ -0,0 +1,71 @@
> +#as:
> +#objdump: -dw -Mintel
> +#name: i386 AVX-VNNI-INT8 insns (Intel disassembly)
> +#source: avx-vnni-int8.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*c4 e2 57 50 f4\s+vpdpbssd ymm6,ymm5,ymm4
> +\s*[a-f0-9]+:\s*c4 e2 53 50 f4\s+vpdpbssd xmm6,xmm5,xmm4
> +\s*[a-f0-9]+:\s*c4 e2 57 50 b4 f4 00 00 00 10\s+vpdpbssd ymm6,ymm5,YMMWORD PTR \[esp\+esi\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 e2 57 50 31\s+vpdpbssd ymm6,ymm5,YMMWORD PTR \[ecx\]
> +\s*[a-f0-9]+:\s*c4 e2 57 50 b1 e0 0f 00 00\s+vpdpbssd ymm6,ymm5,YMMWORD PTR \[ecx\+0xfe0\]
> +\s*[a-f0-9]+:\s*c4 e2 57 50 b2 00 f0 ff ff\s+vpdpbssd ymm6,ymm5,YMMWORD PTR \[edx-0x1000\]
> +\s*[a-f0-9]+:\s*c4 e2 53 50 b4 f4 00 00 00 10\s+vpdpbssd xmm6,xmm5,XMMWORD PTR \[esp\+esi\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 e2 53 50 31\s+vpdpbssd xmm6,xmm5,XMMWORD PTR \[ecx\]
> +\s*[a-f0-9]+:\s*c4 e2 53 50 b1 f0 07 00 00\s+vpdpbssd xmm6,xmm5,XMMWORD PTR \[ecx\+0x7f0\]
> +\s*[a-f0-9]+:\s*c4 e2 53 50 b2 00 f8 ff ff\s+vpdpbssd xmm6,xmm5,XMMWORD PTR \[edx-0x800\]
> +\s*[a-f0-9]+:\s*c4 e2 57 51 f4\s+vpdpbssds ymm6,ymm5,ymm4
> +\s*[a-f0-9]+:\s*c4 e2 53 51 f4\s+vpdpbssds xmm6,xmm5,xmm4
> +\s*[a-f0-9]+:\s*c4 e2 57 51 b4 f4 00 00 00 10\s+vpdpbssds ymm6,ymm5,YMMWORD PTR \[esp\+esi\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 e2 57 51 31\s+vpdpbssds ymm6,ymm5,YMMWORD PTR \[ecx\]
> +\s*[a-f0-9]+:\s*c4 e2 57 51 b1 e0 0f 00 00\s+vpdpbssds ymm6,ymm5,YMMWORD PTR \[ecx\+0xfe0\]
> +\s*[a-f0-9]+:\s*c4 e2 57 51 b2 00 f0 ff ff\s+vpdpbssds ymm6,ymm5,YMMWORD PTR \[edx-0x1000\]
> +\s*[a-f0-9]+:\s*c4 e2 53 51 b4 f4 00 00 00 10\s+vpdpbssds xmm6,xmm5,XMMWORD PTR \[esp\+esi\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 e2 53 51 31\s+vpdpbssds xmm6,xmm5,XMMWORD PTR \[ecx\]
> +\s*[a-f0-9]+:\s*c4 e2 53 51 b1 f0 07 00 00\s+vpdpbssds xmm6,xmm5,XMMWORD PTR \[ecx\+0x7f0\]
> +\s*[a-f0-9]+:\s*c4 e2 53 51 b2 00 f8 ff ff\s+vpdpbssds xmm6,xmm5,XMMWORD PTR \[edx-0x800\]
> +\s*[a-f0-9]+:\s*c4 e2 56 50 f4\s+vpdpbsud ymm6,ymm5,ymm4
> +\s*[a-f0-9]+:\s*c4 e2 52 50 f4\s+vpdpbsud xmm6,xmm5,xmm4
> +\s*[a-f0-9]+:\s*c4 e2 56 50 b4 f4 00 00 00 10\s+vpdpbsud ymm6,ymm5,YMMWORD PTR \[esp\+esi\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 e2 56 50 31\s+vpdpbsud ymm6,ymm5,YMMWORD PTR \[ecx\]
> +\s*[a-f0-9]+:\s*c4 e2 56 50 b1 e0 0f 00 00\s+vpdpbsud ymm6,ymm5,YMMWORD PTR \[ecx\+0xfe0\]
> +\s*[a-f0-9]+:\s*c4 e2 56 50 b2 00 f0 ff ff\s+vpdpbsud ymm6,ymm5,YMMWORD PTR \[edx-0x1000\]
> +\s*[a-f0-9]+:\s*c4 e2 52 50 b4 f4 00 00 00 10\s+vpdpbsud xmm6,xmm5,XMMWORD PTR \[esp\+esi\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 e2 52 50 31\s+vpdpbsud xmm6,xmm5,XMMWORD PTR \[ecx\]
> +\s*[a-f0-9]+:\s*c4 e2 52 50 b1 f0 07 00 00\s+vpdpbsud xmm6,xmm5,XMMWORD PTR \[ecx\+0x7f0\]
> +\s*[a-f0-9]+:\s*c4 e2 52 50 b2 00 f8 ff ff\s+vpdpbsud xmm6,xmm5,XMMWORD PTR \[edx-0x800\]
> +\s*[a-f0-9]+:\s*c4 e2 56 51 f4\s+vpdpbsuds ymm6,ymm5,ymm4
> +\s*[a-f0-9]+:\s*c4 e2 52 51 f4\s+vpdpbsuds xmm6,xmm5,xmm4
> +\s*[a-f0-9]+:\s*c4 e2 56 51 b4 f4 00 00 00 10\s+vpdpbsuds ymm6,ymm5,YMMWORD PTR \[esp\+esi\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 e2 56 51 31\s+vpdpbsuds ymm6,ymm5,YMMWORD PTR \[ecx\]
> +\s*[a-f0-9]+:\s*c4 e2 56 51 b1 e0 0f 00 00\s+vpdpbsuds ymm6,ymm5,YMMWORD PTR \[ecx\+0xfe0\]
> +\s*[a-f0-9]+:\s*c4 e2 56 51 b2 00 f0 ff ff\s+vpdpbsuds ymm6,ymm5,YMMWORD PTR \[edx-0x1000\]
> +\s*[a-f0-9]+:\s*c4 e2 52 51 b4 f4 00 00 00 10\s+vpdpbsuds xmm6,xmm5,XMMWORD PTR \[esp\+esi\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 e2 52 51 31\s+vpdpbsuds xmm6,xmm5,XMMWORD PTR \[ecx\]
> +\s*[a-f0-9]+:\s*c4 e2 52 51 b1 f0 07 00 00\s+vpdpbsuds xmm6,xmm5,XMMWORD PTR \[ecx\+0x7f0\]
> +\s*[a-f0-9]+:\s*c4 e2 52 51 b2 00 f8 ff ff\s+vpdpbsuds xmm6,xmm5,XMMWORD PTR \[edx-0x800\]
> +\s*[a-f0-9]+:\s*c4 e2 54 50 f4\s+vpdpbuud ymm6,ymm5,ymm4
> +\s*[a-f0-9]+:\s*c4 e2 50 50 f4\s+vpdpbuud xmm6,xmm5,xmm4
> +\s*[a-f0-9]+:\s*c4 e2 54 50 b4 f4 00 00 00 10\s+vpdpbuud ymm6,ymm5,YMMWORD PTR \[esp\+esi\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 e2 54 50 31\s+vpdpbuud ymm6,ymm5,YMMWORD PTR \[ecx\]
> +\s*[a-f0-9]+:\s*c4 e2 54 50 b1 e0 0f 00 00\s+vpdpbuud ymm6,ymm5,YMMWORD PTR \[ecx\+0xfe0\]
> +\s*[a-f0-9]+:\s*c4 e2 54 50 b2 00 f0 ff ff\s+vpdpbuud ymm6,ymm5,YMMWORD PTR \[edx-0x1000\]
> +\s*[a-f0-9]+:\s*c4 e2 50 50 b4 f4 00 00 00 10\s+vpdpbuud xmm6,xmm5,XMMWORD PTR \[esp\+esi\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 e2 50 50 31\s+vpdpbuud xmm6,xmm5,XMMWORD PTR \[ecx\]
> +\s*[a-f0-9]+:\s*c4 e2 50 50 b1 f0 07 00 00\s+vpdpbuud xmm6,xmm5,XMMWORD PTR \[ecx\+0x7f0\]
> +\s*[a-f0-9]+:\s*c4 e2 50 50 b2 00 f8 ff ff\s+vpdpbuud xmm6,xmm5,XMMWORD PTR \[edx-0x800\]
> +\s*[a-f0-9]+:\s*c4 e2 54 51 f4\s+vpdpbuuds ymm6,ymm5,ymm4
> +\s*[a-f0-9]+:\s*c4 e2 50 51 f4\s+vpdpbuuds xmm6,xmm5,xmm4
> +\s*[a-f0-9]+:\s*c4 e2 54 51 b4 f4 00 00 00 10\s+vpdpbuuds ymm6,ymm5,YMMWORD PTR \[esp\+esi\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 e2 54 51 31\s+vpdpbuuds ymm6,ymm5,YMMWORD PTR \[ecx\]
> +\s*[a-f0-9]+:\s*c4 e2 54 51 b1 e0 0f 00 00\s+vpdpbuuds ymm6,ymm5,YMMWORD PTR \[ecx\+0xfe0\]
> +\s*[a-f0-9]+:\s*c4 e2 54 51 b2 00 f0 ff ff\s+vpdpbuuds ymm6,ymm5,YMMWORD PTR \[edx-0x1000\]
> +\s*[a-f0-9]+:\s*c4 e2 50 51 b4 f4 00 00 00 10\s+vpdpbuuds xmm6,xmm5,XMMWORD PTR \[esp\+esi\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 e2 50 51 31\s+vpdpbuuds xmm6,xmm5,XMMWORD PTR \[ecx\]
> +\s*[a-f0-9]+:\s*c4 e2 50 51 b1 f0 07 00 00\s+vpdpbuuds xmm6,xmm5,XMMWORD PTR \[ecx\+0x7f0\]
> +\s*[a-f0-9]+:\s*c4 e2 50 51 b2 00 f8 ff ff\s+vpdpbuuds xmm6,xmm5,XMMWORD PTR \[edx-0x800\]
> +#pass
> diff --git a/gas/testsuite/gas/i386/avx-vnni-int8.d b/gas/testsuite/gas/i386/avx-vnni-int8.d
> new file mode 100644
> index 0000000000..cd4499e59f
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/avx-vnni-int8.d
> @@ -0,0 +1,71 @@
> +#as:
> +#objdump: -dw
> +#name: i386 AVX-VNNI-INT8 insns
> +#source: avx-vnni-int8.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*c4 e2 57 50 f4\s+vpdpbssd %ymm4,%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 53 50 f4\s+vpdpbssd %xmm4,%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 57 50 b4 f4 00 00 00 10\s+vpdpbssd 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 57 50 31\s+vpdpbssd \(%ecx\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 57 50 b1 e0 0f 00 00\s+vpdpbssd 0xfe0\(%ecx\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 57 50 b2 00 f0 ff ff\s+vpdpbssd -0x1000\(%edx\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 53 50 b4 f4 00 00 00 10\s+vpdpbssd 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 53 50 31\s+vpdpbssd \(%ecx\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 53 50 b1 f0 07 00 00\s+vpdpbssd 0x7f0\(%ecx\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 53 50 b2 00 f8 ff ff\s+vpdpbssd -0x800\(%edx\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 57 51 f4\s+vpdpbssds %ymm4,%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 53 51 f4\s+vpdpbssds %xmm4,%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 57 51 b4 f4 00 00 00 10\s+vpdpbssds 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 57 51 31\s+vpdpbssds \(%ecx\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 57 51 b1 e0 0f 00 00\s+vpdpbssds 0xfe0\(%ecx\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 57 51 b2 00 f0 ff ff\s+vpdpbssds -0x1000\(%edx\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 53 51 b4 f4 00 00 00 10\s+vpdpbssds 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 53 51 31\s+vpdpbssds \(%ecx\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 53 51 b1 f0 07 00 00\s+vpdpbssds 0x7f0\(%ecx\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 53 51 b2 00 f8 ff ff\s+vpdpbssds -0x800\(%edx\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 56 50 f4\s+vpdpbsud %ymm4,%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 52 50 f4\s+vpdpbsud %xmm4,%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 56 50 b4 f4 00 00 00 10\s+vpdpbsud 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 56 50 31\s+vpdpbsud \(%ecx\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 56 50 b1 e0 0f 00 00\s+vpdpbsud 0xfe0\(%ecx\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 56 50 b2 00 f0 ff ff\s+vpdpbsud -0x1000\(%edx\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 52 50 b4 f4 00 00 00 10\s+vpdpbsud 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 52 50 31\s+vpdpbsud \(%ecx\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 52 50 b1 f0 07 00 00\s+vpdpbsud 0x7f0\(%ecx\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 52 50 b2 00 f8 ff ff\s+vpdpbsud -0x800\(%edx\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 56 51 f4\s+vpdpbsuds %ymm4,%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 52 51 f4\s+vpdpbsuds %xmm4,%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 56 51 b4 f4 00 00 00 10\s+vpdpbsuds 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 56 51 31\s+vpdpbsuds \(%ecx\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 56 51 b1 e0 0f 00 00\s+vpdpbsuds 0xfe0\(%ecx\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 56 51 b2 00 f0 ff ff\s+vpdpbsuds -0x1000\(%edx\),%ymm5,%ymm6
> +\s*[a-f0-9]+:\s*c4 e2 52 51 b4 f4 00 00 00 10\s+vpdpbsuds 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 52 51 31\s+vpdpbsuds \(%ecx\),%xmm5,%xmm6
> +\s*[a-f0-9]+:\s*c4 e2 52 51 b1 f0 07 00 00\s+vpdpbsuds 0x7f0\(%ecx\),%xmm5,%xmm6
+\s*[a-f0-9]+:\s*c4 e2 52 51 b2 00 f8 ff ff\s+vpdpbsuds -0x800\(%edx\),%x= mm5,%xmm6 > +\s*[a-f0-9]+:\s*c4 e2 54 50 f4\s+vpdpbuud %ymm4,%ymm5,%ymm6 > +\s*[a-f0-9]+:\s*c4 e2 50 50 f4\s+vpdpbuud %xmm4,%xmm5,%xmm6 > +\s*[a-f0-9]+:\s*c4 e2 54 50 b4 f4 00 00 00 10\s+vpdpbuud 0x10000000\(%es= p,%esi,8\),%ymm5,%ymm6 > +\s*[a-f0-9]+:\s*c4 e2 54 50 31\s+vpdpbuud \(%ecx\),%ymm5,%ymm6 > +\s*[a-f0-9]+:\s*c4 e2 54 50 b1 e0 0f 00 00\s+vpdpbuud 0xfe0\(%ecx\),%ymm= 5,%ymm6 > +\s*[a-f0-9]+:\s*c4 e2 54 50 b2 00 f0 ff ff\s+vpdpbuud -0x1000\(%edx\),%y= mm5,%ymm6 > +\s*[a-f0-9]+:\s*c4 e2 50 50 b4 f4 00 00 00 10\s+vpdpbuud 0x10000000\(%es= p,%esi,8\),%xmm5,%xmm6 > +\s*[a-f0-9]+:\s*c4 e2 50 50 31\s+vpdpbuud \(%ecx\),%xmm5,%xmm6 > +\s*[a-f0-9]+:\s*c4 e2 50 50 b1 f0 07 00 00\s+vpdpbuud 0x7f0\(%ecx\),%xmm= 5,%xmm6 > +\s*[a-f0-9]+:\s*c4 e2 50 50 b2 00 f8 ff ff\s+vpdpbuud -0x800\(%edx\),%xm= m5,%xmm6 > +\s*[a-f0-9]+:\s*c4 e2 54 51 f4\s+vpdpbuuds %ymm4,%ymm5,%ymm6 > +\s*[a-f0-9]+:\s*c4 e2 50 51 f4\s+vpdpbuuds %xmm4,%xmm5,%xmm6 > +\s*[a-f0-9]+:\s*c4 e2 54 51 b4 f4 00 00 00 10\s+vpdpbuuds 0x10000000\(%e= sp,%esi,8\),%ymm5,%ymm6 > +\s*[a-f0-9]+:\s*c4 e2 54 51 31\s+vpdpbuuds \(%ecx\),%ymm5,%ymm6 > +\s*[a-f0-9]+:\s*c4 e2 54 51 b1 e0 0f 00 00\s+vpdpbuuds 0xfe0\(%ecx\),%ym= m5,%ymm6 > +\s*[a-f0-9]+:\s*c4 e2 54 51 b2 00 f0 ff ff\s+vpdpbuuds -0x1000\(%edx\),%= ymm5,%ymm6 > +\s*[a-f0-9]+:\s*c4 e2 50 51 b4 f4 00 00 00 10\s+vpdpbuuds 0x10000000\(%e= sp,%esi,8\),%xmm5,%xmm6 > +\s*[a-f0-9]+:\s*c4 e2 50 51 31\s+vpdpbuuds \(%ecx\),%xmm5,%xmm6 > +\s*[a-f0-9]+:\s*c4 e2 50 51 b1 f0 07 00 00\s+vpdpbuuds 0x7f0\(%ecx\),%xm= m5,%xmm6 > +\s*[a-f0-9]+:\s*c4 e2 50 51 b2 00 f8 ff ff\s+vpdpbuuds -0x800\(%edx\),%x= mm5,%xmm6 > +#pass > diff --git a/gas/testsuite/gas/i386/avx-vnni-int8.s b/gas/testsuite/gas/i= 386/avx-vnni-int8.s > new file mode 100644 > index 0000000000..e3cfeb6680 > --- /dev/null > +++ b/gas/testsuite/gas/i386/avx-vnni-int8.s > @@ -0,0 +1,127 @@ > +# Check 32bit AVX-VNNI-INT8 instructions > + > + 
.allow_index_reg > + .text > +_start: > + vpdpbssd %ymm4, %ymm5, %ymm6 #AVX-VNNI-INT8 > + vpdpbssd %xmm4, %xmm5, %xmm6 #AVX-VNNI-INT8 > + vpdpbssd 0x10000000(%esp, %esi, 8), %ymm5, %ymm6 #AVX-VNN= I-INT8 > + vpdpbssd (%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 > + vpdpbssd 4064(%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 D= isp32(e00f0000) > + vpdpbssd -4096(%edx), %ymm5, %ymm6 #AVX-VNNI-INT8 D= isp32(00f0ffff) > + vpdpbssd 0x10000000(%esp, %esi, 8), %xmm5, %xmm6 #AVX-VNN= I-INT8 > + vpdpbssd (%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 > + vpdpbssd 2032(%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 D= isp32(f0070000) > + vpdpbssd -2048(%edx), %xmm5, %xmm6 #AVX-VNNI-INT8 D= isp32(00f8ffff) > + vpdpbssds %ymm4, %ymm5, %ymm6 #AVX-VNNI-INT8 > + vpdpbssds %xmm4, %xmm5, %xmm6 #AVX-VNNI-INT8 > + vpdpbssds 0x10000000(%esp, %esi, 8), %ymm5, %ymm6 #AVX-VNN= I-INT8 > + vpdpbssds (%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 > + vpdpbssds 4064(%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 D= isp32(e00f0000) > + vpdpbssds -4096(%edx), %ymm5, %ymm6 #AVX-VNNI-INT8 D= isp32(00f0ffff) > + vpdpbssds 0x10000000(%esp, %esi, 8), %xmm5, %xmm6 #AVX-VNN= I-INT8 > + vpdpbssds (%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 > + vpdpbssds 2032(%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 D= isp32(f0070000) > + vpdpbssds -2048(%edx), %xmm5, %xmm6 #AVX-VNNI-INT8 D= isp32(00f8ffff) > + vpdpbsud %ymm4, %ymm5, %ymm6 #AVX-VNNI-INT8 > + vpdpbsud %xmm4, %xmm5, %xmm6 #AVX-VNNI-INT8 > + vpdpbsud 0x10000000(%esp, %esi, 8), %ymm5, %ymm6 #AVX-VNN= I-INT8 > + vpdpbsud (%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 > + vpdpbsud 4064(%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 D= isp32(e00f0000) > + vpdpbsud -4096(%edx), %ymm5, %ymm6 #AVX-VNNI-INT8 D= isp32(00f0ffff) > + vpdpbsud 0x10000000(%esp, %esi, 8), %xmm5, %xmm6 #AVX-VNN= I-INT8 > + vpdpbsud (%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 > + vpdpbsud 2032(%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 D= isp32(f0070000) > + vpdpbsud -2048(%edx), %xmm5, %xmm6 #AVX-VNNI-INT8 D= isp32(00f8ffff) > + vpdpbsuds %ymm4, %ymm5, %ymm6 #AVX-VNNI-INT8 > + vpdpbsuds %xmm4, %xmm5, 
%xmm6 #AVX-VNNI-INT8 > + vpdpbsuds 0x10000000(%esp, %esi, 8), %ymm5, %ymm6 #AVX-VNN= I-INT8 > + vpdpbsuds (%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 > + vpdpbsuds 4064(%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 D= isp32(e00f0000) > + vpdpbsuds -4096(%edx), %ymm5, %ymm6 #AVX-VNNI-INT8 D= isp32(00f0ffff) > + vpdpbsuds 0x10000000(%esp, %esi, 8), %xmm5, %xmm6 #AVX-VNN= I-INT8 > + vpdpbsuds (%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 > + vpdpbsuds 2032(%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 D= isp32(f0070000) > + vpdpbsuds -2048(%edx), %xmm5, %xmm6 #AVX-VNNI-INT8 D= isp32(00f8ffff) > + vpdpbuud %ymm4, %ymm5, %ymm6 #AVX-VNNI-INT8 > + vpdpbuud %xmm4, %xmm5, %xmm6 #AVX-VNNI-INT8 > + vpdpbuud 0x10000000(%esp, %esi, 8), %ymm5, %ymm6 #AVX-VNN= I-INT8 > + vpdpbuud (%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 > + vpdpbuud 4064(%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 D= isp32(e00f0000) > + vpdpbuud -4096(%edx), %ymm5, %ymm6 #AVX-VNNI-INT8 D= isp32(00f0ffff) > + vpdpbuud 0x10000000(%esp, %esi, 8), %xmm5, %xmm6 #AVX-VNN= I-INT8 > + vpdpbuud (%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 > + vpdpbuud 2032(%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 D= isp32(f0070000) > + vpdpbuud -2048(%edx), %xmm5, %xmm6 #AVX-VNNI-INT8 D= isp32(00f8ffff) > + vpdpbuuds %ymm4, %ymm5, %ymm6 #AVX-VNNI-INT8 > + vpdpbuuds %xmm4, %xmm5, %xmm6 #AVX-VNNI-INT8 > + vpdpbuuds 0x10000000(%esp, %esi, 8), %ymm5, %ymm6 #AVX-VNN= I-INT8 > + vpdpbuuds (%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 > + vpdpbuuds 4064(%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 D= isp32(e00f0000) > + vpdpbuuds -4096(%edx), %ymm5, %ymm6 #AVX-VNNI-INT8 D= isp32(00f0ffff) > + vpdpbuuds 0x10000000(%esp, %esi, 8), %xmm5, %xmm6 #AVX-VNN= I-INT8 > + vpdpbuuds (%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 > + vpdpbuuds 2032(%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 D= isp32(f0070000) > + vpdpbuuds -2048(%edx), %xmm5, %xmm6 #AVX-VNNI-INT8 D= isp32(00f8ffff) > + > +.intel_syntax noprefix > + vpdpbssd ymm6, ymm5, ymm4 #AVX-VNNI-INT8 > + vpdpbssd xmm6, xmm5, xmm4 #AVX-VNNI-INT8 > + vpdpbssd ymm6, ymm5, YMMWORD PTR [esp+esi*8+0x10000000] = 
#AVX-VNNI-INT8 > + vpdpbssd ymm6, ymm5, YMMWORD PTR [ecx] #AVX-VNNI-INT8 > + vpdpbssd ymm6, ymm5, YMMWORD PTR [ecx+4064] #AVX-VNN= I-INT8 Disp32(e00f0000) > + vpdpbssd ymm6, ymm5, YMMWORD PTR [edx-4096] #AVX-VNN= I-INT8 Disp32(00f0ffff) > + vpdpbssd xmm6, xmm5, XMMWORD PTR [esp+esi*8+0x10000000] = #AVX-VNNI-INT8 > + vpdpbssd xmm6, xmm5, XMMWORD PTR [ecx] #AVX-VNNI-INT8 > + vpdpbssd xmm6, xmm5, XMMWORD PTR [ecx+2032] #AVX-VNN= I-INT8 Disp32(f0070000) > + vpdpbssd xmm6, xmm5, XMMWORD PTR [edx-2048] #AVX-VNN= I-INT8 Disp32(00f8ffff) > + vpdpbssds ymm6, ymm5, ymm4 #AVX-VNNI-INT8 > + vpdpbssds xmm6, xmm5, xmm4 #AVX-VNNI-INT8 > + vpdpbssds ymm6, ymm5, YMMWORD PTR [esp+esi*8+0x10000000] = #AVX-VNNI-INT8 > + vpdpbssds ymm6, ymm5, YMMWORD PTR [ecx] #AVX-VNNI-INT8 > + vpdpbssds ymm6, ymm5, YMMWORD PTR [ecx+4064] #AVX-VNN= I-INT8 Disp32(e00f0000) > + vpdpbssds ymm6, ymm5, YMMWORD PTR [edx-4096] #AVX-VNN= I-INT8 Disp32(00f0ffff) > + vpdpbssds xmm6, xmm5, XMMWORD PTR [esp+esi*8+0x10000000] = #AVX-VNNI-INT8 > + vpdpbssds xmm6, xmm5, XMMWORD PTR [ecx] #AVX-VNNI-INT8 > + vpdpbssds xmm6, xmm5, XMMWORD PTR [ecx+2032] #AVX-VNN= I-INT8 Disp32(f0070000) > + vpdpbssds xmm6, xmm5, XMMWORD PTR [edx-2048] #AVX-VNN= I-INT8 Disp32(00f8ffff) > + vpdpbsud ymm6, ymm5, ymm4 #AVX-VNNI-INT8 > + vpdpbsud xmm6, xmm5, xmm4 #AVX-VNNI-INT8 > + vpdpbsud ymm6, ymm5, YMMWORD PTR [esp+esi*8+0x10000000] = #AVX-VNNI-INT8 > + vpdpbsud ymm6, ymm5, YMMWORD PTR [ecx] #AVX-VNNI-INT8 > + vpdpbsud ymm6, ymm5, YMMWORD PTR [ecx+4064] #AVX-VNN= I-INT8 Disp32(e00f0000) > + vpdpbsud ymm6, ymm5, YMMWORD PTR [edx-4096] #AVX-VNN= I-INT8 Disp32(00f0ffff) > + vpdpbsud xmm6, xmm5, XMMWORD PTR [esp+esi*8+0x10000000] = #AVX-VNNI-INT8 > + vpdpbsud xmm6, xmm5, XMMWORD PTR [ecx] #AVX-VNNI-INT8 > + vpdpbsud xmm6, xmm5, XMMWORD PTR [ecx+2032] #AVX-VNN= I-INT8 Disp32(f0070000) > + vpdpbsud xmm6, xmm5, XMMWORD PTR [edx-2048] #AVX-VNN= I-INT8 Disp32(00f8ffff) > + vpdpbsuds ymm6, ymm5, ymm4 #AVX-VNNI-INT8 > + vpdpbsuds xmm6, xmm5, xmm4 
#AVX-VNNI-INT8 > + vpdpbsuds ymm6, ymm5, YMMWORD PTR [esp+esi*8+0x10000000] = #AVX-VNNI-INT8 > + vpdpbsuds ymm6, ymm5, YMMWORD PTR [ecx] #AVX-VNNI-INT8 > + vpdpbsuds ymm6, ymm5, YMMWORD PTR [ecx+4064] #AVX-VNN= I-INT8 Disp32(e00f0000) > + vpdpbsuds ymm6, ymm5, YMMWORD PTR [edx-4096] #AVX-VNN= I-INT8 Disp32(00f0ffff) > + vpdpbsuds xmm6, xmm5, XMMWORD PTR [esp+esi*8+0x10000000] = #AVX-VNNI-INT8 > + vpdpbsuds xmm6, xmm5, XMMWORD PTR [ecx] #AVX-VNNI-INT8 > + vpdpbsuds xmm6, xmm5, XMMWORD PTR [ecx+2032] #AVX-VNN= I-INT8 Disp32(f0070000) > + vpdpbsuds xmm6, xmm5, XMMWORD PTR [edx-2048] #AVX-VNN= I-INT8 Disp32(00f8ffff) > + vpdpbuud ymm6, ymm5, ymm4 #AVX-VNNI-INT8 > + vpdpbuud xmm6, xmm5, xmm4 #AVX-VNNI-INT8 > + vpdpbuud ymm6, ymm5, YMMWORD PTR [esp+esi*8+0x10000000] = #AVX-VNNI-INT8 > + vpdpbuud ymm6, ymm5, YMMWORD PTR [ecx] #AVX-VNNI-INT8 > + vpdpbuud ymm6, ymm5, YMMWORD PTR [ecx+4064] #AVX-VNN= I-INT8 Disp32(e00f0000) > + vpdpbuud ymm6, ymm5, YMMWORD PTR [edx-4096] #AVX-VNN= I-INT8 Disp32(00f0ffff) > + vpdpbuud xmm6, xmm5, XMMWORD PTR [esp+esi*8+0x10000000] = #AVX-VNNI-INT8 > + vpdpbuud xmm6, xmm5, XMMWORD PTR [ecx] #AVX-VNNI-INT8 > + vpdpbuud xmm6, xmm5, XMMWORD PTR [ecx+2032] #AVX-VNN= I-INT8 Disp32(f0070000) > + vpdpbuud xmm6, xmm5, XMMWORD PTR [edx-2048] #AVX-VNN= I-INT8 Disp32(00f8ffff) > + vpdpbuuds ymm6, ymm5, ymm4 #AVX-VNNI-INT8 > + vpdpbuuds xmm6, xmm5, xmm4 #AVX-VNNI-INT8 > + vpdpbuuds ymm6, ymm5, YMMWORD PTR [esp+esi*8+0x10000000] = #AVX-VNNI-INT8 > + vpdpbuuds ymm6, ymm5, YMMWORD PTR [ecx] #AVX-VNNI-INT8 > + vpdpbuuds ymm6, ymm5, YMMWORD PTR [ecx+4064] #AVX-VNN= I-INT8 Disp32(e00f0000) > + vpdpbuuds ymm6, ymm5, YMMWORD PTR [edx-4096] #AVX-VNN= I-INT8 Disp32(00f0ffff) > + vpdpbuuds xmm6, xmm5, XMMWORD PTR [esp+esi*8+0x10000000] = #AVX-VNNI-INT8 > + vpdpbuuds xmm6, xmm5, XMMWORD PTR [ecx] #AVX-VNNI-INT8 > + vpdpbuuds xmm6, xmm5, XMMWORD PTR [ecx+2032] #AVX-VNN= I-INT8 Disp32(f0070000) > + vpdpbuuds xmm6, xmm5, XMMWORD PTR [edx-2048] #AVX-VNN= I-INT8 
Disp32(00f8ffff) > diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i38= 6.exp > index 96ab1a02d1..b75fe85cb3 100644 > --- a/gas/testsuite/gas/i386/i386.exp > +++ b/gas/testsuite/gas/i386/i386.exp > @@ -477,6 +477,8 @@ if [gas_32_check] then { > run_dump_test "avx-ifma" > run_dump_test "avx-ifma-intel" > run_list_test "avx-ifma-inval" > + run_dump_test "avx-vnni-int8" > + run_dump_test "avx-vnni-int8-intel" > run_list_test "sg" > run_dump_test "clzero" > run_dump_test "invlpgb" > @@ -1148,6 +1150,8 @@ if [gas_64_check] then { > run_dump_test "x86-64-avx-ifma" > run_dump_test "x86-64-avx-ifma-intel" > run_list_test "x86-64-avx-ifma-inval" > + run_dump_test "x86-64-avx-vnni-int8" > + run_dump_test "x86-64-avx-vnni-int8-intel" > run_dump_test "x86-64-clzero" > run_dump_test "x86-64-mwaitx-bdver4" > run_list_test "x86-64-mwaitx-reg" > diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni-int8-intel.d b/gas/te= stsuite/gas/i386/x86-64-avx-vnni-int8-intel.d > new file mode 100644 > index 0000000000..61c01124ef > --- /dev/null > +++ b/gas/testsuite/gas/i386/x86-64-avx-vnni-int8-intel.d > @@ -0,0 +1,71 @@ > +#as: > +#objdump: -dw -Mintel > +#name: x86_64 AVX-VNNI-INT8 insns (Intel disassembly) > +#source: x86-64-avx-vnni-int8.s > + > +.*: +file format .* > + > +Disassembly of section \.text: > + > +0+ <_start>: > +\s*[a-f0-9]+:\s*c4 42 37 50 d0\s+vpdpbssd ymm10,ymm9,ymm8 > +\s*[a-f0-9]+:\s*c4 42 33 50 d0\s+vpdpbssd xmm10,xmm9,xmm8 > +\s*[a-f0-9]+:\s*c4 22 37 50 94 f5 00 00 00 10\s+vpdpbssd ymm10,ymm9,YMMW= ORD PTR \[rbp\+r14\*8\+0x10000000\] > +\s*[a-f0-9]+:\s*c4 42 37 50 11\s+vpdpbssd ymm10,ymm9,YMMWORD PTR \[r9\] > +\s*[a-f0-9]+:\s*c4 62 37 50 91 e0 0f 00 00\s+vpdpbssd ymm10,ymm9,YMMWORD= PTR \[rcx\+0xfe0\] > +\s*[a-f0-9]+:\s*c4 62 37 50 92 00 f0 ff ff\s+vpdpbssd ymm10,ymm9,YMMWORD= PTR \[rdx-0x1000\] > +\s*[a-f0-9]+:\s*c4 22 33 50 94 f5 00 00 00 10\s+vpdpbssd xmm10,xmm9,XMMW= ORD PTR \[rbp\+r14\*8\+0x10000000\] > +\s*[a-f0-9]+:\s*c4 42 33 50 
11\s+vpdpbssd xmm10,xmm9,XMMWORD PTR \[r9\] > +\s*[a-f0-9]+:\s*c4 62 33 50 91 f0 07 00 00\s+vpdpbssd xmm10,xmm9,XMMWORD= PTR \[rcx\+0x7f0\] > +\s*[a-f0-9]+:\s*c4 62 33 50 92 00 f8 ff ff\s+vpdpbssd xmm10,xmm9,XMMWORD= PTR \[rdx-0x800\] > +\s*[a-f0-9]+:\s*c4 42 37 51 d0\s+vpdpbssds ymm10,ymm9,ymm8 > +\s*[a-f0-9]+:\s*c4 42 33 51 d0\s+vpdpbssds xmm10,xmm9,xmm8 > +\s*[a-f0-9]+:\s*c4 22 37 51 94 f5 00 00 00 10\s+vpdpbssds ymm10,ymm9,YMM= WORD PTR \[rbp\+r14\*8\+0x10000000\] > +\s*[a-f0-9]+:\s*c4 42 37 51 11\s+vpdpbssds ymm10,ymm9,YMMWORD PTR \[r9\] > +\s*[a-f0-9]+:\s*c4 62 37 51 91 e0 0f 00 00\s+vpdpbssds ymm10,ymm9,YMMWOR= D PTR \[rcx\+0xfe0\] > +\s*[a-f0-9]+:\s*c4 62 37 51 92 00 f0 ff ff\s+vpdpbssds ymm10,ymm9,YMMWOR= D PTR \[rdx-0x1000\] > +\s*[a-f0-9]+:\s*c4 22 33 51 94 f5 00 00 00 10\s+vpdpbssds xmm10,xmm9,XMM= WORD PTR \[rbp\+r14\*8\+0x10000000\] > +\s*[a-f0-9]+:\s*c4 42 33 51 11\s+vpdpbssds xmm10,xmm9,XMMWORD PTR \[r9\] > +\s*[a-f0-9]+:\s*c4 62 33 51 91 f0 07 00 00\s+vpdpbssds xmm10,xmm9,XMMWOR= D PTR \[rcx\+0x7f0\] > +\s*[a-f0-9]+:\s*c4 62 33 51 92 00 f8 ff ff\s+vpdpbssds xmm10,xmm9,XMMWOR= D PTR \[rdx-0x800\] > +\s*[a-f0-9]+:\s*c4 42 36 50 d0\s+vpdpbsud ymm10,ymm9,ymm8 > +\s*[a-f0-9]+:\s*c4 42 32 50 d0\s+vpdpbsud xmm10,xmm9,xmm8 > +\s*[a-f0-9]+:\s*c4 22 36 50 94 f5 00 00 00 10\s+vpdpbsud ymm10,ymm9,YMMW= ORD PTR \[rbp\+r14\*8\+0x10000000\] > +\s*[a-f0-9]+:\s*c4 42 36 50 11\s+vpdpbsud ymm10,ymm9,YMMWORD PTR \[r9\] > +\s*[a-f0-9]+:\s*c4 62 36 50 91 e0 0f 00 00\s+vpdpbsud ymm10,ymm9,YMMWORD= PTR \[rcx\+0xfe0\] > +\s*[a-f0-9]+:\s*c4 62 36 50 92 00 f0 ff ff\s+vpdpbsud ymm10,ymm9,YMMWORD= PTR \[rdx-0x1000\] > +\s*[a-f0-9]+:\s*c4 22 32 50 94 f5 00 00 00 10\s+vpdpbsud xmm10,xmm9,XMMW= ORD PTR \[rbp\+r14\*8\+0x10000000\] > +\s*[a-f0-9]+:\s*c4 42 32 50 11\s+vpdpbsud xmm10,xmm9,XMMWORD PTR \[r9\] > +\s*[a-f0-9]+:\s*c4 62 32 50 91 f0 07 00 00\s+vpdpbsud xmm10,xmm9,XMMWORD= PTR \[rcx\+0x7f0\] > +\s*[a-f0-9]+:\s*c4 62 32 50 92 00 f8 ff ff\s+vpdpbsud xmm10,xmm9,XMMWORD= PTR 
\[rdx-0x800\] > +\s*[a-f0-9]+:\s*c4 42 36 51 d0\s+vpdpbsuds ymm10,ymm9,ymm8 > +\s*[a-f0-9]+:\s*c4 42 32 51 d0\s+vpdpbsuds xmm10,xmm9,xmm8 > +\s*[a-f0-9]+:\s*c4 22 36 51 94 f5 00 00 00 10\s+vpdpbsuds ymm10,ymm9,YMM= WORD PTR \[rbp\+r14\*8\+0x10000000\] > +\s*[a-f0-9]+:\s*c4 42 36 51 11\s+vpdpbsuds ymm10,ymm9,YMMWORD PTR \[r9\] > +\s*[a-f0-9]+:\s*c4 62 36 51 91 e0 0f 00 00\s+vpdpbsuds ymm10,ymm9,YMMWOR= D PTR \[rcx\+0xfe0\] > +\s*[a-f0-9]+:\s*c4 62 36 51 92 00 f0 ff ff\s+vpdpbsuds ymm10,ymm9,YMMWOR= D PTR \[rdx-0x1000\] > +\s*[a-f0-9]+:\s*c4 22 32 51 94 f5 00 00 00 10\s+vpdpbsuds xmm10,xmm9,XMM= WORD PTR \[rbp\+r14\*8\+0x10000000\] > +\s*[a-f0-9]+:\s*c4 42 32 51 11\s+vpdpbsuds xmm10,xmm9,XMMWORD PTR \[r9\] > +\s*[a-f0-9]+:\s*c4 62 32 51 91 f0 07 00 00\s+vpdpbsuds xmm10,xmm9,XMMWOR= D PTR \[rcx\+0x7f0\] > +\s*[a-f0-9]+:\s*c4 62 32 51 92 00 f8 ff ff\s+vpdpbsuds xmm10,xmm9,XMMWOR= D PTR \[rdx-0x800\] > +\s*[a-f0-9]+:\s*c4 42 34 50 d0\s+vpdpbuud ymm10,ymm9,ymm8 > +\s*[a-f0-9]+:\s*c4 42 30 50 d0\s+vpdpbuud xmm10,xmm9,xmm8 > +\s*[a-f0-9]+:\s*c4 22 34 50 94 f5 00 00 00 10\s+vpdpbuud ymm10,ymm9,YMMW= ORD PTR \[rbp\+r14\*8\+0x10000000\] > +\s*[a-f0-9]+:\s*c4 42 34 50 11\s+vpdpbuud ymm10,ymm9,YMMWORD PTR \[r9\] > +\s*[a-f0-9]+:\s*c4 62 34 50 91 e0 0f 00 00\s+vpdpbuud ymm10,ymm9,YMMWORD= PTR \[rcx\+0xfe0\] > +\s*[a-f0-9]+:\s*c4 62 34 50 92 00 f0 ff ff\s+vpdpbuud ymm10,ymm9,YMMWORD= PTR \[rdx-0x1000\] > +\s*[a-f0-9]+:\s*c4 22 30 50 94 f5 00 00 00 10\s+vpdpbuud xmm10,xmm9,XMMW= ORD PTR \[rbp\+r14\*8\+0x10000000\] > +\s*[a-f0-9]+:\s*c4 42 30 50 11\s+vpdpbuud xmm10,xmm9,XMMWORD PTR \[r9\] > +\s*[a-f0-9]+:\s*c4 62 30 50 91 f0 07 00 00\s+vpdpbuud xmm10,xmm9,XMMWORD= PTR \[rcx\+0x7f0\] > +\s*[a-f0-9]+:\s*c4 62 30 50 92 00 f8 ff ff\s+vpdpbuud xmm10,xmm9,XMMWORD= PTR \[rdx-0x800\] > +\s*[a-f0-9]+:\s*c4 42 34 51 d0\s+vpdpbuuds ymm10,ymm9,ymm8 > +\s*[a-f0-9]+:\s*c4 42 30 51 d0\s+vpdpbuuds xmm10,xmm9,xmm8 > +\s*[a-f0-9]+:\s*c4 22 34 51 94 f5 00 00 00 10\s+vpdpbuuds ymm10,ymm9,YMM= WORD PTR 
\[rbp\+r14\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 42 34 51 11\s+vpdpbuuds ymm10,ymm9,YMMWORD PTR \[r9\]
> +\s*[a-f0-9]+:\s*c4 62 34 51 91 e0 0f 00 00\s+vpdpbuuds ymm10,ymm9,YMMWORD PTR \[rcx\+0xfe0\]
> +\s*[a-f0-9]+:\s*c4 62 34 51 92 00 f0 ff ff\s+vpdpbuuds ymm10,ymm9,YMMWORD PTR \[rdx-0x1000\]
> +\s*[a-f0-9]+:\s*c4 22 30 51 94 f5 00 00 00 10\s+vpdpbuuds xmm10,xmm9,XMMWORD PTR \[rbp\+r14\*8\+0x10000000\]
> +\s*[a-f0-9]+:\s*c4 42 30 51 11\s+vpdpbuuds xmm10,xmm9,XMMWORD PTR \[r9\]
> +\s*[a-f0-9]+:\s*c4 62 30 51 91 f0 07 00 00\s+vpdpbuuds xmm10,xmm9,XMMWORD PTR \[rcx\+0x7f0\]
> +\s*[a-f0-9]+:\s*c4 62 30 51 92 00 f8 ff ff\s+vpdpbuuds xmm10,xmm9,XMMWORD PTR \[rdx-0x800\]
> +#pass
> diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni-int8.d b/gas/testsuite/gas/i386/x86-64-avx-vnni-int8.d
> new file mode 100644
> index 0000000000..90faed581b
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-avx-vnni-int8.d
> @@ -0,0 +1,71 @@
> +#as:
> +#objdump: -dw
> +#name: x86_64 AVX-VNNI-INT8 insns
> +#source: x86-64-avx-vnni-int8.s
> +
> +.*: +file format .*
> +
> +Disassembly of section \.text:
> +
> +0+ <_start>:
> +\s*[a-f0-9]+:\s*c4 42 37 50 d0\s+vpdpbssd %ymm8,%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 42 33 50 d0\s+vpdpbssd %xmm8,%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 22 37 50 94 f5 00 00 00 10\s+vpdpbssd 0x10000000\(%rbp,%r14,8\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 42 37 50 11\s+vpdpbssd \(%r9\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 62 37 50 91 e0 0f 00 00\s+vpdpbssd 0xfe0\(%rcx\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 62 37 50 92 00 f0 ff ff\s+vpdpbssd -0x1000\(%rdx\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 22 33 50 94 f5 00 00 00 10\s+vpdpbssd 0x10000000\(%rbp,%r14,8\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 42 33 50 11\s+vpdpbssd \(%r9\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 62 33 50 91 f0 07 00 00\s+vpdpbssd 0x7f0\(%rcx\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 62 33 50 92 00 f8 ff ff\s+vpdpbssd -0x800\(%rdx\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 42 37 51 d0\s+vpdpbssds
%ymm8,%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 42 33 51 d0\s+vpdpbssds %xmm8,%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 22 37 51 94 f5 00 00 00 10\s+vpdpbssds 0x10000000\(%rbp,%r14,8\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 42 37 51 11\s+vpdpbssds \(%r9\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 62 37 51 91 e0 0f 00 00\s+vpdpbssds 0xfe0\(%rcx\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 62 37 51 92 00 f0 ff ff\s+vpdpbssds -0x1000\(%rdx\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 22 33 51 94 f5 00 00 00 10\s+vpdpbssds 0x10000000\(%rbp,%r14,8\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 42 33 51 11\s+vpdpbssds \(%r9\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 62 33 51 91 f0 07 00 00\s+vpdpbssds 0x7f0\(%rcx\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 62 33 51 92 00 f8 ff ff\s+vpdpbssds -0x800\(%rdx\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 42 36 50 d0\s+vpdpbsud %ymm8,%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 42 32 50 d0\s+vpdpbsud %xmm8,%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 22 36 50 94 f5 00 00 00 10\s+vpdpbsud 0x10000000\(%rbp,%r14,8\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 42 36 50 11\s+vpdpbsud \(%r9\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 62 36 50 91 e0 0f 00 00\s+vpdpbsud 0xfe0\(%rcx\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 62 36 50 92 00 f0 ff ff\s+vpdpbsud -0x1000\(%rdx\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 22 32 50 94 f5 00 00 00 10\s+vpdpbsud 0x10000000\(%rbp,%r14,8\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 42 32 50 11\s+vpdpbsud \(%r9\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 62 32 50 91 f0 07 00 00\s+vpdpbsud 0x7f0\(%rcx\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 62 32 50 92 00 f8 ff ff\s+vpdpbsud -0x800\(%rdx\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 42 36 51 d0\s+vpdpbsuds %ymm8,%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 42 32 51 d0\s+vpdpbsuds %xmm8,%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 22 36 51 94 f5 00 00 00 10\s+vpdpbsuds 0x10000000\(%rbp,%r14,8\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 42 36 51 11\s+vpdpbsuds \(%r9\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 62 36 51 91 e0 0f 00 00\s+vpdpbsuds 0xfe0\(%rcx\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 62 36 51
92 00 f0 ff ff\s+vpdpbsuds -0x1000\(%rdx\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 22 32 51 94 f5 00 00 00 10\s+vpdpbsuds 0x10000000\(%rbp,%r14,8\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 42 32 51 11\s+vpdpbsuds \(%r9\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 62 32 51 91 f0 07 00 00\s+vpdpbsuds 0x7f0\(%rcx\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 62 32 51 92 00 f8 ff ff\s+vpdpbsuds -0x800\(%rdx\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 42 34 50 d0\s+vpdpbuud %ymm8,%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 42 30 50 d0\s+vpdpbuud %xmm8,%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 22 34 50 94 f5 00 00 00 10\s+vpdpbuud 0x10000000\(%rbp,%r14,8\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 42 34 50 11\s+vpdpbuud \(%r9\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 62 34 50 91 e0 0f 00 00\s+vpdpbuud 0xfe0\(%rcx\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 62 34 50 92 00 f0 ff ff\s+vpdpbuud -0x1000\(%rdx\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 22 30 50 94 f5 00 00 00 10\s+vpdpbuud 0x10000000\(%rbp,%r14,8\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 42 30 50 11\s+vpdpbuud \(%r9\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 62 30 50 91 f0 07 00 00\s+vpdpbuud 0x7f0\(%rcx\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 62 30 50 92 00 f8 ff ff\s+vpdpbuud -0x800\(%rdx\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 42 34 51 d0\s+vpdpbuuds %ymm8,%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 42 30 51 d0\s+vpdpbuuds %xmm8,%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 22 34 51 94 f5 00 00 00 10\s+vpdpbuuds 0x10000000\(%rbp,%r14,8\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 42 34 51 11\s+vpdpbuuds \(%r9\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 62 34 51 91 e0 0f 00 00\s+vpdpbuuds 0xfe0\(%rcx\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 62 34 51 92 00 f0 ff ff\s+vpdpbuuds -0x1000\(%rdx\),%ymm9,%ymm10
> +\s*[a-f0-9]+:\s*c4 22 30 51 94 f5 00 00 00 10\s+vpdpbuuds 0x10000000\(%rbp,%r14,8\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 42 30 51 11\s+vpdpbuuds \(%r9\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 62 30 51 91 f0 07 00 00\s+vpdpbuuds 0x7f0\(%rcx\),%xmm9,%xmm10
> +\s*[a-f0-9]+:\s*c4 62 30 51 92 00 f8 ff ff\s+vpdpbuuds
-0x800\(%rdx\),%xmm9,%xmm10
> +#pass
> diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni-int8.s b/gas/testsuite/gas/i386/x86-64-avx-vnni-int8.s
> new file mode 100644
> index 0000000000..bc9145b26f
> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-avx-vnni-int8.s
> @@ -0,0 +1,127 @@
> +# Check 64bit AVX-VNNI-INT8 instructions
> +
> + .allow_index_reg
> + .text
> +_start:
> + vpdpbssd %ymm8, %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbssd %xmm8, %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbssd 0x10000000(%rbp, %r14, 8), %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbssd (%r9), %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbssd 4064(%rcx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(e00f0000)
> + vpdpbssd -4096(%rdx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(00f0ffff)
> + vpdpbssd 0x10000000(%rbp, %r14, 8), %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbssd (%r9), %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbssd 2032(%rcx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(f0070000)
> + vpdpbssd -2048(%rdx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(00f8ffff)
> + vpdpbssds %ymm8, %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbssds %xmm8, %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbssds 0x10000000(%rbp, %r14, 8), %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbssds (%r9), %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbssds 4064(%rcx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(e00f0000)
> + vpdpbssds -4096(%rdx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(00f0ffff)
> + vpdpbssds 0x10000000(%rbp, %r14, 8), %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbssds (%r9), %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbssds 2032(%rcx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(f0070000)
> + vpdpbssds -2048(%rdx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(00f8ffff)
> + vpdpbsud %ymm8, %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbsud %xmm8, %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbsud 0x10000000(%rbp, %r14, 8), %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbsud (%r9), %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbsud 4064(%rcx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(e00f0000)
> + vpdpbsud -4096(%rdx), %ymm9, %ymm10
#AVX-VNNI-INT8 Disp32(00f0ffff)
> + vpdpbsud 0x10000000(%rbp, %r14, 8), %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbsud (%r9), %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbsud 2032(%rcx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(f0070000)
> + vpdpbsud -2048(%rdx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(00f8ffff)
> + vpdpbsuds %ymm8, %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbsuds %xmm8, %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbsuds 0x10000000(%rbp, %r14, 8), %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbsuds (%r9), %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbsuds 4064(%rcx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(e00f0000)
> + vpdpbsuds -4096(%rdx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(00f0ffff)
> + vpdpbsuds 0x10000000(%rbp, %r14, 8), %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbsuds (%r9), %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbsuds 2032(%rcx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(f0070000)
> + vpdpbsuds -2048(%rdx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(00f8ffff)
> + vpdpbuud %ymm8, %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbuud %xmm8, %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbuud 0x10000000(%rbp, %r14, 8), %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbuud (%r9), %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbuud 4064(%rcx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(e00f0000)
> + vpdpbuud -4096(%rdx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(00f0ffff)
> + vpdpbuud 0x10000000(%rbp, %r14, 8), %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbuud (%r9), %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbuud 2032(%rcx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(f0070000)
> + vpdpbuud -2048(%rdx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(00f8ffff)
> + vpdpbuuds %ymm8, %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbuuds %xmm8, %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbuuds 0x10000000(%rbp, %r14, 8), %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbuuds (%r9), %ymm9, %ymm10 #AVX-VNNI-INT8
> + vpdpbuuds 4064(%rcx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(e00f0000)
> + vpdpbuuds -4096(%rdx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(00f0ffff)
> + vpdpbuuds 0x10000000(%rbp, %r14, 8), %xmm9,
%xmm10 #AVX-VNNI-INT8
> + vpdpbuuds (%r9), %xmm9, %xmm10 #AVX-VNNI-INT8
> + vpdpbuuds 2032(%rcx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(f0070000)
> + vpdpbuuds -2048(%rdx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(00f8ffff)
> +
> +.intel_syntax noprefix
> + vpdpbssd ymm10, ymm9, ymm8 #AVX-VNNI-INT8
> + vpdpbssd xmm10, xmm9, xmm8 #AVX-VNNI-INT8
> + vpdpbssd ymm10, ymm9, YMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8
> + vpdpbssd ymm10, ymm9, YMMWORD PTR [r9] #AVX-VNNI-INT8
> + vpdpbssd ymm10, ymm9, YMMWORD PTR [rcx+4064] #AVX-VNNI-INT8 Disp32(e00f0000)
> + vpdpbssd ymm10, ymm9, YMMWORD PTR [rdx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff)
> + vpdpbssd xmm10, xmm9, XMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8
> + vpdpbssd xmm10, xmm9, XMMWORD PTR [r9] #AVX-VNNI-INT8
> + vpdpbssd xmm10, xmm9, XMMWORD PTR [rcx+2032] #AVX-VNNI-INT8 Disp32(f0070000)
> + vpdpbssd xmm10, xmm9, XMMWORD PTR [rdx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff)
> + vpdpbssds ymm10, ymm9, ymm8 #AVX-VNNI-INT8
> + vpdpbssds xmm10, xmm9, xmm8 #AVX-VNNI-INT8
> + vpdpbssds ymm10, ymm9, YMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8
> + vpdpbssds ymm10, ymm9, YMMWORD PTR [r9] #AVX-VNNI-INT8
> + vpdpbssds ymm10, ymm9, YMMWORD PTR [rcx+4064] #AVX-VNNI-INT8 Disp32(e00f0000)
> + vpdpbssds ymm10, ymm9, YMMWORD PTR [rdx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff)
> + vpdpbssds xmm10, xmm9, XMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8
> + vpdpbssds xmm10, xmm9, XMMWORD PTR [r9] #AVX-VNNI-INT8
> + vpdpbssds xmm10, xmm9, XMMWORD PTR [rcx+2032] #AVX-VNNI-INT8 Disp32(f0070000)
> + vpdpbssds xmm10, xmm9, XMMWORD PTR [rdx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff)
> + vpdpbsud ymm10, ymm9, ymm8 #AVX-VNNI-INT8
> + vpdpbsud xmm10, xmm9, xmm8 #AVX-VNNI-INT8
> + vpdpbsud ymm10, ymm9, YMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8
> + vpdpbsud ymm10, ymm9, YMMWORD PTR [r9] #AVX-VNNI-INT8
> + vpdpbsud ymm10, ymm9, YMMWORD PTR [rcx+4064] #AVX-VNNI-INT8 Disp32(e00f0000)
> + vpdpbsud ymm10, ymm9, YMMWORD
PTR [rdx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff)
> + vpdpbsud xmm10, xmm9, XMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8
> + vpdpbsud xmm10, xmm9, XMMWORD PTR [r9] #AVX-VNNI-INT8
> + vpdpbsud xmm10, xmm9, XMMWORD PTR [rcx+2032] #AVX-VNNI-INT8 Disp32(f0070000)
> + vpdpbsud xmm10, xmm9, XMMWORD PTR [rdx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff)
> + vpdpbsuds ymm10, ymm9, ymm8 #AVX-VNNI-INT8
> + vpdpbsuds xmm10, xmm9, xmm8 #AVX-VNNI-INT8
> + vpdpbsuds ymm10, ymm9, YMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8
> + vpdpbsuds ymm10, ymm9, YMMWORD PTR [r9] #AVX-VNNI-INT8
> + vpdpbsuds ymm10, ymm9, YMMWORD PTR [rcx+4064] #AVX-VNNI-INT8 Disp32(e00f0000)
> + vpdpbsuds ymm10, ymm9, YMMWORD PTR [rdx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff)
> + vpdpbsuds xmm10, xmm9, XMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8
> + vpdpbsuds xmm10, xmm9, XMMWORD PTR [r9] #AVX-VNNI-INT8
> + vpdpbsuds xmm10, xmm9, XMMWORD PTR [rcx+2032] #AVX-VNNI-INT8 Disp32(f0070000)
> + vpdpbsuds xmm10, xmm9, XMMWORD PTR [rdx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff)
> + vpdpbuud ymm10, ymm9, ymm8 #AVX-VNNI-INT8
> + vpdpbuud xmm10, xmm9, xmm8 #AVX-VNNI-INT8
> + vpdpbuud ymm10, ymm9, YMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8
> + vpdpbuud ymm10, ymm9, YMMWORD PTR [r9] #AVX-VNNI-INT8
> + vpdpbuud ymm10, ymm9, YMMWORD PTR [rcx+4064] #AVX-VNNI-INT8 Disp32(e00f0000)
> + vpdpbuud ymm10, ymm9, YMMWORD PTR [rdx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff)
> + vpdpbuud xmm10, xmm9, XMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8
> + vpdpbuud xmm10, xmm9, XMMWORD PTR [r9] #AVX-VNNI-INT8
> + vpdpbuud xmm10, xmm9, XMMWORD PTR [rcx+2032] #AVX-VNNI-INT8 Disp32(f0070000)
> + vpdpbuud xmm10, xmm9, XMMWORD PTR [rdx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff)
> + vpdpbuuds ymm10, ymm9, ymm8 #AVX-VNNI-INT8
> + vpdpbuuds xmm10, xmm9, xmm8 #AVX-VNNI-INT8
> + vpdpbuuds ymm10, ymm9, YMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8
> + vpdpbuuds ymm10, ymm9, YMMWORD PTR [r9] #AVX-VNNI-INT8
> + vpdpbuuds
ymm10, ymm9, YMMWORD PTR [rcx+4064] #AVX-VNNI-INT8 Disp32(e00f0000)
> + vpdpbuuds ymm10, ymm9, YMMWORD PTR [rdx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff)
> + vpdpbuuds xmm10, xmm9, XMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8
> + vpdpbuuds xmm10, xmm9, XMMWORD PTR [r9] #AVX-VNNI-INT8
> + vpdpbuuds xmm10, xmm9, XMMWORD PTR [rcx+2032] #AVX-VNNI-INT8 Disp32(f0070000)
> + vpdpbuuds xmm10, xmm9, XMMWORD PTR [rdx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff)
> diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
> index ba232939d7..436d2e7a08 100644
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -1132,6 +1132,8 @@ enum
> PREFIX_VEX_0FF0,
> PREFIX_VEX_0F3849_X86_64,
> PREFIX_VEX_0F384B_X86_64,
> + PREFIX_VEX_0F3850_W_0,
> + PREFIX_VEX_0F3851_W_0,
> PREFIX_VEX_0F385C_X86_64,
> PREFIX_VEX_0F385E_X86_64,
> PREFIX_VEX_0F38F5_L_0,
> @@ -4014,6 +4016,21 @@ static const struct dis386 prefix_table[][4] = {
> { VEX_W_TABLE (VEX_W_0F384B_X86_64_P_3) },
> },
>
> + /* PREFIX_VEX_0F3850_W_0 */
> + {
> + { "vpdpbuud", { XM, Vex, EXx }, 0 },
> + { "vpdpbsud", { XM, Vex, EXx }, 0 },
> + { "%XVvpdpbusd", { XM, Vex, EXx }, 0 },
> + { "vpdpbssd", { XM, Vex, EXx }, 0 },
> + },
> +
> + /* PREFIX_VEX_0F3851_W_0 */
> + {
> + { "vpdpbuuds", { XM, Vex, EXx }, 0 },
> + { "vpdpbsuds", { XM, Vex, EXx }, 0 },
> + { "%XVvpdpbusds", { XM, Vex, EXx }, 0 },
> + { "vpdpbssds", { XM, Vex, EXx }, 0 },
> + },
> /* PREFIX_VEX_0F385C_X86_64 */
> {
> { Bad_Opcode },
> @@ -7575,11 +7592,11 @@ static const struct dis386 vex_w_table[][2] = {
> },
> {
> /* VEX_W_0F3850 */
> - { "%XVvpdpbusd", { XM, Vex, EXx }, PREFIX_DATA },
> + { PREFIX_TABLE (PREFIX_VEX_0F3850_W_0) },
> },
> {
> - /* VEX_W_0F3851 */
> - { "%XVvpdpbusds", { XM, Vex, EXx }, PREFIX_DATA },
> + /* VEX_W_0F3851_P_0 */
> + { PREFIX_TABLE (PREFIX_VEX_0F3851_W_0) },
> },
> {
> /* VEX_W_0F3852 */
> diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
> index dd759fbc7c..21986220d6 100644
> --- a/opcodes/i386-gen.c
> +++
b/opcodes/i386-gen.c
> @@ -249,6 +249,8 @@ static initializer cpu_flag_init[] =
> "CpuPREFETCHI"},
> { "CPU_AVX_IFMA_FLAGS",
> "CPU_AVX2_FLAGS|CpuAVX_IFMA" },
> + { "CPU_AVX_VNNI_INT8_FLAGS",
> + "CPU_AVX2_FLAGS|CpuAVX_VNNI_INT8" },
> { "CPU_IAMCU_FLAGS",
> "Cpu186|Cpu286|Cpu386|Cpu486|Cpu586|CpuIAMCU" },
> { "CPU_ADX_FLAGS",
> @@ -376,7 +378,7 @@ static initializer cpu_flag_init[] =
> { "CPU_ANY_AVX_FLAGS",
> "CPU_ANY_AVX2_FLAGS|CpuF16C|CpuFMA|CpuFMA4|CpuXOP|CpuAVX" },
> { "CPU_ANY_AVX2_FLAGS",
> - "CPU_ANY_AVX512F_FLAGS|CpuAVX2|CpuAVX_VNNI|CpuAVX_IFMA" },
> + "CPU_ANY_AVX512F_FLAGS|CpuAVX2|CpuAVX_VNNI|CpuAVX_IFMA|CpuAVX_VNNI_INT8" },
> { "CPU_ANY_AVX512F_FLAGS",
> "CpuAVX512F|CpuAVX512CD|CpuAVX512ER|CpuAVX512PF|CpuAVX512DQ|CPU_ANY_AVX512BW_FLAGS|CpuAVX512VL|CpuAVX512IFMA|CpuAVX512VBMI|CpuAVX512_4FMAPS|CpuAVX512_4VNNIW|CpuAVX512_VPOPCNTDQ|CpuAVX512_VBMI2|CpuAVX512_VNNI|CpuAVX512_BITALG|CpuAVX512_BF16|CpuAVX512_VP2INTERSECT" },
> { "CPU_ANY_AVX512CD_FLAGS",
> @@ -449,6 +451,8 @@ static initializer cpu_flag_init[] =
> "CpuPREFETCHI" },
> { "CPU_ANY_AVX_IFMA_FLAGS",
> "CpuAVX_IFMA" },
> + { "CPU_ANY_AVX_VNNI_INT8_FLAGS",
> + "CpuAVX_VNNI_INT8" },
> };
>
> static initializer operand_type_init[] =
> @@ -652,6 +656,7 @@ static bitfield cpu_flags[] =
> BITFIELD (CpuAVX512_FP16),
> BITFIELD (CpuPREFETCHI),
> BITFIELD (CpuAVX_IFMA),
> + BITFIELD (CpuAVX_VNNI_INT8),
> BITFIELD (CpuMWAITX),
> BITFIELD (CpuCLZERO),
> BITFIELD (CpuOSPKE),
> diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
> index 7cd601e924..905908749b 100644
> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -213,6 +213,8 @@ enum
> CpuPREFETCHI,
> /* Intel AVX IFMA Instructions support required. */
> CpuAVX_IFMA,
> + /* Intel AVX VNNI-INT8 Instructions support required.
*/
> + CpuAVX_VNNI_INT8,
> /* mwaitx instruction required */
> CpuMWAITX,
> /* Clzero instruction required */
> @@ -296,7 +298,7 @@ enum
>
> /* If you get a compiler error for zero width of the unused field,
> comment it out. */
> -#define CpuUnused (CpuMax + 1)
> +// #define CpuUnused (CpuMax + 1)
>
> /* We can check if an instruction is available with array instead
> of bitfield. */
> @@ -396,6 +398,7 @@ typedef union i386_cpu_flags
> unsigned int cpuavx512_fp16:1;
> unsigned int cpuprefetchi:1;
> unsigned int cpuavx_ifma:1;
> + unsigned int cpuavx_vnni_int8:1;
> unsigned int cpumwaitx:1;
> unsigned int cpuclzero:1;
> unsigned int cpuospke:1;
> diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
> index 489a5335e2..77a5787c4b 100644
> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -2888,6 +2888,17 @@ vpdpwssds, 0x6653, None, CpuAVX_VNNI, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckReg
>
> // AVX_VNNI instructions end
>
> +// AVX-VNNI-INT8 instructions.
> +
> +vpdpbuud, 0x50, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> +vpdpbuuds, 0x51, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> +vpdpbssd, 0xf250, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> +vpdpbssds, 0xf251, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> +vpdpbsud, 0xf350, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, {
RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> +vpdpbsuds, 0xf351, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> +
> +// AVX-VNNI-INT8 instructions end.
> +
> // AVX512_BITALG instructions
>
> vpopcnt, 0x6654, None, CpuAVX512_BITALG, Modrm|Masking=3|Space0F38||Disp8ShiftVL|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
> --
> 2.18.1
>

OK. Thanks.

-- 
H.J.