From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR04-HE1-obe.outbound.protection.outlook.com (mail-he1eur04on2089.outbound.protection.outlook.com [40.107.7.89]) by sourceware.org (Postfix) with ESMTPS id 6395D3857738 for ; Fri, 15 Sep 2023 08:47:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6395D3857738 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.com ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=F8iMPG0X8nzDlsx13JP8Zmil9D/bFfQ6PylSPBg4cZ4FHrHRGI1WwgLgawx/t46b6EIkkOyl6lTvx8CD0+HLljwG3gwei4a3yZMACe1fYdEGCtzz+xihedmOvLrmK9SIBsht41GcFeYeN6NuR+VeC7HIIVF+BSRMmZzj0VZLGUt90TybNfnTrJomzmP+jKEoB6uAQDpeFgPW6mHCInlAIMujXzSU9bYyYCcj9x1r+JKP9Vns0QxAZwkIpMtt30Scn0v4L1BMEzxdckHzPTUANEPZHl7Qk+J3aiX1p5M8WlTGVadeAyA8VF6lBBB2rJxwcc2AirmcJEfhcoA7gkwh3w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=FOke4AUXtx7Zexxt0e5uwAhtatnVqOVDDW+xupID1pU=; b=aD2+Cbuz5Klz0+spjGqm4SCNSdFNAtNIqyDBiQJw5TXNpMs7FAao+blON3MIrzg5DzqukZodZtprsbgkJsHq5FDQQjMSFB9daMm/h3vlQB4AbIvznbly5AhWoSccVeLP/ofbe3A42HUhcQbe8iVufYA/iYLeJxzr+26oLeit7yj70sHl5GakLLVF7mtck+AZc7OTxW+SxEX/XFZo+xRC+BsVR0ErXcD8REoqRl14+Tu/saSgCzNVqsTDTdVbDafS73bRMUBKybZKqFfHDeg33l497FPf3+E5R2n/iCUFx3rOTjHQDjjVN7/QzkDAADlMQNPArEWZ4qBRijFHI2d5zw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=FOke4AUXtx7Zexxt0e5uwAhtatnVqOVDDW+xupID1pU=; b=JQXf4J94Um10GE+j3Fql4OoQRwyKY9sJ3Hns1GvgwR2RaacRgNkJfJkBTO1Es2qQG78fWJ1vmRNDcG1zdWE9xH1Fwsr5KZCvo0h+6ONUPGJS1FQ2/uJ6EKnPy+w5UcE0TVooSLOO/qIQpHvwGUnkruLgZKCzhD8IGrdaPdutO7ZStRdVNnyeDvAqhhAhqAA5PoKIVnefJapTaY0SdO0gHGwkOumHvpUpgHvoww+S9QrOq71dGqOuMh/c+KTUkn3K/mbO9Nzo1gMskoLTgXf2Mgi2yEDVmsPeF8WI7ppobSmz0pbNfSM6iagL2jMO5cGpTCeEpe8lZkYNe3vH0h1QuQ== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com; Received: from DU2PR04MB8790.eurprd04.prod.outlook.com (2603:10a6:10:2e1::23) by AM9PR04MB8382.eurprd04.prod.outlook.com (2603:10a6:20b:3ea::11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6792.21; Fri, 15 Sep 2023 08:47:44 +0000 Received: from DU2PR04MB8790.eurprd04.prod.outlook.com ([fe80::f749:b27f:2187:6654]) by DU2PR04MB8790.eurprd04.prod.outlook.com ([fe80::f749:b27f:2187:6654%6]) with mapi id 15.20.6792.020; Fri, 15 Sep 2023 08:47:44 +0000 Message-ID: Date: Fri, 15 Sep 2023 10:47:42 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.15.1 Subject: [PATCH 1/4] x86: fold certain VEX and EVEX templates Content-Language: en-US To: Binutils Cc: "H.J. Lu" References: <0690c179-ac98-d127-5ff4-b5abb725b6ae@suse.com> From: Jan Beulich In-Reply-To: <0690c179-ac98-d127-5ff4-b5abb725b6ae@suse.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-ClientProxiedBy: FR0P281CA0241.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:af::19) To DU2PR04MB8790.eurprd04.prod.outlook.com (2603:10a6:10:2e1::23) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DU2PR04MB8790:EE_|AM9PR04MB8382:EE_ X-MS-Office365-Filtering-Correlation-Id: a0f1ff5e-1952-4aba-efd9-08dbb5c86d9c X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: x4OPLY+TK4w//5Mh8GouyNIsq+IEXbzcTkYzckhGOQBsTEgeHes9h5x8uMxxjkWE7hXTNzhifaKyZz5KW7vFEHuSI+nGUO0y2lQMtm6w73gLVAgXa2QRi8HedIELspItxGvt7KWd7rM1AL3+xvDbPNNbWDmcb5B6CJTYhIm/nlYfX18IS6iJ0UU8+eSmvppgq3yI5OP52XHVbW+O+Y3rTnJNN2GU1vf565OaFcBOqJU8mLZTiAJtwuucQrTuU7gsmW2KPLoRXyZxpi6aT5Q8nXVkkBNcDQtFFLdMDc8fTjQcZsfx4uH0djxp2Ngxt7BlsnkOEj5EVDHjGLPKubVraAt1EhA+IFv5+jj8n26PrNAPHQ2mVU0tMm2XciXT2baqBxux4j9gTWnnZRv2Wo05c8zXre/TWDZfZmINElFV4fmPtgx9rvXAQJJOukcTtBn7PSa+kCv7nA4/D/w+7ps9kkTdZKISYMKfKQbPJhN0IoOLEik40ewsG7huTAUTYUHLFNGQt/qjIkL0WoYzBGpVjXuX1Tuek+TuSnBsSSmghWlRnq7gfPs0VivPRl8GECkeZsvkYOnEbOeqXartBZeY/Hv99+FuNpDNfSPX/dbVp3oQsmck1ccWoleZczbl6RdzHXW0x/veSjzUtlZhEMYOTQ== X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DU2PR04MB8790.eurprd04.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230031)(396003)(376002)(346002)(366004)(136003)(39860400002)(1800799009)(451199024)(186009)(31686004)(5660300002)(41300700001)(6916009)(316002)(478600001)(86362001)(31696002)(66946007)(66556008)(66476007)(36756003)(19627235002)(38100700002)(8676002)(4326008)(8936002)(6486002)(6506007)(26005)(2906002)(83380400001)(6512007)(30864003)(2616005)(43740500002)(45980500001);DIR:OUT;SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?B?ODBmcXJpSFA2Q2tlcExubnE1Ri8rNlZkdjNLOVNBU3NyUnN2Uk1wZVRNcFB6?= =?utf-8?B?ZXhqZGJjcHlEVXI3ejhicUxDWTVrRW5tYU9yWjBnNVlvMFN1aE45TWorWUln?= =?utf-8?B?REMxZTI1NllnV2JIM2RmSk1KMFpEK3lZZmk2YVB0L3VUUld6VmF0dWFWUmdG?= =?utf-8?B?aE5kbEhaeXJpK0FyWG92T1R6SGZWWlBwZSs1RnNFbVNONUtqZ014MGpycGt1?= =?utf-8?B?MEs2S1lKK0JaUElrQ3BmY3FZbmhuQThDZWg5ZmtyQ25PMGJNL1hGcHFzeDQx?= =?utf-8?B?U3BRVVgzcU4zWjV0OEh4SnN2MFJJM0F5N3ZwSXUwbjNpaDhlMWVsMG9vVldu?= =?utf-8?B?VUQza0F6Zk90L2FEMmdmRGxXaTJrQW43NTlZb1ZoV1phb202RzZlZ2FZbEgw?= =?utf-8?B?VVU1V1RIaXYreS90REIrK2dSc2JXd2ZjSVhzMWRkNEJoQ1JLOWZzQ2xnellq?= =?utf-8?B?QWZKVmZVNFcrSE45c2FxRkxwSjlnZElPZW1XTnZaM0NMM2tSR0tCSnN2bE9W?= =?utf-8?B?dVBaMUVHUlV6WCtoZitDQ1RmeTNvUzBvOTNWQ1pmVlVqWlVtRUFkNGdGRXdY?= =?utf-8?B?dDNzcVErUEh4TDQ5b3ZmRUY0bHBpbndMcFNVVEgwWWhqT0RuUUNxR0ZJMzY1?= =?utf-8?B?NnZvSFptWmZyUWRFbUNCMHV0QWR2TTFXbXBtMS85SzJDTjJXMTJsWFdzNzBy?= =?utf-8?B?QjNjSE81MUd1TitYM0t5cExPcStWVjNWRmlmMUtBaGEvRFlNbnJoaENnMGc1?= =?utf-8?B?WUNDcGkvU1ljQnM1NUU4Y2tEeCtjRE1aRyt3YzNXZm1VMnZ0NVM5RllQR2hn?= =?utf-8?B?b3ZOMlNkMHpTUjZJdHR1SE84RWRaOEtaWkh0OHhUZS80U3ZaM2ZWakY5QU1s?= =?utf-8?B?SGNGbHc0Qy9POGNCcE5UYUhCalJiT1BiZTlhVU0rL2xoeG56Nm5vS3dab25z?= =?utf-8?B?VDBpN0xXaGFzbEtnbG0rdjhOS094RWs4cmpVRG95LzZJV2ZOMlFkYUs5OFhO?= =?utf-8?B?NmNzdlFqTmJIVnczRm1ncHJkTGxZZVRJSUJTZVhNWnd1RU1hUExVZlllcjhV?= =?utf-8?B?TWFib040cFJlQ2F3QlY2Z2VuSXBsMlFqdEZLS3V0R2pBTzd1L0QyNE9LTWdX?= =?utf-8?B?ZjFmR2o1V2hnTFdWRDhEb3FtcGdjY2lHOHNPWmNnUlcxN3MrK01RTDNRdmx4?= =?utf-8?B?TTUvYjVYSEVPWVE3UVM1N3pJUFRXYXhMa2JyeitaeDgxRzZZVCs3ZzdKTmJa?= =?utf-8?B?eFdtd3BNRFo1dFBmT01oQXZZUEdFTWpOUmZ5bXVRRS8zVEYybHhOQ2lzQmJo?= =?utf-8?B?bU5DUTl5R2IxdXdCb0pWbHlkL1dSdUh4OFhGWEg2dkR3UllXM0JhbUtyYWUw?= =?utf-8?B?SmRYVEMwTlJ2MjRTTUkvOHJuY3YvOVp2ZURaNk92SGV3OXFYMGFMdlZ2eW5w?= =?utf-8?B?T09RamRyQTFkUkdtTnNVQXNvR3QyTHd2TkRYS1R5VnQ4eTBSNk1HSnUzMDF2?= =?utf-8?B?RkxCdTRUQllNcUo3YUFyajNKUUpESndPL0QvbStMUms0cHR2Z3dCbnp1Z1d3?= =?utf-8?B?c2JFeENCYTgzT2lBSmtzQ01lemRKbFQ1L2dNZ0drS3FBS3J3WkdCZHV0L2F4?= =?utf-8?B?WjNwSFV0N1puNG1KUnk0OXQxZmpJTGxHaGpXNFRuR21kRnNITXg5dXcvRS9s?= =?utf-8?B?bTlOdlhZcG4wakYwM1R1U3d0a0FYM05laG8rOG11Q1VLRmJzVDRoRURPaTM3?= =?utf-8?B?a2Y5QVV1MU5TbkF5OUcwUU9NcUpKb1pOQk9jT0NkV0VGNjNxVkhmUGpKWGps?= =?utf-8?B?dHJQZmNHT3NZRmkwNDFiYW5kZ1FjNWFETndTa2t2NTM3V0N1WDZNdktPNVUz?= =?utf-8?B?NkRoVm94MFJTNkYyS3VyOGhjWTRudHVjUHJzakZwQUVUdkNFaC9IdW00MUtx?= =?utf-8?B?Z0FyMjdaL0hQTmQ0S1RpZXRxMmRzckQyQkZRUHlHVkpUWnFJSk4yUjlBL1dF?= =?utf-8?B?TTZxVkRzenJ1U21KLzhlWnN4RmdGVWZCdzBLSUtzcUxObStnNFY5STZaMFlS?= =?utf-8?B?LzEyaVRNbzlYTHF2OSsxRVliUHZOd2dzRkM0dm5QWEFXMTkranFuSmVBdmZL?= =?utf-8?Q?TWKIs9dwIRJkQGHbqwgMdafZl?= X-OriginatorOrg: suse.com X-MS-Exchange-CrossTenant-Network-Message-Id: a0f1ff5e-1952-4aba-efd9-08dbb5c86d9c X-MS-Exchange-CrossTenant-AuthSource: DU2PR04MB8790.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 15 Sep 2023 08:47:44.1698 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 2nTaDxi4fJTcrorz7NBtkiBnGVmgIIspuPSAlPaqaaw9WZegjMTNvnOGA103e0wq0atqVL+wC6AkCuiDsffckA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM9PR04MB8382 X-Spam-Status: No, score=-3026.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: In anticipation of APX introduce logic to reduce the number of templates we have now, allowing to limit some the number of ones we then need to gain. The fundamental requirements are that - attributes be compatible, which specifically means VexW needs to be the same in the templates (which often isn't the case, for VEX encodings having far more WIG tha, EVEX ones), - the EVEX form being AVX512F (with or without AVX512VL), not any of its extensions (the same will then be required for APX - it'll need to be APX_F). Note that in check_register() there's now a redundant zmm check. Since this logic will need revisiting for APX anyway, I'd like to keep it that way for now. (Similarly a couple of if()-s which could be folded are kept separate, to reduce code churn when adding APX support.) --- RFC: Of course there are quite a few code changes, so there is the question of the savings being worth it. The AVX512F constraint may be possible to relax, but the change is big enough already. --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -436,6 +436,7 @@ struct _i386_insn vex_encoding_vex, vex_encoding_vex3, vex_encoding_evex, + vex_encoding_evex512, vex_encoding_error } vec_encoding; @@ -1872,6 +1873,13 @@ cpu_flags_and_not (i386_cpu_flags x, i38 static const i386_cpu_flags avx512 = CPU_ANY_AVX512F_FLAGS; +static INLINE bool need_evex_encoding (void) +{ + return i.vec_encoding == vex_encoding_evex + || i.vec_encoding == vex_encoding_evex512 + || i.mask.reg; +} + #define CPU_FLAGS_ARCH_MATCH 0x1 #define CPU_FLAGS_64BIT_MATCH 0x2 @@ -1899,6 +1907,27 @@ cpu_flags_match (const insn_template *t) /* This instruction is available only on some archs. */ i386_cpu_flags cpu = cpu_arch_flags; + /* Dual VEX/EVEX templates may need stripping of one of the flags. */ + if (t->opcode_modifier.vex && t->opcode_modifier.evex) + { + /* Dual AVX/AVX512F templates need to retain AVX512F only if we already + know that EVEX encoding will be needed. */ + if ((x.bitfield.cpuavx || x.bitfield.cpuavx2) + && x.bitfield.cpuavx512f) + { + if (need_evex_encoding ()) + { + x.bitfield.cpuavx = 0; + x.bitfield.cpuavx2 = 0; + } + else + { + x.bitfield.cpuavx512f = 0; + x.bitfield.cpuavx512vl = 0; + } + } + } + /* AVX512VL is no standalone feature - match it and then strip it. */ if (x.bitfield.cpuavx512vl && !cpu.bitfield.cpuavx512vl) return match; @@ -3646,6 +3675,27 @@ install_template (const insn_template *t i.tm = *t; + /* Dual VEX/EVEX templates need stripping one of the possible variants. */ + if (t->opcode_modifier.vex && t->opcode_modifier.evex) + { + if ((is_cpu (t, CpuAVX) || is_cpu (t, CpuAVX2)) + && is_cpu (t, CpuAVX512F)) + { + if (need_evex_encoding ()) + { + i.tm.opcode_modifier.vex = 0; + i.tm.cpu.bitfield.cpuavx = 0; + if (is_cpu (&i.tm, CpuAVX2)) + i.tm.cpu.bitfield.isa = 0; + } + else + { + i.tm.opcode_modifier.evex = 0; + i.tm.cpu.bitfield.cpuavx512f = 0; + } + } + } + /* Note that for pseudo prefixes this produces a length of 1. But for them the length isn't interesting at all. */ for (l = 1; l < 4; ++l) @@ -4553,6 +4603,8 @@ optimize_encoding (void) i.tm.opcode_modifier.vex = VEX128; i.tm.opcode_modifier.vexw = VEXW0; i.tm.opcode_modifier.evex = 0; + i.vec_encoding = vex_encoding_vex; + i.mask.reg = NULL; } else if (optimize > 1) i.tm.opcode_modifier.evex = EVEX128; @@ -5438,6 +5490,11 @@ md_assemble (char *line) if (optimize && !i.no_optimize && i.tm.opcode_modifier.optimize) optimize_encoding (); + /* Past optimization there's no need to distinguish vex_encoding_evex and + vex_encoding_evex512 anymore. */ + if (i.vec_encoding == vex_encoding_evex512) + i.vec_encoding = vex_encoding_evex; + if (use_unaligned_vector_move) encode_with_unaligned_vector_move (); @@ -5467,6 +5524,7 @@ md_assemble (char *line) if (i.tm.operand_types[j].bitfield.tmmword) i.xstate |= xstate_tmm; else if (i.tm.operand_types[j].bitfield.zmmword + && !i.tm.opcode_modifier.vex && vector_size >= VSZ512) i.xstate |= xstate_zmm; else if (i.tm.operand_types[j].bitfield.ymmword @@ -6468,7 +6526,8 @@ check_VecOperands (const insn_template * cpu = cpu_flags_and (cpu_flags_from_attr (t->cpu), avx512); if (!cpu_flags_all_zero (&cpu) && !is_cpu (t, CpuAVX512VL) - && !cpu_arch_flags.bitfield.cpuavx512vl) + && !cpu_arch_flags.bitfield.cpuavx512vl + && (!t->opcode_modifier.vex || need_evex_encoding())) { for (op = 0; op < t->operands; ++op) { @@ -6779,6 +6838,8 @@ check_VecOperands (const insn_template * /* Check vector Disp8 operand. */ if (t->opcode_modifier.disp8memshift + && (!t->opcode_modifier.vex + || need_evex_encoding ()) && i.disp_encoding <= disp_encoding_8bit) { if (i.broadcast.type || i.broadcast.bytes) @@ -6874,7 +6935,8 @@ VEX_check_encoding (const insn_template return 1; } - if (i.vec_encoding == vex_encoding_evex) + if (i.vec_encoding == vex_encoding_evex + || i.vec_encoding == vex_encoding_evex512) { /* This instruction must be encoded with EVEX prefix. */ if (!is_evex_encoding (t)) @@ -11211,6 +11273,10 @@ s_insn (int dummy ATTRIBUTE_UNUSED) goto done; } + /* No need to distinguish vex_encoding_evex and vex_encoding_evex512. */ + if (i.vec_encoding == vex_encoding_evex512) + i.vec_encoding = vex_encoding_evex; + /* Are we to emit ModR/M encoding? */ if (!i.short_form && (i.mem_operands @@ -11633,6 +11699,12 @@ RC_SAE_specifier (const char *pstr) return NULL; } + if (i.vec_encoding == vex_encoding_default) + i.vec_encoding = vex_encoding_evex512; + else if (i.vec_encoding != vex_encoding_evex + && i.vec_encoding != vex_encoding_evex512) + return NULL; + i.rounding.type = RC_NamesTable[j].type; return (char *)(pstr + RC_NamesTable[j].len); @@ -11692,6 +11764,12 @@ check_VecOperations (char *op_string) } op_string++; + if (i.vec_encoding == vex_encoding_default) + i.vec_encoding = vex_encoding_evex; + else if (i.vec_encoding != vex_encoding_evex + && i.vec_encoding != vex_encoding_evex512) + goto unknown_vec_op; + i.broadcast.type = bcst_type; i.broadcast.operand = this_operand; @@ -13953,8 +14031,17 @@ static bool check_register (const reg_en } } - if (vector_size < VSZ512 && r->reg_type.bitfield.zmmword) - return false; + if (r->reg_type.bitfield.zmmword) + { + if (vector_size < VSZ512) + return false; + + if (i.vec_encoding == vex_encoding_default) + i.vec_encoding = vex_encoding_evex512; + else if (i.vec_encoding != vex_encoding_evex + && i.vec_encoding != vex_encoding_evex512) + i.vec_encoding = vex_encoding_error; + } if (vector_size < VSZ256 && r->reg_type.bitfield.ymmword) return false; @@ -13979,7 +14066,8 @@ static bool check_register (const reg_en || flag_code != CODE_64BIT) return false; - if (i.vec_encoding == vex_encoding_default) + if (i.vec_encoding == vex_encoding_default + || i.vec_encoding == vex_encoding_evex512) i.vec_encoding = vex_encoding_evex; else if (i.vec_encoding != vex_encoding_evex) i.vec_encoding = vex_encoding_error; --- a/gas/config/tc-i386-intel.c +++ b/gas/config/tc-i386-intel.c @@ -209,6 +209,11 @@ operatorT i386_operator (const char *nam || i386_types[j].sz[0] > 8 || (i386_types[j].sz[0] & (i386_types[j].sz[0] - 1))) return O_illegal; + if (i.vec_encoding == vex_encoding_default) + i.vec_encoding = vex_encoding_evex; + else if (i.vec_encoding != vex_encoding_evex + && i.vec_encoding != vex_encoding_evex512) + return O_illegal; if (!i.broadcast.bytes && !i.broadcast.type) { i.broadcast.bytes = i386_types[j].sz[0]; --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -131,6 +131,8 @@ #define EVexLIG EVex=EVEXLIG #define EVexDYN EVex=EVEXDYN +#define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL + #define Vsz256 Vsz=VSZ256 #define Vsz512 Vsz=VSZ512 @@ -1518,8 +1520,8 @@ vdivs, 0x5e, AVX, Modrm|Vex vdppd, 0x6641, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM } vdpps, 0x6640, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vextractf128, 0x6619, AVX, Modrm|Vex=2|Space0F3A|VexW=1|NoSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM } -vextractps, 0x6617, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex } -vextractps, 0x6617, AVX|x64, RegMem|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg64 } +vextractps, 0x6617, AVX|AVX512F, Modrm|Vex128|EVex128|Space0F3A|VexWIG|Disp8MemShift=2|NoSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex } +vextractps, 0x6617, AVX|AVX512F|x64, RegMem|Vex128|EVex128|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg64 } vhaddpd, 0x667c, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vhaddps, 0xf27c, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vhsubpd, 0x667d, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } @@ -1541,7 +1543,7 @@ vmovap, 0x28, AVX, D|Modrm| // by Intel AVX spec). To avoid extra template in gcc x86 backend and // support assembler for AMD64, we accept 64bit operand on vmovd so // that we can use one template for both SSE and AVX instructions. -vmovd, 0x666e, AVX, D|Modrm|Vex=1|Space0F|NoSuf, { Reg32|Unspecified|BaseIndex, RegXMM } +vmovd, 0x666e, AVX|AVX512F, D|Modrm|Vex128|EVex128|Space0F|Disp8MemShift=2|NoSuf, { Reg32|Unspecified|BaseIndex, RegXMM } vmovd, 0x667e, AVX|x64, D|RegMem|Vex=1|Space0F|VexW=2|NoSuf|Size64, { RegXMM, Reg64 } vmovddup, 0xf212, AVX, Modrm|Vex|Space0F|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM } vmovddup, 0xf212, AVX, Modrm|Vex=2|Space0F|VexWIG|NoSuf, { Unspecified|BaseIndex|RegYMM, RegYMM } @@ -1559,7 +1561,7 @@ vmovntdqa, 0x662a, AVX|AVX2, Modrm|Vex|S vmovntp, 0x2b, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex } vmovq, 0xf37e, AVX, Load|Modrm|Vex=1|Space0F|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM } vmovq, 0x66d6, AVX, Modrm|Vex=1|Space0F|VexWIG|NoSuf, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM } -vmovq, 0x666e, AVX|x64, D|Modrm|Vex=1|Space0F|VexW=2|NoSuf, { Reg64|Unspecified|BaseIndex, RegXMM } +vmovq, 0x666e, AVX|AVX512F|x64, D|Modrm|Vex128|EVex128|Space0F|VexW1|Disp8MemShift=3|NoSuf, { Reg64|Unspecified|BaseIndex, RegXMM } vmovs, 0x10, AVX, D|Modrm|VexLIG|Space0F|VexWIG|NoSuf, { |Unspecified|BaseIndex, RegXMM } vmovs, 0x10, AVX, D|Modrm|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf, { RegXMM, RegXMM, RegXMM } vmovshdup, 0xf316, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM } @@ -1599,8 +1601,10 @@ vpcmpgtq, 0x6637, AVX|AVX2, Modrm|Vex|Sp vpcmpistri, 0x6663, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM } vpcmpistrm, 0x6662, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM } vperm2f128, 0x6606, AVX, Modrm|Vex256|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM } -vpermilp, 0x660c | , AVX, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vpermilp, 0x6604 | , AVX, Modrm|Vex|Space0F3A|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM } +vpermilps, 0x660c, AVX|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vpermilps, 0x6604, AVX|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F3A|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vpermilpd, 0x660d, AVX, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } +vpermilpd, 0x6605, AVX, Modrm|Vex|Space0F3A|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM } vpextr, 0x6616, AVX|, Modrm|Vex|Space0F3A||NoSuf, { Imm8, RegXMM, |Unspecified|BaseIndex } vpextrw, 0x66c5, AVX, Load|Modrm|Vex|Space0F|VexWIG|No_bSuf|No_wSuf|No_sSuf, { Imm8, RegXMM, Reg32|Reg64 } vpextr, 0x6614 | , AVX, RegMem|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg32|Reg64 } @@ -1632,18 +1636,18 @@ vpminub, 0x66da, AVX|AVX2, Modrm|C|Vex|S vpminud, 0x663b, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vpminuw, 0x663a, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vpmovmskb, 0x66d7, AVX|AVX2, Modrm|Vex|Space0F|VexWIG|No_bSuf|No_wSuf|No_sSuf, { RegXMM|RegYMM, Reg32|Reg64 } -vpmovsxbd, 0x6621, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM } -vpmovsxbq, 0x6622, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Word|Unspecified|BaseIndex|RegXMM, RegXMM } +vpmovsxbd, 0x6621, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM } +vpmovsxbq, 0x6622, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM } vpmovsxbw, 0x6620, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM } vpmovsxdq, 0x6625, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM } -vpmovsxwd, 0x6623, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM } -vpmovsxwq, 0x6624, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM } -vpmovzxbd, 0x6631, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM } -vpmovzxbq, 0x6632, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Word|Unspecified|BaseIndex|RegXMM, RegXMM } +vpmovsxwd, 0x6623, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } +vpmovsxwq, 0x6624, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM } +vpmovzxbd, 0x6631, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM } +vpmovzxbq, 0x6632, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM } vpmovzxbw, 0x6630, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM } vpmovzxdq, 0x6635, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM } -vpmovzxwd, 0x6633, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM } -vpmovzxwq, 0x6634, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM } +vpmovzxwd, 0x6633, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } +vpmovzxwq, 0x6634, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM } vpmuldq, 0x6628, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vpmulhrsw, 0x660b, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vpmulhuw, 0x66e4, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } @@ -1710,39 +1714,40 @@ vzeroupper, 0x77, AVX, Vex|Space0F|VexWI // 256bit integer AVX2 instructions. -vpmovsxbd, 0x6621, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegYMM } -vpmovsxbq, 0x6622, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegYMM } +vpmovsxbd, 0x6621, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM } +vpmovsxbq, 0x6622, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM } vpmovsxbw, 0x6620, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM } vpmovsxdq, 0x6625, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM } -vpmovsxwd, 0x6623, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM } -vpmovsxwq, 0x6624, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegYMM } -vpmovzxbd, 0x6631, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegYMM } -vpmovzxbq, 0x6632, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegYMM } +vpmovsxwd, 0x6623, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } +vpmovsxwq, 0x6624, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM } +vpmovzxbd, 0x6631, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM } +vpmovzxbq, 0x6632, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM } vpmovzxbw, 0x6630, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM } vpmovzxdq, 0x6635, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM } -vpmovzxwd, 0x6633, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM } -vpmovzxwq, 0x6634, AVX2, Modrm|Vex=2|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegYMM } +vpmovzxwd, 0x6633, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } +vpmovzxwq, 0x6634, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM } // New AVX2 instructions. vbroadcasti128, 0x665A, AVX2, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM } vbroadcastsd, 0x6619, AVX2, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { RegXMM, RegYMM } -vbroadcastss, 0x6618, AVX2, Modrm|Vex|Space0F38|VexW=1|NoSuf, { RegXMM, RegXMM|RegYMM } +vbroadcastss, 0x6618, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexW0|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } vpblendd, 0x6602, AVX2, Modrm|Vex|Space0F3A|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vpbroadcast, 0x6678 | , AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf, { |Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM } -vpbroadcast, 0x6658 | , AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf|Optimize, { |Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM } +vpbroadcastd, 0x6658, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexW0|Disp8MemShift|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vpbroadcastq, 0x6659, AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf|Optimize, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM } vperm2i128, 0x6646, AVX2, Modrm|Vex=2|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM } -vpermd, 0x6636, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW0|NoSuf, { Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM } -vpermpd, 0x6601, AVX2, Modrm|Vex=2|Space0F3A|VexW1|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM } -vpermps, 0x6616, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW0|NoSuf, { Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM } -vpermq, 0x6600, AVX2, Modrm|Vex=2|Space0F3A|VexW1|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM } +vpermd, 0x6636, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM } +vpermpd, 0x6601, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F3A|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM } +vpermps, 0x6616, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM } +vpermq, 0x6600, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F3A|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM } vextracti128, 0x6639, AVX2, Modrm|Vex=2|Space0F3A|VexW=1|NoSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM } vinserti128, 0x6638, AVX2, Modrm|Vex256|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegYMM, RegYMM } vpmaskmov, 0x668e, AVX2, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { RegXMM|RegYMM, RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex } vpmaskmov, 0x668c, AVX2, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM } -vpsllv, 0x6647, AVX2, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vpsravd, 0x6646, AVX2, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vpsrlv, 0x6645, AVX2, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } +vpsllv, 0x6647, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vpsravd, 0x6646, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vpsrlv, 0x6645, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } // AVX gather instructions vgatherdpd, 0x6692, AVX2, Modrm|Vex|Space0F38|VexVVVV|VexW1|SwapSources|CheckOperandSize|NoSuf|VecSIB128, { RegXMM|RegYMM, Qword|Unspecified|BaseIndex, RegXMM|RegYMM } @@ -1779,7 +1784,7 @@ vpclmulhqhqdq, 0x6644/0x11, AVX|PCLMULQD vgf2p8affineinvqb, 0x66cf, AVX|GFNI, Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vgf2p8affineqb, 0x66ce, AVX|GFNI, Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vgf2p8mulb, 0x66cf, AVX|GFNI, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } +vgf2p8mulb, 0x66cf, GFNI|AVX|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } // FSGSBASE, RDRND and F16C @@ -2082,8 +2087,6 @@ vpclmulhqhqdq, 0x6644/0x11, VPCLMULQDQ, // AVX512F instructions. -#define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL - , 0x6615, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } vprorv, 0x6614, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vpsllv, 0x6647, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vpsrav, 0x6646, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vpsrlv, 0x6645, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vpsravq, 0x6646, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } vpternlog, 0x6625, AVX512F, Modrm|Masking|Space0F3A|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } vbroadcastf32x4, 0x661A, AVX512F, Modrm|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { XMMword|Unspecified|BaseIndex, RegYMM|RegZMM } @@ -2153,10 +2154,9 @@ vbroadcasti32x4, 0x665A, AVX512F, Modrm| vbroadcastf64x4, 0x661B, AVX512F, Modrm|EVex=1|Masking|Space0F38|VexW=2|Disp8MemShift=5|NoSuf, { YMMword|Unspecified|BaseIndex, RegZMM } vbroadcasti64x4, 0x665B, AVX512F, Modrm|EVex=1|Masking|Space0F38|VexW=2|Disp8MemShift=5|NoSuf, { YMMword|Unspecified|BaseIndex, RegZMM } -vbroadcastss, 0x6618, AVX512F, Modrm|Masking|Space0F38|VexW0|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } vbroadcastsd, 0x6619, AVX512F, Modrm|Masking|Space0F38|VexW1|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM } -vpbroadcast, 0x6658 | , AVX512F, Modrm|Masking|Space0F38||Disp8MemShift|NoSuf, { RegXMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vpbroadcastq, 0x6659, AVX512F, Modrm|Masking|Space0F38|VexW1|Disp8MemShift|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } vpbroadcast, 0x667c, AVX512F, Modrm|Masking|Space0F38||NoSuf, { , RegXMM|RegYMM|RegZMM } vcmpp, 0xC2/0x, AVX512F, Modrm|Masking|Space0F|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt|SAE, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask } @@ -2246,9 +2246,6 @@ vextracti32x4, 0x6639, AVX512F, Modrm|Ma vextractf64x4, 0x661B, AVX512F, Modrm|EVex=1|Masking|Space0F3A|VexW=2|Disp8MemShift=5|NoSuf, { Imm8, RegZMM, RegYMM|Unspecified|BaseIndex } vextracti64x4, 0x663B, AVX512F, Modrm|EVex=1|Masking|Space0F3A|VexW=2|Disp8MemShift=5|NoSuf, { Imm8, RegZMM, RegYMM|Unspecified|BaseIndex } -vextractps, 0x6617, AVX512F, Modrm|EVex128|Space0F3A|VexWIG|Disp8MemShift=2|NoSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex } -vextractps, 0x6617, AVX512F|x64, RegMem|EVex128|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg64 } - vfixupimmp, 0x6654, AVX512F, Modrm|Masking|Space0F3A|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } vfixupimms, 0x6655, AVX512F, Modrm|EVexLIG|Masking|Space0F3A|VexVVVV||Disp8MemShift|NoSuf|SAE, { Imm8|Imm8S, RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM } @@ -2304,8 +2301,6 @@ vmovap, 0x28, AVX512F, D|Mo vmovntp, 0x2B, AVX512F, Modrm|Space0F||Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM, XMMword|YMMword|ZMMword|Unspecified|BaseIndex } vmovup, 0x10, AVX512F, D|Modrm|Masking|Space0F||Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vmovd, 0x666E, AVX512F, D|Modrm|EVex=2|Space0F|Disp8MemShift=2|NoSuf, { Reg32|Unspecified|BaseIndex, RegXMM } - vmovddup, 0xF212, AVX512F, Modrm|Masking|Space0F|VexW=2|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Unspecified|BaseIndex, RegYMM|RegZMM } vmovdqa64, 0x666F, AVX512F, D|Modrm|Masking|Space0F|VexW=2|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } @@ -2322,7 +2317,6 @@ vmovhp, 0x17, AVX512F, Modr vmovlp, 0x12, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV||Disp8MemShift=3|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM } vmovlp, 0x13, AVX512F, Modrm|EVexLIG|Space0F||Disp8MemShift=3|NoSuf, { RegXMM, Qword|Unspecified|BaseIndex } -vmovq, 0x666E, AVX512F|x64, D|Modrm|EVex128|Space0F|VexW1|Disp8MemShift=3|NoSuf, { Reg64|Unspecified|BaseIndex, RegXMM } vmovq, 0xF37E, AVX512F, Load|Modrm|EVex=2|Space0F|VexW1|Disp8MemShift=3|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM } vmovq, 0x66D6, AVX512F, Modrm|EVex=2|Space0F|VexW1|Disp8MemShift=3|NoSuf, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM } @@ -2360,15 +2354,10 @@ vpcmpu, 0x661e/, AVX vptestm, 0x6627, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask } vptestnm, 0xf327, AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask } -vpermd, 0x6636, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM } -vpermps, 0x6616, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM } - -vpermilp, 0x6604 | , AVX512F, Modrm|Masking|Space0F3A||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } -vpermilp, 0x660C | , AVX512F, Modrm|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } +vpermilpd, 0x6605, AVX512F, Modrm|Masking|Space0F3A|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } +vpermilpd, 0x660d, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vpermpd, 0x6601, AVX512F, Modrm|Masking|Space0F3A|VexW=2|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM } vpermpd, 0x6616, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM } -vpermq, 0x6600, AVX512F, Modrm|Masking|Space0F3A|VexW=2|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM } vpermq, 0x6636, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM } vpmovdb, 0xF331, AVX512F, Modrm|EVex=1|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { RegZMM, RegXMM|Unspecified|BaseIndex } @@ -2593,31 +2582,11 @@ vpmovsqw, 0xF324, AVX512F|AVX512VL, Modr vpmovusqw, 0xF314, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexW0|Disp8MemShift=2|NoSuf, { RegXMM, RegXMM|Dword|Unspecified|BaseIndex } vpmovusqw, 0xF314, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW0|Disp8MemShift=3|NoSuf, { RegYMM, RegXMM|Qword|Unspecified|BaseIndex } -vpmovsxbd, 0x6621, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM } -vpmovsxbd, 0x6621, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM } -vpmovzxbd, 0x6631, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM } -vpmovzxbd, 0x6631, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM } - -vpmovsxbq, 0x6622, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM } -vpmovsxbq, 0x6622, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM } -vpmovzxbq, 0x6632, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM } -vpmovzxbq, 0x6632, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM } - vpmovsxdq, 0x6625, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexW0|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } vpmovsxdq, 0x6625, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } vpmovzxdq, 0x6635, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexW0|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } vpmovzxdq, 0x6635, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } -vpmovsxwd, 0x6623, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } -vpmovsxwd, 0x6623, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } -vpmovzxwd, 0x6633, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM } -vpmovzxwd, 0x6633, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM } - -vpmovsxwq, 0x6624, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM } -vpmovsxwq, 0x6624, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM } -vpmovzxwq, 0x6634, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM } -vpmovzxwq, 0x6634, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM } - // AVX512VL instructions end. // AVX512BW instructions. @@ -2960,7 +2929,6 @@ vpshufbitqmb, 0x668f, AVX512_BITALG, Mod vgf2p8affineinvqb, 0x66cf, GFNI|AVX512F, Modrm|Masking|Space0F3A|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } vgf2p8affineqb, 0x66ce, GFNI|AVX512F, Modrm|Masking|Space0F3A|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } -vgf2p8mulb, 0x66cf, GFNI|AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW0|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM } // AVX512 + GFNI instructions end