From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x531.google.com (mail-ed1-x531.google.com [IPv6:2a00:1450:4864:20::531]) by sourceware.org (Postfix) with ESMTPS id 8CB8E38582AF for ; Fri, 2 Feb 2024 10:41:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8CB8E38582AF Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 8CB8E38582AF Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::531 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1706870463; cv=none; b=if1IUNxgYFaCI0Ziaiq9jMtXLSruzO4COa/ubhabX9+JqlDbldQfSFbS0n5t+tgPSVl7XJem12kqJ4NvAzJigo4RacOKP6UwTOzqH8Le+LIqbeGkiGTklxg0AV3hf/PWjGfnaDUAgHUeif6G6VfefV7wue+OItITBno/h7R/tnw= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1706870463; c=relaxed/simple; bh=9212j9mq5uRKJHBAMlcsc/t8yLOYDP3Q/nx63UA+Hl0=; h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:From:To; b=mmGO8IIl3ASZE4nitcL5RI6SL8kSxAI/MsDoX8Ufn25PRjRTRIp2K6Obxeo6OezfMVFSdRQjAJUnP4wipZr6HUINc+vQupgGfXiKmyA2UFo/6aVWVuFlzEns1xxZKz9QOzfoVAUycD0JgDz/mUo2AdJ3ZIQwRo5V2tvYYWCMTGU= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-ed1-x531.google.com with SMTP id 4fb4d7f45d1cf-55f50cf2021so2617490a12.1 for ; Fri, 02 Feb 2024 02:41:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1706870459; x=1707475259; darn=sourceware.org; h=content-transfer-encoding:in-reply-to:autocrypt:references:cc:to :from:content-language:subject:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=OChoRKe4lgd5U3FRYF3rErza/DqjEE2gA3jddJNWaHc=; b=VLnMrEGUsHqVGsDhXkkytERhYC6xbO3GE4P/dON/Tqn/Ki81AZqKDvdiaGkSj/w4qF /wRvSjRSynEjoJVrfzhxPicJjez0oL/9pJ+9835iQWlToZvRUiKx3w5nPSnvca7hWk6z fJZoZX9Ka6hLmE8WbBnVJUPsz7yJGfqNT3QEwoZTSrC+m6X58Mmx7kkP7uF8pnYahfWr DSPtQDZxAQHKudtVFwqHTJeH1qTRl2uGwBMvMlOL8JVstCIxT7cQDqkIRjRmZ+7s0Q4+ bxcwHZbaisv2FEwx5d5mXxL/R/70x7MeqjNRUsMGf9/MZ+JeqD4ZDyEfWsYQf9cT2JBh KjUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1706870459; x=1707475259; h=content-transfer-encoding:in-reply-to:autocrypt:references:cc:to :from:content-language:subject:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=OChoRKe4lgd5U3FRYF3rErza/DqjEE2gA3jddJNWaHc=; b=OfUfVN8eFQCpR3nhug6Dnsdi+NK7Xl6MOlQ5ML0HJ3hyNWZMuuJ9QHINxKJ4AGR62G BHZ9OEiq7HCS0ye28OrUjC05BiU5MRdTbPoPL+o8bLneXn6beoXHYdjJFcHLtRp8/Dmc Khh9Jsh9rwWXVRx2uliyUtdcbC1MBPJR2Buc8RGg7FYd3byO6Mi+/TmeEACEfdXglHn7 14JDRbPpeOlTEKDh2RCFXsRMMGMiphLYblH5mm2z9iTxy+rKcA6uTC1Nx3qPODPOYdPP txlB9+2gYl5H/rfCnt0XCgwmTPom6rPSl5g0+W+HmynrF1WAjzbxr+hLCVOymY4lWM7v FUqQ== X-Gm-Message-State: AOJu0YxSwKGFMUe8I1IkYCV7Y1Msf0cGOe7SEL6jM5Fk6IFdeVy6AIoF Ipnwfz6goUWW/6RpdF0/yldQOiUIwJk4NBngrxyTWayCEVng+H1Ryno4hG38ug5DUyJlAFZzMMY = X-Google-Smtp-Source: AGHT+IFxPdUnCMVddzMypsEfy5ZvFXEdbNWFQQvxMR5eQz0VbD+iKZrHK+lqu3mze2mj29mz+rLmSw== X-Received: by 2002:a17:907:960c:b0:a36:5e44:a2dd with SMTP id gb12-20020a170907960c00b00a365e44a2ddmr1573853ejc.5.1706870459047; Fri, 02 Feb 2024 02:40:59 -0800 (PST) X-Forwarded-Encrypted: i=0; AJvYcCUAtEJj+ZpQd2pYMk3L1GsIc7Pargl6uGoY24nI5zQNGdwJIix+IoCeIYgZX7GxgJx5XPcGEB1FsRvUPOcGTB3Zqxs= Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. [37.24.206.209]) by smtp.gmail.com with ESMTPSA id wn7-20020a170907068700b00a2e9f198cffsm742112ejb.72.2024.02.02.02.40.58 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 02 Feb 2024 02:40:58 -0800 (PST) Message-ID: Date: Fri, 2 Feb 2024 11:40:58 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: [PATCH 2/2] x86/APX: V{BROADCAST,EXTRACT,INSERT}{F,I}128 can also be expressed Content-Language: en-US From: Jan Beulich To: Binutils Cc: "H.J. Lu" , Lili Cui References: <4a62c3b1-dbfc-4291-8d22-c8966abe0573@suse.com> Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: <4a62c3b1-dbfc-4291-8d22-c8966abe0573@suse.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3025.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Interestingly unlike VROUND{P,S}{S,D} and VPERM{F,I}128 they weren't even present in the x86-64-apx-egpr-inval testcase, hence why I overlooked that these can actually be encoded, (again) using suitable AVX512 counterparts. While there also "modernize" the adjacent AVX/AVX2 entries. --- Here and also for the VROUND{P,S}{S,D} surrogates: If we omitted the APX_F part of the conditions, dual VEX/EVEX templates ought to be possible to use. Afaics the only (further) "anomaly" that would result is that then {evex} prefixes also become usable with these mnemonics even when APX is disabled. Of course for consistency the restriction to memory-only forms should then also be lifted. (As to "further": The other one is that these insns permit use of the high 16 XMM/YMM registers along with the [intended] permitting of the eGPR-s.) Thoughts? Some VPERM{F,I}128 forms could be expressed by VSHUF{F,I}32x4, others by VPERM{PD,Q}, but perhaps this isn't worth it overall, first and foremost because we wouldn't be able to reach full coverage. --- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d @@ -158,6 +158,12 @@ Disassembly of section \.text: [ ]*[a-f0-9]+:[ ]*62 da 7f 08 4b b4 87 23 01 00 00[ ]+tileloadd tmm6,\[r31\+rax\*4\+0x123\] [ ]*[a-f0-9]+:[ ]*62 da 7d 08 4b b4 87 23 01 00 00[ ]+tileloaddt1 tmm6,\[r31\+rax\*4\+0x123\] [ ]*[a-f0-9]+:[ ]*62 da 7e 08 4b b4 87 23 01 00 00[ ]+tilestored[ ]+\[r31\+rax\*4\+0x123\],tmm6 +[ ]*[a-f0-9]+:[ ]*62 fa 7d 28 1a 18[ ]+vbroadcastf32x4 ymm3,XMMWORD PTR \[r16\] +[ ]*[a-f0-9]+:[ ]*62 fa 7d 28 5a 18[ ]+vbroadcasti32x4 ymm3,XMMWORD PTR \[r16\] +[ ]*[a-f0-9]+:[ ]*62 fb 7d 28 19 18 01[ ]+vextractf32x4 XMMWORD PTR \[r16\],ymm3,(0x)?1 +[ ]*[a-f0-9]+:[ ]*62 fb 7d 28 39 18 01[ ]+vextracti32x4 XMMWORD PTR \[r16\],ymm3,(0x)?1 +[ ]*[a-f0-9]+:[ ]*62 7b 65 28 18 00 01[ ]+vinsertf32x4 ymm8,ymm3,XMMWORD PTR \[r16\],(0x)?1 +[ ]*[a-f0-9]+:[ ]*62 7b 65 28 38 00 01[ ]+vinserti32x4 ymm8,ymm3,XMMWORD PTR \[r16\],(0x)?1 [ ]*[a-f0-9]+:[ ]*62 db fd 08 09 30 01[ ]+vrndscalepd xmm6,XMMWORD PTR \[r24\],(0x)?1 [ ]*[a-f0-9]+:[ ]*62 db 7d 08 08 30 02[ ]+vrndscaleps xmm6,XMMWORD PTR \[r24\],(0x)?2 [ ]*[a-f0-9]+:[ ]*62 db cd 08 0b 18 03[ ]+vrndscalesd xmm3,xmm6,QWORD PTR \[r24\],(0x)?3 --- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d @@ -158,6 +158,12 @@ Disassembly of section \.text: [ ]*[a-f0-9]+:[ ]*62 da 7f 08 4b b4 87 23 01 00 00[ ]+tileloadd[ ]+0x123\(%r31,%rax,4\),%tmm6 [ ]*[a-f0-9]+:[ ]*62 da 7d 08 4b b4 87 23 01 00 00[ ]+tileloaddt1[ ]+0x123\(%r31,%rax,4\),%tmm6 [ ]*[a-f0-9]+:[ ]*62 da 7e 08 4b b4 87 23 01 00 00[ ]+tilestored[ ]+%tmm6,0x123\(%r31,%rax,4\) +[ ]*[a-f0-9]+:[ ]*62 fa 7d 28 1a 18[ ]+vbroadcastf32x4 \(%r16\),%ymm3 +[ ]*[a-f0-9]+:[ ]*62 fa 7d 28 5a 18[ ]+vbroadcasti32x4 \(%r16\),%ymm3 +[ ]*[a-f0-9]+:[ ]*62 fb 7d 28 19 18 01[ ]+vextractf32x4 \$(0x)?1,%ymm3,\(%r16\) +[ ]*[a-f0-9]+:[ ]*62 fb 7d 28 39 18 01[ ]+vextracti32x4 \$(0x)?1,%ymm3,\(%r16\) +[ ]*[a-f0-9]+:[ ]*62 7b 65 28 18 00 01[ ]+vinsertf32x4 \$(0x)?1,\(%r16\),%ymm3,%ymm8 +[ ]*[a-f0-9]+:[ ]*62 7b 65 28 38 00 01[ ]+vinserti32x4 \$(0x)?1,\(%r16\),%ymm3,%ymm8 [ ]*[a-f0-9]+:[ ]*62 db fd 08 09 30 01[ ]+vrndscalepd \$0x1,\(%r24\),%xmm6 [ ]*[a-f0-9]+:[ ]*62 db 7d 08 08 30 02[ ]+vrndscaleps \$0x2,\(%r24\),%xmm6 [ ]*[a-f0-9]+:[ ]*62 db cd 08 0b 18 03[ ]+vrndscalesd \$0x3,\(%r24\),%xmm6,%xmm3 --- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s @@ -152,6 +152,12 @@ _start: tileloadd 0x123(%r31,%rax,4),%tmm6 tileloaddt1 0x123(%r31,%rax,4),%tmm6 tilestored %tmm6,0x123(%r31,%rax,4) + vbroadcastf128 (%r16),%ymm3 + vbroadcasti128 (%r16),%ymm3 + vextractf128 $1,%ymm3,(%r16) + vextracti128 $1,%ymm3,(%r16) + vinsertf128 $1,(%r16),%ymm3,%ymm8 + vinserti128 $1,(%r16),%ymm3,%ymm8 vroundpd $1,(%r24),%xmm6 vroundps $2,(%r24),%xmm6 vroundsd $3,(%r24),%xmm6,%xmm3 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -1588,7 +1588,9 @@ vandnp, 0x55, AVX, Modrm|Ve vandp, 0x54, AVX, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vblendp, 0x660c | , AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vblendvp, 0x664a | , AVX, Modrm|Vex|Space0F3A|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vbroadcastf128, 0x661a, AVX, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM } +vbroadcastf128, 0x661a, AVX, Modrm|Vex256|Space0F38|VexW0|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM } +// vbroadcastf32x4 in disguise (see vround{p,s}{s,d} comment) +vbroadcastf128, 0x661a, APX_F&AVX512VL, Modrm|EVex256|Space0F38|VexW0|Disp8MemShift=4|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM } vbroadcastsd, 0x6619, AVX, Modrm|Vex256|Space0F38|VexW0|NoSuf, { Qword|Unspecified|BaseIndex, RegYMM } vbroadcastss, 0x6618, AVX, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Dword|Unspecified|BaseIndex, RegXMM|RegYMM } vcmpp, 0xc2/0x, AVX, Modrm||Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM } @@ -1616,7 +1618,9 @@ vdivp, 0x5e, AVX, Modrm|Vex vdivs, 0x5e, AVX, Modrm|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf, { |Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM } vdppd, 0x6641, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM } vdpps, 0x6640, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vextractf128, 0x6619, AVX, Modrm|Vex=2|Space0F3A|VexW=1|NoSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM } +vextractf128, 0x6619, AVX, Modrm|Vex256|Space0F3A|VexW0|NoSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM } +// vextractf32x4 in disguise (see vround{p,s}{s,d} comment) +vextractf128, 0x6619, APX_F&AVX512VL, Modrm|EVex256|Space0F3A|VexW0|Disp8MemShift=4|NoSuf, { Imm8, RegYMM, Xmmword|Unspecified|BaseIndex } vextractps, 0x6617, AVX|AVX512F, Modrm|Vex128|EVex128|Space0F3A|VexWIG|Disp8MemShift=2|NoSuf, { Imm8, RegXMM, Reg32|Unspecified|BaseIndex } vextractps, 0x6617, x64&(AVX|AVX512F), RegMem|Vex128|EVex128|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg64 } vhaddpd, 0x667c, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } @@ -1624,6 +1628,8 @@ vhaddps, 0xf27c, AVX, Modrm|Vex|Space0F| vhsubpd, 0x667d, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vhsubps, 0xf27d, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vinsertf128, 0x6618, AVX, Modrm|Vex256|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegYMM, RegYMM } +// vinsertf32x4 in disguise (see vround{p,s}{s,d} comment) +vinsertf128, 0x6618, APX_F&AVX512VL, Modrm|EVex256|Space0F3A|VexVVVV|VexW0|Disp8MemShift=4|NoSuf, { Imm8, Xmmword|Unspecified|BaseIndex, RegYMM, RegYMM } vinsertps, 0x6621, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM } vlddqu, 0xf2f0, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM } vldmxcsr, 0xae/2, AVX, Modrm|Vex128|Space0F|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex } @@ -1830,7 +1836,9 @@ vpmovzxwq, 0x6634, AVX2|AVX512VL, Modrm| // New AVX2 instructions. -vbroadcasti128, 0x665A, AVX2, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM } +vbroadcasti128, 0x665A, AVX2, Modrm|Vex256|Space0F38|VexW0|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM } +// vbroadcasti32x4 in disguise (see vround{p,s}{s,d} comment) +vbroadcasti128, 0x665a, APX_F&AVX512VL, Modrm|EVex256|Space0F38|VexW0|Disp8MemShift=4|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM } vbroadcastsd, 0x6619, AVX2, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { RegXMM, RegYMM } vbroadcastss, 0x6618, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexW0|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM } vpblendd, 0x6602, AVX2, Modrm|Vex|Space0F3A|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } @@ -1842,8 +1850,12 @@ vpermd, 0x6636, AVX2|AVX512F, Modrm|Vex2 vpermpd, 0x6601, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F3A|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM } vpermps, 0x6616, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM } vpermq, 0x6600, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F3A|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM } -vextracti128, 0x6639, AVX2, Modrm|Vex=2|Space0F3A|VexW=1|NoSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM } +vextracti128, 0x6639, AVX2, Modrm|Vex256|Space0F3A|VexW0|NoSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM } +// vextracti32x4 in disguise (see vround{p,s}{s,d} comment) +vextracti128, 0x6639, APX_F&AVX512VL, Modrm|EVex256|Space0F3A|VexW0|Disp8MemShift=4|NoSuf, { Imm8, RegYMM, Xmmword|Unspecified|BaseIndex } vinserti128, 0x6638, AVX2, Modrm|Vex256|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegYMM, RegYMM } +// vinserti32x4 in disguise (see vround{p,s}{s,d} comment) +vinserti128, 0x6638, APX_F&AVX512VL, Modrm|EVex256|Space0F3A|VexVVVV|VexW0|Disp8MemShift=4|NoSuf, { Imm8, Xmmword|Unspecified|BaseIndex, RegYMM, RegYMM } vpmaskmov, 0x668e, AVX2, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { RegXMM|RegYMM, RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex } vpmaskmov, 0x668c, AVX2, Modrm|Vex|Space0F38|VexVVVV||CheckOperandSize|NoSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM } vpsllv, 0x6647, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV||Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }