From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 88054 invoked by alias); 11 Feb 2020 10:26:20 -0000 Mailing-List: contact binutils-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: binutils-owner@sourceware.org Received: (qmail 87974 invoked by uid 89); 11 Feb 2020 10:26:19 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-9.5 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_2,GIT_PATCH_3,SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: mx2.suse.de Received: from mx2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 11 Feb 2020 10:26:14 +0000 Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id BF14FADFF; Tue, 11 Feb 2020 10:26:12 +0000 (UTC) Subject: [PATCH v5 4/5] x86: correct VFPCLASSP{S,D} operand size handling From: Jan Beulich To: "binutils@sourceware.org" Cc: "H.J. Lu" References: <1e1b8eba-93ff-39ed-460a-a922d12af27e@suse.com> Message-ID: <2668485f-205e-21d9-ef9f-a20c6979ad3e@suse.com> Date: Tue, 11 Feb 2020 10:26:00 -0000 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.4.2 MIME-Version: 1.0 In-Reply-To: <1e1b8eba-93ff-39ed-460a-a922d12af27e@suse.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-IsSubscribed: yes X-SW-Source: 2020-02/txt/msg00169.txt.bz2 With AVX512VL disabled (e.g. when writing code for the Knights family of processors) these insns aren't ambiguous when used with a memory source, and hence should be accepted without suffix or operand size specifier. When AVX512VL is enabled, to be consistent with this as well as other ambiguous operand size handling it would seem better to just warn about the ambiguity in AT&T mode, and still default to 512-bit operands (on the assumption that the code may have been written without AVX512VL in mind yet), but it was requested to leave AT&T syntax mode alone here. gas/ 2020-02-XX Jan Beulich * config/tc-i386.c (avx512): New (at file scope), moved from (check_VecOperands): ... here. (process_suffix): Add [XYZ]MMword operand size handling. * testsuite/gas/i386/avx512dq-inval.s: Add VFPCLASS tests. * testsuite/gas/i386/noavx512-2.s: Add Intel syntax VFPCLASS tests. * testsuite/gas/i386/avx512dq-inval.l, testsuite/gas/i386/noavx512-2.l: Adjust expectations. opcodes/ 2020-02-XX Jan Beulich * i386-opc.tbl (vfpclasspd, vfpclassps): Add Intel sytax form with Unspecified, making the present one AT&T syntax only. * i386-tbl.h: Re-generate. --- v5: Re-base. v4: Restrict to just Intel syntax mode. Re-base. v3: Re-base. --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -1840,6 +1840,8 @@ cpu_flags_and_not (i386_cpu_flags x, i38 return x; } +static const i386_cpu_flags avx512 = CPU_ANY_AVX512F_FLAGS; + #define CPU_FLAGS_ARCH_MATCH 0x1 #define CPU_FLAGS_64BIT_MATCH 0x2 @@ -5352,7 +5354,6 @@ check_VecOperands (const insn_template * { unsigned int op; i386_cpu_flags cpu; - static const i386_cpu_flags avx512 = CPU_ANY_AVX512F_FLAGS; /* Templates allowing for ZMMword as well as YMMword and/or XMMword for any one operand are implicity requiring AVX512VL support if the actual @@ -6447,7 +6448,8 @@ process_suffix (void) /* Accept FLDENV et al without suffix. */ && (i.tm.opcode_modifier.no_ssuf || i.tm.opcode_modifier.floatmf)) { - unsigned int suffixes; + unsigned int suffixes, evex = 0; + i386_cpu_flags cpu; suffixes = !i.tm.opcode_modifier.no_bsuf; if (!i.tm.opcode_modifier.no_wsuf) @@ -6461,7 +6463,55 @@ process_suffix (void) if (flag_code == CODE_64BIT && !i.tm.opcode_modifier.no_qsuf) suffixes |= 1 << 5; - /* Are multiple suffixes allowed? */ + /* For [XYZ]MMWORD operands inspect operand sizes. */ + cpu = cpu_flags_and (i.tm.cpu_flags, avx512); + if (!cpu_flags_all_zero (&cpu) && !i.broadcast) + { + unsigned int op; + + for (op = 0; op < i.tm.operands; ++op) + { + if (!cpu_arch_flags.bitfield.cpuavx512vl) + { + if (i.tm.operand_types[op].bitfield.ymmword) + i.tm.operand_types[op].bitfield.xmmword = 0; + if (i.tm.operand_types[op].bitfield.zmmword) + i.tm.operand_types[op].bitfield.ymmword = 0; + if (!i.tm.opcode_modifier.evex + || i.tm.opcode_modifier.evex == EVEXDYN) + i.tm.opcode_modifier.evex = EVEX512; + } + + if (i.tm.operand_types[op].bitfield.xmmword + + i.tm.operand_types[op].bitfield.ymmword + + i.tm.operand_types[op].bitfield.zmmword < 2) + continue; + + /* Any properly sized operand disambiguates the insn. */ + if (i.types[op].bitfield.xmmword + || i.types[op].bitfield.ymmword + || i.types[op].bitfield.zmmword) + { + suffixes &= ~(7 << 6); + evex = 0; + break; + } + + if ((i.flags[op] & Operand_Mem) + && i.tm.operand_types[op].bitfield.unspecified) + { + if (i.tm.operand_types[op].bitfield.xmmword) + suffixes |= 1 << 6; + if (i.tm.operand_types[op].bitfield.ymmword) + suffixes |= 1 << 7; + if (i.tm.operand_types[op].bitfield.zmmword) + suffixes |= 1 << 8; + evex = EVEX512; + } + } + } + + /* Are multiple suffixes / operand sizes allowed? */ if (suffixes & (suffixes - 1)) { if (intel_syntax @@ -6491,6 +6541,8 @@ process_suffix (void) || (i.tm.base_opcode == 0x63 && i.tm.cpu_flags.bitfield.cpu64)) /* handled below */; + else if (evex) + i.tm.opcode_modifier.evex = evex; else if (flag_code == CODE_16BIT) i.suffix = WORD_MNEM_SUFFIX; else if (!i.tm.opcode_modifier.no_lsuf) --- a/gas/testsuite/gas/i386/avx512dq-inval.l +++ b/gas/testsuite/gas/i386/avx512dq-inval.l @@ -11,3 +11,7 @@ .*:[0-9]*: Error:.* `vpinsrq' .* .*:[0-9]*: Error:.* `vpinsrq' .* .*:[0-9]*: Error:.* `vpinsrq' .* +.*:[0-9]*: Error:.* `vfpclasspd' +.*:[0-9]*: Error:.* `vfpclassps' +.*:[0-9]*: Error:.* `vfpclasspd' +.*:[0-9]*: Error:.* `vfpclassps' --- a/gas/testsuite/gas/i386/avx512dq-inval.s +++ b/gas/testsuite/gas/i386/avx512dq-inval.s @@ -1,4 +1,4 @@ -# Check AVX512DQ instructions not to be accepted outside of 64-bit mode +# Check AVX512DQ instructions not to be accepted (in part only outside of 64-bit mode) .text _start: @@ -20,3 +20,10 @@ _start: vpinsrq xmm0, xmm0, qword ptr [eax], 0 {evex} vpinsrq xmm0, xmm0, qword ptr [eax], 0 + vfpclasspd k0, [eax], 0 + vfpclassps k0, [eax], 0 + + .att_syntax prefix + + vfpclasspd $0, (%eax), %k0 + vfpclassps $0, (%eax), %k0 --- a/gas/testsuite/gas/i386/noavx512-2.l +++ b/gas/testsuite/gas/i386/noavx512-2.l @@ -101,5 +101,10 @@ GAS LISTING .* [ ]*50[ ]+F5 [ ]*51[ ]+\?\?\?\? 660F58F4 addpd %xmm4, %xmm6 [ ]*52[ ]+ -[ ]*53[ ]+\?\?\?\? 0F1F00 \.p2align 4 +[ ]*[1-9][0-9]*[ ]+\.intel_syntax noprefix +[ ]*[1-9][0-9]*[ ]+\?\?\?\? 62F3FD48 vfpclasspd k0, \[eax], 0 +[ ]*[1-9][0-9]*[ ]+660000 +[ ]*[1-9][0-9]*[ ]+\?\?\?\? 62F37D48 vfpclassps k0, \[eax], 0 +[ ]*[1-9][0-9]*[ ]+660000 +[ ]*[1-9][0-9]*[ ]+ #pass --- a/gas/testsuite/gas/i386/noavx512-2.s +++ b/gas/testsuite/gas/i386/noavx512-2.s @@ -50,4 +50,8 @@ pabsb %xmm5, %xmm6 addpd %xmm4, %xmm6 + .intel_syntax noprefix + vfpclasspd k0, [eax], 0 + vfpclassps k0, [eax], 0 + .p2align 4 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -4504,12 +4504,14 @@ vextracti64x2, 3, 0x6639, None, 1, CpuAV vinsertf64x2, 4, 0x6618, None, 1, CpuAVX512DQ, Modrm|Masking=3|VexOpcode=2|VexVVVV=1|VexW=2|Disp8MemShift=4|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM } vinserti64x2, 4, 0x6638, None, 1, CpuAVX512DQ, Modrm|Masking=3|VexOpcode=2|VexVVVV=1|VexW=2|Disp8MemShift=4|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM } -vfpclasspd, 3, 0x6666, None, 1, CpuAVX512DQ, Modrm|Masking=2|VexOpcode=2|VexW=2|Broadcast|Disp8ShiftVL|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|RegYMM|RegZMM|Qword|BaseIndex, RegMask } +vfpclasspd, 3, 0x6666, None, 1, CpuAVX512DQ, Modrm|Masking=2|VexOpcode=2|VexW=2|Broadcast|Disp8ShiftVL|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Imm8, RegXMM|RegYMM|RegZMM|Qword|BaseIndex, RegMask } +vfpclasspd, 3, 0x6666, None, 1, CpuAVX512DQ, Modrm|Masking=2|VexOpcode=2|VexW=2|Broadcast|Disp8ShiftVL|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Imm8, RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegMask } vfpclasspdz, 3, 0x6666, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=2|VexOpcode=2|VexW=2|Broadcast|Disp8MemShift=6|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegZMM|Qword|Unspecified|BaseIndex, RegMask } vfpclasspdx, 3, 0x6666, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=2|VexOpcode=2|VexW=2|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Qword|Unspecified|BaseIndex, RegMask } vfpclasspdy, 3, 0x6666, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=2|VexOpcode=2|VexW=2|Broadcast|Disp8MemShift=5|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegYMM|Qword|Unspecified|BaseIndex, RegMask } -vfpclassps, 3, 0x6666, None, 1, CpuAVX512DQ, Modrm|Masking=2|VexOpcode=2|VexW=1|Broadcast|Disp8ShiftVL|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|RegYMM|RegZMM|Dword|BaseIndex, RegMask } +vfpclassps, 3, 0x6666, None, 1, CpuAVX512DQ, Modrm|Masking=2|VexOpcode=2|VexW=1|Broadcast|Disp8ShiftVL|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|ATTSyntax, { Imm8, RegXMM|RegYMM|RegZMM|Dword|BaseIndex, RegMask } +vfpclassps, 3, 0x6666, None, 1, CpuAVX512DQ, Modrm|Masking=2|VexOpcode=2|VexW=1|Broadcast|Disp8ShiftVL|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|IntelSyntax, { Imm8, RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegMask } vfpclasspsz, 3, 0x6666, None, 1, CpuAVX512DQ, Modrm|EVex=1|Masking=2|VexOpcode=2|VexW=1|Broadcast|Disp8MemShift=6|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegZMM|Dword|Unspecified|BaseIndex, RegMask } vfpclasspsx, 3, 0x6666, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=2|Masking=2|VexOpcode=2|VexW=1|Broadcast|Disp8MemShift=4|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegXMM|Dword|Unspecified|BaseIndex, RegMask } vfpclasspsy, 3, 0x6666, None, 1, CpuAVX512DQ|CpuAVX512VL, Modrm|EVex=3|Masking=2|VexOpcode=2|VexW=1|Broadcast|Disp8MemShift=5|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Imm8, RegYMM|Dword|Unspecified|BaseIndex, RegMask }