From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by sourceware.org (Postfix) with ESMTPS id E62EC3858D28 for ; Mon, 4 Dec 2023 16:30:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E62EC3858D28 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E62EC3858D28 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::333 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701707431; cv=none; b=fS0ZZNe8F2fbtjCJY44ahv7DYEQjsOUmyVmFSYwQyT8ilGnCGXZwfeywtWgOg+YwBQwNnfdknL46Ql5dwDoA+KmycVGdggHYVntAjlmzFT/uNMcMCuvxeJyjVkYgPG2Vcn6slVs9cp0R+xsoex/yfePBo7S+EabSvuh9JwTZv9c= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701707431; c=relaxed/simple; bh=wHbBOQ18pSyw8crmsJuioG9S72hlX9ynd3QDIRLtxFo=; h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:To:From; b=wcV4Ha2jruyUups5VZExPq9lc1GU6kaXjFuWzie7QZqXL0fin/INpNVdOJ8Ysuvx7rbv2uu/+IK3veygtZa3W2eJi92SElBkL/DYEqGlpstYzYar1wtZkfEJIQofipQ2sYWiQ+806SfnUE5MvMFcMpxhzlFgT714uErBhwizxFM= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wm1-x333.google.com with SMTP id 5b1f17b1804b1-40c0a11a914so13346895e9.2 for ; Mon, 04 Dec 2023 08:30:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1701707426; x=1702312226; darn=sourceware.org; h=content-transfer-encoding:in-reply-to:autocrypt:from:references:cc :to:content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=/muX3/IPaVGfpdt3tNAVpZj0EQmBp1UojawqSwe/T2M=; b=RDknjL0c7nMHR29XV21Y4ySWeDXQLnN8LqJhznIhUYLo3hvfnjr2/K3HL+yBJO3blL kVYIPbJ+tnlATk3ux/r0wFz9vW2fIgG88NzwehF45eb65T8MaAGF3POQuHxetSpbxUw8 HaOl6V00V0fM2pwyWQp3Co5OribD+8aY5gLXe9XgQXD2Pt7Yaz4XsvbSprpMRbdbRC5P z2yoxxzQg/hgfunLzAPVVjQNpFUXCAIJ/pnUIqjxQ/S2uzGdjtuyznTsxKTa+ubvjUE3 0SKIlOGhqBv2V8+pGSx8O2AKyIXXy4QVt4g/OtDJcnIk6EHSDnm69d9LrcMCHndvF1fs GR3A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701707426; x=1702312226; h=content-transfer-encoding:in-reply-to:autocrypt:from:references:cc :to:content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=/muX3/IPaVGfpdt3tNAVpZj0EQmBp1UojawqSwe/T2M=; b=nnWOvpnfaOOeHn87Srdd1ZKxbgWa7Xyv6gXS7luB47+Wxgue2WzuL9CFFoffNHf6jh o0d7rs+4Ip4x1dnX/DKZwogalBULmmCZ5VKcZG/7LIUBDZv2f9qCMY2eEYeHTI1g/Xy5 facCKXm68oEc1BGJDY2UtvxpIJF/gRo0Hu+oyAqbjFGyhslXOZuPKJBvD3oZm5/TBd27 zQFtrmPvq8u5xJvwIYT0EohfatDV0pech9SbCbQ42bhCpmKE0RUb8WFhdTan1qNXjOAn gKpynCvhFQYK39W0X8NjgjxRQtuEe445MOmdKpt60ipeb8N6MfIjW4GhUISXNnU/paUl jdag== X-Gm-Message-State: AOJu0Yx7lmOi2iY3dtQQLprHdCzohk4WKCgveSMlWRnLLsdSyRAw3jFC qloHdwr+hUgCyk8781n7rF6Q X-Google-Smtp-Source: AGHT+IEyaE1qwiA9WPf4LSM6xnEITd89IDEyvRRt8tUCOrLO2ZORlz2gTv/Xk3b5uivy2+Paa6J8lA== X-Received: by 2002:a05:600c:4ec8:b0:40b:5e1e:b3be with SMTP id g8-20020a05600c4ec800b0040b5e1eb3bemr2693502wmq.60.1701707426591; Mon, 04 Dec 2023 08:30:26 -0800 (PST) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. [37.24.206.209]) by smtp.gmail.com with ESMTPSA id jg23-20020a05600ca01700b0040b30be6244sm15588830wmb.24.2023.12.04.08.30.26 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 04 Dec 2023 08:30:26 -0800 (PST) Message-ID: <14e62b44-2c5b-4a3b-a66a-cffa8f33578c@suse.com> Date: Mon, 4 Dec 2023 17:30:25 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v3 2/9] Support APX GPR32 with rex2 prefix Content-Language: en-US To: "Cui, Lili" Cc: hongjiu.lu@intel.com, binutils@sourceware.org References: <20231124070213.3886483-1-lili.cui@intel.com> <20231124070213.3886483-2-lili.cui@intel.com> From: Jan Beulich Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: <20231124070213.3886483-2-lili.cui@intel.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3026.6 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 24.11.2023 08:02, Cui, Lili wrote: > @@ -3865,6 +3873,12 @@ is_any_vex_encoding (const insn_template *t) > return t->opcode_modifier.vex || t->opcode_modifier.evex; > } > > +static INLINE bool > +is_apx_rex2_encoding (void) > +{ > + return i.rex2 || i.rex2_encoding; > +} This function is used just once. Do we really need it? Or else why don't you use it near the end of md_assemble()? > @@ -4120,6 +4134,21 @@ build_evex_prefix (void) > i.vex.bytes[3] |= i.mask.reg->reg_num; > } > > +/* Build (2 bytes) rex2 prefix. > + | D5h | > + | m | R4 X4 B4 | W R X B | > +*/ > +static void > +build_rex2_prefix (void) > +{ > + /* Rex2 reuses i.vex because they handle i.tm.opcode_space the same. */ How do they handle it the same? (Also I don't think this is useful as a code comment; it instead belongs in the description imo.) > + i.vex.length = 2; > + i.vex.bytes[0] = 0xd5; > + /* For the W R X B bits, the variables of rex prefix will be reused. */ > + i.vex.bytes[1] = ((i.tm.opcode_space << 7) > + | (i.rex2 << 4) | i.rex); > +} > + > static void > process_immext (void) > { > @@ -4385,12 +4414,16 @@ optimize_encoding (void) > i.suffix = 0; > /* Convert to byte registers. */ > if (i.types[1].bitfield.word) > - j = 16; > - else if (i.types[1].bitfield.dword) > + /* There are 40 8-bit registers. */ > j = 32; > + else if (i.types[1].bitfield.dword) > + /* 32 8-bit registers + 32 16-bit registers. */ > + j = 64; > else > - j = 48; > - if (!(i.op[1].regs->reg_flags & RegRex) && base_regnum < 4) > + /* 32 8-bit registers + 32 16-bit registers > + + 32 32-bit registers. */ > + j = 96; > + if (!(i.op[1].regs->reg_flags & (RegRex | RegRex2)) && base_regnum < 4) > j += 8; > i.op[1].regs -= j; > } I did comment on, in particular, the 8-bit register counts before. Afaict the comments above are nevertheless unchanged and hence still not really correct. > @@ -5576,6 +5615,13 @@ md_assemble (char *line) > return; > } > > + /* Check for explicit REX2 prefix. */ > + if (i.rex2_encoding) > + { > + as_bad (_("REX2 prefix invalid with `%s'"), insn_name (&i.tm)); > + return; > + } Again I'm pretty sure I pointed out before that i.rex2_encoding reflects use of {rex2}. Which then the error message should correctly refer to. > @@ -5615,11 +5661,12 @@ md_assemble (char *line) > && (i.op[1].regs->reg_flags & RegRex64) != 0) > || (((i.types[0].bitfield.class == Reg && i.types[0].bitfield.byte) > || (i.types[1].bitfield.class == Reg && i.types[1].bitfield.byte)) > - && i.rex != 0)) > + && (i.rex != 0 || i.rex2 != 0))) > { > int x; > > - i.rex |= REX_OPCODE; > + if (!i.rex2) > + i.rex |= REX_OPCODE; > for (x = 0; x < 2; x++) > { > /* Look for 8 bit operand that uses old registers. */ > @@ -5630,7 +5677,7 @@ md_assemble (char *line) > /* In case it is "hi" register, give up. */ > if (i.op[x].regs->reg_num > 3) > as_bad (_("can't encode register '%s%s' in an " > - "instruction requiring REX prefix."), > + "instruction requiring REX/REX2 prefix."), > register_prefix, i.op[x].regs->reg_name); > > /* Otherwise it is equivalent to the extended register. > @@ -5642,11 +5689,11 @@ md_assemble (char *line) > } > } > > - if (i.rex == 0 && i.rex_encoding) > + if ((i.rex == 0 && i.rex_encoding) || (i.rex2 == 0 && i.rex2_encoding)) Doesn't this want to be if (i.rex == 0 && i.rex2 == 0 && (i.rex_encoding || i.rex2_encoding)) ? > { > /* Check if we can add a REX_OPCODE byte. Look for 8 bit operand > that uses legacy register. If it is "hi" register, don't add > - the REX_OPCODE byte. */ > + rex and rex2 prefix. */ > int x; > for (x = 0; x < 2; x++) > if (i.types[x].bitfield.class == Reg > @@ -5656,6 +5703,7 @@ md_assemble (char *line) > { > gas_assert (!(i.op[x].regs->reg_flags & RegRex)); > i.rex_encoding = false; > + i.rex2_encoding = false; > break; > } > > @@ -5663,7 +5711,13 @@ md_assemble (char *line) > i.rex = REX_OPCODE; > } > > - if (i.rex != 0) > + if (i.rex2 != 0 || i.rex2_encoding) > + { > + build_rex2_prefix (); > + /* The individual REX.RXBW bits got consumed. */ > + i.rex &= REX_OPCODE; > + } > + else if (i.rex != 0) > add_prefix (REX_OPCODE | i.rex); > > insert_lfence_before (); > @@ -5834,6 +5888,10 @@ parse_insn (const char *line, char *mnemonic, bool prefix_only) > /* {rex} */ > i.rex_encoding = true; > break; > + case Prefix_REX2: > + /* {rex2} */ > + i.rex2_encoding = true; > + break; > case Prefix_NoOptimize: > /* {nooptimize} */ > i.no_optimize = true; > @@ -6971,6 +7029,45 @@ VEX_check_encoding (const insn_template *t) > return 0; > } > > +/* Check if Egprs operands are valid for the instruction. */ > + > +static int > +check_EgprOperands (const insn_template *t) > +{ > + if (!t->opcode_modifier.noegpr) > + return 0; > + > + for (unsigned int op = 0; op < i.operands; op++) > + { > + if (i.types[op].bitfield.class != Reg > + /* Special case for (%dx) while doing input/output op */ > + || i.input_output_operand) Didn't we agree that this extra condition isn't necessary, once the producer site correctly updates all state (which was supposed to be done in a small prereq patch)? > @@ -7107,7 +7204,9 @@ match_template (char mnem_suffix) > /* Do not verify operands when there are none. */ > if (!t->operands) > { > - if (VEX_check_encoding (t)) > + /* When there are no operands, we still need to use the > + check_EgprOperands function to check whether {rex2} is valid. */ > + if (VEX_check_encoding (t) || check_EgprOperands (t)) As before imo either the function name wants changing (so it becomes reasonable to use here, without the need for a comment explaining the oddity), or you simply open-code the sole check that is needed here (afaict: t->opcode_modifier.noegpr && i.rex2_encoding). > @@ -7443,6 +7542,13 @@ match_template (char mnem_suffix) > continue; > } > > + /* Check if EGRPS operands(r16-r31) are valid. */ EGPR? > --- a/gas/doc/c-i386.texi > +++ b/gas/doc/c-i386.texi > @@ -217,6 +217,7 @@ accept various extension mnemonics. For example, > @code{avx10.1/256}, > @code{avx10.1/128}, > @code{user_msr}, > +@code{apx_f}, > @code{amx_int8}, > @code{amx_bf16}, > @code{amx_fp16}, > @@ -983,6 +984,9 @@ Different encoding options can be specified via pseudo prefixes: > instructions (x86-64 only). Note that this differs from the @samp{rex} > prefix which generates REX prefix unconditionally. > > +@item > +@samp{@{rex2@}} -- encode with REX2 prefix This isn't in line with what's said for {rex}. Iirc we were in agreement that we want both to behave consistently. In which case documentation also needs to describe them consistently. > --- /dev/null > +++ b/gas/testsuite/gas/i386/x86-64-apx-rex2.s > @@ -0,0 +1,86 @@ > +# Check 64bit instructions with rex2 prefix encoding > + > + .allow_index_reg > + .text > +_start: > + test $0x7, %r24b > + test $0x7, %r24d > + test $0x7, %r24 > + test $0x7, %r24w > +## REX2.M bit > + imull %eax, %r15d > + imull %eax, %r16d > + punpckldq (%r18), %mm2 > +## REX2.R4 bit > + leal (%rax), %r16d > + leal (%rax), %r17d > + leal (%rax), %r18d > + leal (%rax), %r19d > + leal (%rax), %r20d > + leal (%rax), %r21d > + leal (%rax), %r22d > + leal (%rax), %r23d > + leal (%rax), %r24d > + leal (%rax), %r25d > + leal (%rax), %r26d > + leal (%rax), %r27d > + leal (%rax), %r28d > + leal (%rax), %r29d > + leal (%rax), %r30d > + leal (%rax), %r31d > +## REX2.X4 bit > + leal (,%r16), %eax > + leal (,%r17), %eax > + leal (,%r18), %eax > + leal (,%r19), %eax > + leal (,%r20), %eax > + leal (,%r21), %eax > + leal (,%r22), %eax > + leal (,%r23), %eax > + leal (,%r24), %eax > + leal (,%r25), %eax > + leal (,%r26), %eax > + leal (,%r27), %eax > + leal (,%r28), %eax > + leal (,%r29), %eax > + leal (,%r30), %eax > + leal (,%r31), %eax > +## REX.B4 bit Further up you properly say REX2. Here and below it's only REX? > --- a/gas/testsuite/gas/i386/x86-64-pseudos-bad.s > +++ b/gas/testsuite/gas/i386/x86-64-pseudos-bad.s > @@ -5,3 +5,61 @@ pseudos: > {rex} vmovaps %xmm7,%xmm2 > {rex} vmovaps %xmm17,%xmm2 > {rex} rorx $7,%eax,%ebx > + {rex2} vmovaps %xmm7,%xmm2 > + {rex2} xsave (%rax) > + {rex2} xsaves (%ecx) > + {rex2} xsaves64 (%ecx) > + {rex2} xsavec (%ecx) > + {rex2} xrstors (%ecx) > + {rex2} xrstors64 (%ecx) > + > + #All opcodes in the row 0xa* prefixed REX2 are illegal. > + #{rex2} test (0xa8) is a special case, it will remap to test (0xf6) > + {rex2} mov 0x90909090,%al > + {rex2} movabs 0x1,%al > + {rex2} cmpsb %es:(%edi),%ds:(%esi) > + {rex2} lodsb > + {rex2} lods %ds:(%esi),%al > + {rex2} lodsb (%esi) > + {rex2} movs > + {rex2} movs (%esi), (%edi) > + {rex2} scasl > + {rex2} scas %es:(%edi),%eax > + {rex2} scasb (%edi) > + {rex2} stosb > + {rex2} stosb (%edi) > + {rex2} stos %eax,%es:(%edi) > + > + #All opcodes in the row 0x7* prefixed REX2 are illegal. This also covers map 1 row 8, doesn't it? > + {rex2} jo .+2-0x70 > + {rex2} jno .+2-0x70 > + {rex2} jb .+2-0x70 > + {rex2} jae .+2-0x70 > + {rex2} je .+2-0x70 > + {rex2} jne .+2-0x70 > + {rex2} jbe .+2-0x70 > + {rex2} ja .+2-0x70 > + {rex2} js .+2-0x70 > + {rex2} jns .+2-0x70 > + {rex2} jp .+2-0x70 > + {rex2} jnp .+2-0x70 > + {rex2} jl .+2-0x70 > + {rex2} jge .+2-0x70 > + {rex2} jle .+2-0x70 > + {rex2} jg .+2-0x70 > + > + #All opcodes in the row 0x7* prefixed REX2 are illegal. > + {rex2} in $0x90,%al > + {rex2} in $0x90 > + {rex2} out $0x90,%al > + {rex2} out $0x90 > + {rex2} jmp *%eax > + {rex2} loop foo Isn't this row 0xE? > + #All opcodes in the row 0xf3* prefixed REX2 are illegal. This comment continues to be confusing: 0xf3 is a REP prefix. Perhaps best to either say "map 1" and omit the "f" or at least write 0x0f3* or slightly better 0x0f 0x3*. > --- a/gas/testsuite/gas/i386/x86-64-pseudos.s > +++ b/gas/testsuite/gas/i386/x86-64-pseudos.s > @@ -360,6 +360,19 @@ _start: > {rex} movaps (%r8),%xmm2 > {rex} phaddw (%rcx),%mm0 > {rex} phaddw (%r8),%mm0 > + {rex2} mov %al,%ah > + {rex2} shl %cl, %eax > + {rex2} cmp %cl, %dl > + {rex2} mov $1, %bl > + {rex2} movl %eax,%ebx > + {rex2} movl %eax,%r14d > + {rex2} movl %eax,(%r8) > + {rex2} movaps %xmm7,%xmm2 > + {rex2} movaps %xmm7,%xmm12 > + {rex2} movaps (%rcx),%xmm2 > + {rex2} movaps (%r8),%xmm2 > + {rex2} pmullw %mm0,%mm6 > + > > movb (%rbp),%al > {disp8} movb (%rbp),%al No double blank lines please. > --- a/opcodes/i386-dis.c > +++ b/opcodes/i386-dis.c Disassembler comments (if any) in a separate (later) mail again. > --- a/opcodes/i386-gen.c > +++ b/opcodes/i386-gen.c > @@ -275,6 +275,8 @@ static const dependency isa_dependencies[] = > "64" }, > { "USER_MSR", > "64" }, > + { "APX_F", > + "XSAVE|64" }, > }; > > /* This array is populated as process_i386_initializers() walks cpu_flags[]. */ > @@ -397,6 +399,7 @@ static bitfield cpu_flags[] = > BITFIELD (FRED), > BITFIELD (LKGS), > BITFIELD (USER_MSR), > + BITFIELD (APX_F), > BITFIELD (MWAITX), > BITFIELD (CLZERO), > BITFIELD (OSPKE), > @@ -486,6 +489,7 @@ static bitfield opcode_modifiers[] = > BITFIELD (ATTSyntax), > BITFIELD (IntelSyntax), > BITFIELD (ISA64), > + BITFIELD (NoEgpr), > }; > > #define CLASS(n) #n, n > @@ -1072,10 +1076,48 @@ get_element_size (char **opnd, int lineno) > return elem_size; > } > > +static bool > +rex2_disallowed (const unsigned long long opcode, unsigned int space, > + const char *cpu_flags) > +{ > + /* Prefixing XSAVE* and XRSTOR* instructions with REX2 triggers #UD. */ > + if (strcmp (cpu_flags, "XSAVES") >= 0 > + || strcmp (cpu_flags, "XSAVEC") >= 0 > + || strcmp (cpu_flags, "Xsave") >= 0 > + || strcmp (cpu_flags, "Xsaveopt") >= 0) > + return true; Wasn't this intended to be dropped, being redundant with the opcode table attributes? > + /* All opcodes listed map0 0x4*, 0x7*, 0xa*, 0xe* and map1 0x3*, 0x8* > + are reserved under REX2 and triggers #UD when prefixed with REX2 */ > + if (space == 0) > + switch (opcode >> 4) Both here and ... > + { > + case 0x4: > + case 0x7: > + case 0xA: > + case 0xE: > + return true; > + default: > + return false; > + } > + > + if (space == SPACE_0F) > + switch (opcode >> 4) ... here, don't you also need to mask off further bits? There are quite a few opcodes which have a kind-of ModR/M byte encoded directly in the opcode, for example. > + { > + case 0x3: > + case 0x8: > + return true; > + default: > + return false; > + } > + > + return false; > +} > + > static void > process_i386_opcode_modifier (FILE *table, char *mod, unsigned int space, > unsigned int prefix, const char *extension_opcode, > - char **opnd, int lineno) > + char **opnd, int lineno, bool rex2_disallowed) > { > char *str, *next, *last; > bitfield modifiers [ARRAY_SIZE (opcode_modifiers)]; > @@ -1202,6 +1244,12 @@ process_i386_opcode_modifier (FILE *table, char *mod, unsigned int space, > || modifiers[SAE].value)) > modifiers[EVex].value = EVEXDYN; > > + /* Vex, legacy map2 and map3 and rex2_disallowed do not support EGPR. > + For template supports both Vex and EVex allowing EGPR. */ "Templates supporting both Vex and EVex allow EGPR." > + if ((modifiers[Vex].value || space > SPACE_0F || rex2_disallowed) > + && !modifiers[EVex].value) > + modifiers[NoEgpr].value = 1; > + > output_opcode_modifier (table, modifiers, ARRAY_SIZE (modifiers)); > } > > @@ -1425,8 +1473,11 @@ output_i386_opcode (FILE *table, const char *name, char *str, > ident, 2 * (int)length, opcode, end, i); > free (ident); > > + /* Add some specilal handle for current entry. */ > + bool has_special_handle = rex2_disallowed (opcode, space, cpu_flags); The local variable (if one is needed in the first place) wants naming as usefully as the function now is named. Similarly the comment would want improving alonmg those lines. > process_i386_opcode_modifier (table, opcode_modifier, space, prefix, > - extension_opcode, operand_types, lineno); > + extension_opcode, operand_types, lineno, > + has_special_handle); > > process_i386_cpu_flag (table, cpu_flags, NULL, ",", " ", lineno, CpuMax); > > --- a/opcodes/i386-opc.tbl > +++ b/opcodes/i386-opc.tbl > @@ -138,6 +138,7 @@ > #define Vsz256 Vsz=VSZ256 > #define Vsz512 Vsz=VSZ512 > > + > // The EVEX purpose of StaticRounding appears only together with SAE. Re-use > // the bit to mark commutative VEX encodings where swapping the source > // operands may allow to switch from 3-byte to 2-byte VEX encoding. Stray change (in general please avoid introducing double blank lines, as those make patch context less useful). Jan