From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-x332.google.com (mail-wm1-x332.google.com [IPv6:2a00:1450:4864:20::332]) by sourceware.org (Postfix) with ESMTPS id BEC7E3858D33 for ; Fri, 22 Dec 2023 13:08:03 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BEC7E3858D33 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org BEC7E3858D33 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::332 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1703250486; cv=none; b=QoEovau5H5Kc19Uj+rY4/0zrSwTtsvt88bGfaB6nwG5XqtU90Xvrd8uY4+KiqRtsa7JnSGnjMzalnVRdDRsgIybOQ8gQ1lpqFDFubva+uwWlVUNB2er5k9qIU9qyRDI+nD1W7x7P7S03HfcDHMpgPpciJr+HhTl2arfE+RWfVyA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1703250486; c=relaxed/simple; bh=2Kc0Hp0d3JoJBqdBP0L+roo/35Gck0jTWf8yBMzxitU=; h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:To:From; b=nX6FmTeXA4KUS3f8oNBprlMzJ3WHMNYgy/AO3CCVkbWYCrOId3/s/Jq3+XmO+JpfXNOt6qtS/ZQ8zW4ObAiYdE0ic2/u8JOUsrnsDbjs01mWmP4A3QDTiSSVESOUMaYs/T/5l1H3RhqTMftL8td9xNwIfaXOZHB8t9xiurDwYSA= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wm1-x332.google.com with SMTP id 5b1f17b1804b1-40d22d3e751so17554175e9.1 for ; Fri, 22 Dec 2023 05:08:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1703250482; x=1703855282; darn=sourceware.org; h=content-transfer-encoding:in-reply-to:autocrypt:from:references:cc :to:content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=IJUrLNbL08n5/cLPIi0aUH6VBSEh0BOX8LEBl3blPcY=; b=Erow1VuALO8VclJIh3HcHQ90flzqq5P/S8e62RP68/smrxasNj7HZQOv+1+ECCJNls h+6jt/Fe7BXwJUiyZ4zU3an2yWMK2/gltwUcqCLES7O7DfU09kNjDrUxgM/36FRS3txa SnFicBfQ+nu1TOEdmkEQFcEWnmyCWN6iF2iJvPERsAyr89tIC9y8npoLzENgKgtZNgoq u9QLcikLOv1tGq6+D+JAFezZDlzV/6kKQywYgrPQnoxfi+OWXUhZA53jLD3K8zuzkJM9 xPN9l9ZodTs6NeiFb5+4ndVkatbkv1dkqToAvACNJX6DsTiTLMQcOkwksN+NuhrdSh+g snxw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1703250482; x=1703855282; h=content-transfer-encoding:in-reply-to:autocrypt:from:references:cc :to:content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=IJUrLNbL08n5/cLPIi0aUH6VBSEh0BOX8LEBl3blPcY=; b=K0nd8qgqNP8ZEDO5iuhB483oYAlHC4sRY2/O5FqgEktZVpJE2O9rRR60O0hp6efugr /8n9sln1rAckYokHDarKBy3KeZ/1DeYN49yHZT3blJmSTcjpbgQ2tGOlGiloJFEmrz6G Cx+RiAy/61u+DJnZa6XtTuY3lPFkfjD1VEd4bPSCJHqAWuagBMqAV1HjCU8af3Yoqra5 n3mkmIOkPie9mHmyX0sw8rvGvDWgsTII8VJRtv6Ck6mwe/6PYxvnHsvDPJqiXbaUPsed S6EVYyo/M6juzmmzHR46u+3Z5vLvfV/5Y0jiL0TSeZPFqlk5YY3LyntwXbQas2R1pEU6 4mSA== X-Gm-Message-State: AOJu0YwCPfhcP39HAaf3dQDnPieI6edt5nLKNNpFMPX+gPXzgkCpr54s y8M5tNpmyZTaMmOkaEcHlZHFhndogfA8RC+EuDBSysnRnKwA X-Google-Smtp-Source: AGHT+IEQNdba54cDs2jDt9j3XApaw5y7bgXdj19sQKO0KqXz7kuXCkDPQYtjkmn78VnflNi87s7fDA== X-Received: by 2002:a05:600c:1e89:b0:40b:5e4a:2368 with SMTP id be9-20020a05600c1e8900b0040b5e4a2368mr725616wmb.106.1703250482448; Fri, 22 Dec 2023 05:08:02 -0800 (PST) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. [37.24.206.209]) by smtp.gmail.com with ESMTPSA id q17-20020adfcd91000000b003362d0eefd3sm4247049wrj.20.2023.12.22.05.08.02 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 22 Dec 2023 05:08:02 -0800 (PST) Message-ID: Date: Fri, 22 Dec 2023 14:08:01 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v4 1/9] Support APX GPR32 with rex2 prefix Content-Language: en-US To: "Cui, Lili" Cc: hongjiu.lu@intel.com, binutils@sourceware.org References: <20231219121218.974012-1-lili.cui@intel.com> <20231219121218.974012-2-lili.cui@intel.com> From: Jan Beulich Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: <20231219121218.974012-2-lili.cui@intel.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3026.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 19.12.2023 13:12, Cui, Lili wrote: > @@ -5283,6 +5321,9 @@ md_assemble (char *line) > case unsupported_syntax: > err_msg = _("unsupported syntax"); > break; > + case unsupported_EGPR_for_addressing: > + err_msg = _("extended GPR cannot be used as base/index"); > + break; While this one's now suitable for the as_bad() below the switch, ... > @@ -5336,6 +5377,9 @@ md_assemble (char *line) > case invalid_dest_and_src_register_set: > err_msg = _("destination and source registers must be distinct"); > break; > + case invalid_pseudo_prefix: > + err_msg = _("rex2 pseudo prefix cannot be used here"); > + break; ... this one still doesn't really fit the "... for `'" there. At least the "here" needs dropping. > @@ -5630,11 +5681,12 @@ md_assemble (char *line) > && (i.op[1].regs->reg_flags & RegRex64) != 0) > || (((i.types[0].bitfield.class == Reg && i.types[0].bitfield.byte) > || (i.types[1].bitfield.class == Reg && i.types[1].bitfield.byte)) > - && i.rex != 0)) > + && (i.rex != 0 || i.rex2 != 0))) > { > int x; > > - i.rex |= REX_OPCODE; > + if (!is_apx_rex2_encoding () && !is_any_vex_encoding(&i.tm)) > + i.rex |= REX_OPCODE; > for (x = 0; x < 2; x++) > { > /* Look for 8 bit operand that uses old registers. */ > @@ -5645,7 +5697,7 @@ md_assemble (char *line) > /* In case it is "hi" register, give up. */ > if (i.op[x].regs->reg_num > 3) > as_bad (_("can't encode register '%s%s' in an " > - "instruction requiring REX prefix."), > + "instruction requiring REX/REX2 prefix."), > register_prefix, i.op[x].regs->reg_name); > > /* Otherwise it is equivalent to the extended register. > @@ -5657,11 +5709,11 @@ md_assemble (char *line) > } > } > > - if (i.rex == 0 && i.rex_encoding) > + if (i.rex == 0 && i.rex2 == 0 && (i.rex_encoding || i.rex2_encoding)) > { > /* Check if we can add a REX_OPCODE byte. Look for 8 bit operand > that uses legacy register. If it is "hi" register, don't add > - the REX_OPCODE byte. */ > + rex and rex2 prefix. */ > int x; > for (x = 0; x < 2; x++) > if (i.types[x].bitfield.class == Reg > @@ -5671,6 +5723,7 @@ md_assemble (char *line) > { > gas_assert (!(i.op[x].regs->reg_flags & RegRex)); > i.rex_encoding = false; > + i.rex2_encoding = false; > break; > } > > @@ -5678,7 +5731,13 @@ md_assemble (char *line) > i.rex = REX_OPCODE; > } > > - if (i.rex != 0) > + if (is_apx_rex2_encoding ()) > + { > + build_rex2_prefix (); > + /* The individual REX.RXBW bits got consumed. */ > + i.rex &= REX_OPCODE; > + } > + else if (i.rex != 0) > add_prefix (REX_OPCODE | i.rex); > > insert_lfence_before (last_insn); All of this will need re-basing over "x86: properly respect rex/{rex}", with the result (I hope) that .insn will then also be covered REX2-wise. > @@ -5752,6 +5811,20 @@ parse_insn (const char *line, char *mnemonic, bool prefix_only) > goto too_long; > *mnem_p = '\0'; > > + /* Point l at the closing brace if there's no other separator. */ > + if (*l != END_OF_INSN && !is_space_char (*l) > + && *l != PREFIX_SEPARATOR) > + --l; > + } > + /* Skip the immediate 0x** of {rex2 0x00} prefix. */ > + else if (*mnemonic == '{'&& is_space_char (*l)) Nit: Missing blank. > + { > + while ( *l != '}') Nit: Stray blank. > + ++l; What if there's no '}' on the line? > + *mnem_p++ = *l++; > + if (mnem_p >= mnemonic + MAX_MNEM_SIZE) > + goto too_long; > + *mnem_p = '\0'; You skip everything, not just 0xNN. You also skip stuff after other pseudo prefixes, if I'm not mistaken. That's both too lax. However, skipping isn't an option here anyway. Either we actually respect what the user has written, or we error out. Skipping therefore is an option only if the provided expression (not just plain number!) evaluates to 0. I don't understand anyway why this code was added: When I asked about the specific plans, H.J. clearly said the form with a constant would be a disassembler- only thing for now. > @@ -7005,6 +7082,43 @@ VEX_check_encoding (const insn_template *t) > return 0; > } > > +/* Check if Egprs operands are valid for the instruction. */ > + > +static int > +check_EgprOperands (const insn_template *t) Hmm, I thought I had asked before to make functions with boolean return values have a return type of bool, and then use "true" for success. An alternative would be to return the error indicator, rather than putting it in i.error here. Then again I realize this is in line with VEX_check_encoding() and check_VecOperands() (which I think would better be changed, but anyway). > @@ -7149,6 +7263,14 @@ match_template (char mnem_suffix) > continue; > } > > + /* Check if pseudo prefix {rex2} is valid. */ > + if (t->opcode_modifier.noegpr && i.rex2_encoding) > + { > + i.error = invalid_pseudo_prefix; What is this needed for? I.e. why can't you pass the value ... > + specific_error = progress (i.error); ... in here directly (as is done elsewhere as well)? > --- /dev/null > +++ b/gas/testsuite/gas/i386/x86-64-apx-rex2.s > @@ -0,0 +1,86 @@ > +# Check 64bit instructions with rex2 prefix encoding > + > + .allow_index_reg > + .text > +_start: > + test $0x7, %r24b > + test $0x7, %r24d > + test $0x7, %r24 > + test $0x7, %r24w > +## REX2.M bit > + imull %eax, %r15d > + imull %eax, %r16d > + punpckldq (%r18), %mm2 > +## REX2.R4 bit > + leal (%rax), %r16d > + leal (%rax), %r17d > + leal (%rax), %r18d > + leal (%rax), %r19d > + leal (%rax), %r20d > + leal (%rax), %r21d > + leal (%rax), %r22d > + leal (%rax), %r23d > + leal (%rax), %r24d > + leal (%rax), %r25d > + leal (%rax), %r26d > + leal (%rax), %r27d > + leal (%rax), %r28d > + leal (%rax), %r29d > + leal (%rax), %r30d > + leal (%rax), %r31d > +## REX2.X4 bit > + leal (,%r16), %eax > + leal (,%r17), %eax > + leal (,%r18), %eax > + leal (,%r19), %eax > + leal (,%r20), %eax > + leal (,%r21), %eax > + leal (,%r22), %eax > + leal (,%r23), %eax > + leal (,%r24), %eax > + leal (,%r25), %eax > + leal (,%r26), %eax > + leal (,%r27), %eax > + leal (,%r28), %eax > + leal (,%r29), %eax > + leal (,%r30), %eax > + leal (,%r31), %eax > +## REX2.B4 bit > + leal (%r16), %eax > + leal (%r17), %eax > + leal (%r18), %eax > + leal (%r19), %eax > + leal (%r20), %eax > + leal (%r21), %eax > + leal (%r22), %eax > + leal (%r23), %eax > + leal (%r24), %eax > + leal (%r25), %eax > + leal (%r26), %eax > + leal (%r27), %eax > + leal (%r28), %eax > + leal (%r29), %eax > + leal (%r30), %eax > + leal (%r31), %eax > +## REX2.W bit > + leaq (%rax), %r15 > + leaq (%rax), %r16 > + leaq (%r15), %rax > + leaq (%r16), %rax > + leaq (,%r15), %rax > + leaq (,%r16), %rax > +## REX2.R3 bit > + add (%r16), %r8 > + add (%r16), %r15 > +## REX2.X3 bit > + mov (,%r9), %r16 > + mov (,%r14), %r16 > +## REX2.B3 bit > + sub (%r10), %r31 > + sub (%r13), %r31 Nit: There's still an indentation anomaly here. > --- a/gas/testsuite/gas/i386/x86-64-inval-pseudo.s > +++ b/gas/testsuite/gas/i386/x86-64-inval-pseudo.s > @@ -1,4 +1,8 @@ > .text > {disp16} movb (%ebp),%al > {disp16} movb (%rbp),%al > + > + /* Instruction not support APX. */ > + {rex2} xsave (%r15, %rbx) > + {rex2} xsave64 (%r15, %rbx) > .p2align 4,0 Aren't these dealt with (in a more complete fashion) in x86-64-pseudos-bad.s? > --- a/gas/testsuite/gas/i386/x86-64-pseudos-bad.s > +++ b/gas/testsuite/gas/i386/x86-64-pseudos-bad.s > @@ -5,3 +5,77 @@ pseudos: > {rex} vmovaps %xmm7,%xmm2 > {rex} vmovaps %xmm17,%xmm2 > {rex} rorx $7,%eax,%ebx > + {rex2} vmovaps %xmm7,%xmm2 > + {rex2} xsave (%rax) > + {rex2} xsaves (%ecx) > + {rex2} xsaves64 (%ecx) > + {rex2} xsavec (%ecx) > + {rex2} xrstors (%ecx) > + {rex2} xrstors64 (%ecx) Here. > --- a/opcodes/i386-dis.c > +++ b/opcodes/i386-dis.c > @@ -144,6 +144,12 @@ struct instr_info > /* Bits of REX we've already used. */ > uint8_t rex_used; > > + /* Record W R4 X4 B4 bits for rex2. */ > + unsigned char rex2; > + /* Bits of REX2 we've already used. */ > + unsigned char rex2_used; When you say REX2, one ought to be permitted to imply you mean the prefix, not the struct field. That's ambiguous here, though - bit positions used match those in rex2, not those in the REX2 payload. IOW either you use lower case to make more obvious that you mean the other struct field, or you say e.g. "REX2 prefix bits we've already used." Albeit that would still be imprecise, as other REX2 prefix bits' use is recorded in rex_used. > @@ -265,8 +272,13 @@ struct dis_private { > { \ > if (value) \ > { \ > - if ((ins->rex & value)) \ > + if (ins->rex & value) \ > ins->rex_used |= (value) | REX_OPCODE; \ Like is done here, ... > + if (ins->rex2 & value) \ > + { \ > + ins->rex2_used |= value; \ other uses of "value" also want parenthesizing, unless not used with any kind of operator (e.g. in the if() above). > @@ -276,6 +288,9 @@ struct dis_private { > #define EVEX_b_used 1 > #define EVEX_len_used 2 > > +/* M0 in rex2 prefix represents map0 or map1. */ > +#define REX2_M 0x8 Extending an earlier comment: This really should go next to REX_W and friends. In principle the assembler could want to use this constant as well, hence why it would better go the opcode/i386.h anyway. > @@ -4196,19 +4221,19 @@ static const struct dis386 x86_64_table[][2] = { > > /* X86_64_E8 */ > { > - { "callP", { Jv, BND }, 0 }, > - { "call@", { Jv, BND }, 0 } > + { "callP", { Jv, BND }, PREFIX_REX2_ILLEGAL }, This, ... > + { "call@", { Jv, BND }, PREFIX_REX2_ILLEGAL } > }, > > /* X86_64_E9 */ > { > - { "jmpP", { Jv, BND }, 0 }, > - { "jmp@", { Jv, BND }, 0 } > + { "jmpP", { Jv, BND }, PREFIX_REX2_ILLEGAL }, ... this, and ... > + { "jmp@", { Jv, BND }, PREFIX_REX2_ILLEGAL } > }, > > /* X86_64_EA */ > { > - { "{l|}jmp{P|}", { Ap }, 0 }, > + { "{l|}jmp{P|}", { Ap }, PREFIX_REX2_ILLEGAL }, > }, ... this change isn't really needed, is it? The marker is only needed on 64-bit insns (i.e. respective slots 2 of x86_64_table[] entries). Jan