From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by sourceware.org (Postfix) with ESMTPS id BBF0D3858D28 for ; Fri, 22 Dec 2023 13:50:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BBF0D3858D28 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org BBF0D3858D28 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::42e ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1703253003; cv=none; b=MNZfe7IMFW78xTY0TY5VRmsxgoVTLQtnxohYDe/yvSsmxheoe5K51pjVADbr4shdgv0IfQ8Lem74Zq8VPvIN4fREoXaAGjslmI8goLG1Oj7rRfeDcm7jnwdxGZFEpRumwA/UyBNFUh9I+ud2yditoeGjwSqPvpYBEfP2WSPKvOE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1703253003; c=relaxed/simple; bh=uaCxc01NoIk0CjOWxlSKEjEBIfhiuENe7Ka4SUQughE=; h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:To:From; b=XrAeJccTP9zoKfSoJYgmW6kBui6aeWG4Q+Zkthx2ysQQIfcnIr7PTM1WjmCx1ENJppEO9woiCGIx5UZexBUrQ0JsSUBtTqrHjnrZSyh/8v62ufTebOKDieupzRcOCrxPQoLqNsQKgpTL4RqprH6ExPiA8OthvQBKwoVEorMHH6s= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wr1-x42e.google.com with SMTP id ffacd0b85a97d-3368ac0f74dso1277056f8f.0 for ; Fri, 22 Dec 2023 05:50:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1703252999; x=1703857799; darn=sourceware.org; h=content-transfer-encoding:in-reply-to:autocrypt:from:references:cc :to:content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=/+M6C4mTyBhW1diPCvUykVOeyBodJ2K+QpmhZ+tjsQs=; b=NuTRBQX0xeGxaedVnCa4NSOm2jGK1rh8/aEPeh9qHtnQ9Dg2Kj/CFDZEre39YNi7P4 G7Zt2WaoovWACJTCJLDRVl4QyfkclqXLVcwPOWbJBHywlLmcT4KhgOWeIKMSUA8+YTUO N2oDPNQM1I8QVmHI1EGS/DWg6q6PciULbbS9hKY/ibVO1kCOIE3aDwnlqbSr6a81XOMh MgREkGBXF0agTVqzdpRnbV5tJ8A7SXlSrevr0SQi77PNCtt/FA8lSSmbSKYZSJEiLAC/ fW0Z2hB8fM6R1P5UFhCRTnDlRQ8lS8Of5YWu+0ZcD3Ki6cqSJ+6/lwqudcAV5Ur78bN7 guEw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1703252999; x=1703857799; h=content-transfer-encoding:in-reply-to:autocrypt:from:references:cc :to:content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=/+M6C4mTyBhW1diPCvUykVOeyBodJ2K+QpmhZ+tjsQs=; b=lLHrwnpUTu6auc++9jz/x5pLbIjm5tVxOSaVEhXhIDPwA/RjphTuprmaFFKVvlN+l4 2uccixxFWQrYoUYGdQp4Ri4XFDzA2B7nOdvTMBmiC14jrdvY6BlQWoUQAK2fiykD34YM mCjYh/wJVivexSWbmGkuQ2Nl7NN8rcOSrNuAXRoiY0eOPgXyOHmxMpe/ylTX46Cj/3nF d3oTR3JHomM0T96CZspHvrpJ92eWiDe1xviKgwAHA57VoBMz6qh8HyL9Zya9zn0D9/kA Jw4Pyaes1wHNu4FRqIvxnFFdqfjpocrYDl8OwW/qRy1dkYCx7EK+VvHDt+SoHcS/OaAt 9NTA== X-Gm-Message-State: AOJu0Yz0XMkgTap+AwIfsLXAcBTf3xgikzfx6HiopQccwdkqqdOJ1tV6 UTqthJ0Q5tEIunlbWGOqVwZSnsNYRU+3zk6VSOp0SwAbk5SG X-Google-Smtp-Source: AGHT+IGIgFb3lR3WYte/85wtxXgdSenTpNuGR3PCZgv5QpHxVyfmgH3XyeffLY2LnxBNSy1A5RLIdg== X-Received: by 2002:a5d:610a:0:b0:336:7798:148a with SMTP id v10-20020a5d610a000000b003367798148amr818516wrt.140.1703252999349; Fri, 22 Dec 2023 05:49:59 -0800 (PST) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. [37.24.206.209]) by smtp.gmail.com with ESMTPSA id e18-20020a056000121200b0033660aabe76sm4345948wrx.39.2023.12.22.05.49.58 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 22 Dec 2023 05:49:59 -0800 (PST) Message-ID: <1a796802-5267-4c88-abc4-dda4bdd262cc@suse.com> Date: Fri, 22 Dec 2023 14:49:58 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v4 3/9] Support APX GPR32 with extend evex prefix Content-Language: en-US To: "Cui, Lili" Cc: hongjiu.lu@intel.com, binutils@sourceware.org References: <20231219121218.974012-1-lili.cui@intel.com> <20231219121218.974012-4-lili.cui@intel.com> From: Jan Beulich Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: <20231219121218.974012-4-lili.cui@intel.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3026.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 19.12.2023 13:12, Cui, Lili wrote: > --- a/gas/config/tc-i386.c > +++ b/gas/config/tc-i386.c > @@ -89,6 +89,7 @@ > /* This matches the C -> StaticRounding alias in the opcode table. */ > #define commutative staticrounding > > +#define APX_F(cpuid) (maybe_cpu (t, CpuAPX_F) && maybe_cpu (t, cpuid)) Why is this still here? I said more than once that it's not helpful to have. As can be seen ... > @@ -3673,7 +3674,7 @@ install_template (const insn_template *t) > > /* Dual VEX/EVEX templates need stripping one of the possible variants. */ > if (t->opcode_modifier.vex && t->opcode_modifier.evex) > - { > + { > if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2) > || maybe_cpu (t, CpuFMA)) > && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL))) > @@ -3695,7 +3696,15 @@ install_template (const insn_template *t) > gas_assert (i.tm.cpu.bitfield.isa == i.tm.cpu_any.bitfield.isa); > } > } > - } > + > + if (APX_F(CpuCMPCCXADD) || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F) > + || APX_F(CpuAVX512DQ) || APX_F(CpuAVX512BW) || APX_F(CpuBMI) > + || APX_F(CpuBMI2)) ... right here: There's no point in checking CpuAPX_F a whopping 7 times. > + if (need_evex_encoding ()) > + i.tm.opcode_modifier.vex = 0; > + else > + i.tm.opcode_modifier.evex = 0; > + } I'm also pretty sure that I asked before that such nested if/else please have proper braces for the body of the outer if(). To say it very clearly again: When you submit a new version, _all_ prior review comments should be addressed. Whether that's verbally (by explaining why a change cannot be made) or by adjusting the code is another matter. I said before that reviewing this work has proven extremely time consuming. I shouldn't be required to needlessly put in yet more time, just to re-spot and re-comment things already pointed out. > @@ -3876,6 +3885,15 @@ is_any_vex_encoding (const insn_template *t) > return t->opcode_modifier.vex || t->opcode_modifier.evex; > } > > +/* We can use this function only when the current encoding is evex. */ > +static INLINE bool > +is_apx_evex_encoding (void) > +{ > + return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4 > + || (i.vex.register_specifier > + && i.vex.register_specifier->reg_flags & RegRex2); Nit: Parentheses please around the & expression. > @@ -8097,7 +8142,11 @@ process_suffix (void) > if (i.tm.opcode_modifier.jump == JUMP_BYTE) /* jcxz, loop */ > prefix = ADDR_PREFIX_OPCODE; > > - if (!add_prefix (prefix)) > + /* The DATA PREFIX of EVEX promoted from legacy APX instructions > + needs to be adjusted. */ > + if (i.tm.opcode_space == SPACE_EVEXMAP4) > + i.tm.opcode_modifier.opcodeprefix = PREFIX_0X66; Feels like I did ask before: What if i.tm.opcode_modifier.opcodeprefix is already set? Aiui that would be a bug, but one that's likely easy to introduce and hard to find. IOW - better assert the field is clear before filling it? > @@ -14293,6 +14342,12 @@ static bool check_register (const reg_entry *r) > if (!cpu_arch_flags.bitfield.cpuapx_f > || flag_code != CODE_64BIT) > return false; > + > + /* When using RegRex2, dual VEX/EVEX templates need to be marked as EVEX. > + For the later install_template function. */ > + if (current_templates.start->opcode_modifier.vex > + && current_templates.start->opcode_modifier.evex) > + i.vec_encoding = vex_encoding_evex; > } Just to state it again - no use of current_templates in this funciton, please. > --- a/opcodes/i386-opc.tbl > +++ b/opcodes/i386-opc.tbl > @@ -106,16 +106,6 @@ > #define HLEPrefixRelease PrefixOk=PrefixHLERelease > #define NoTrackPrefixOk PrefixOk=PrefixNoTrack > > -#define Space0F OpcodeSpace=SPACE_0F > -#define Space0F38 OpcodeSpace=SPACE_0F38 > -#define Space0F3A OpcodeSpace=SPACE_0F3A > -#define SpaceXOP08 OpcodeSpace=SPACE_XOP08 > -#define SpaceXOP09 OpcodeSpace=SPACE_XOP09 > -#define SpaceXOP0A OpcodeSpace=SPACE_XOP0A > - > -#define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5 > -#define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6 > - Why are you moving these, leaving ... > #define VexMap7 OpcodeSpace=SPACE_VEXMAP7 ... this one disconnected? IOW - I see no need for the moving, but if there is a need, then this one also needs moving (see how it'll become relevant for USER_MSR+APX_F now). Specifically ... > @@ -137,11 +127,25 @@ > #define EVexLIG EVex=EVEXLIG > #define EVexDYN EVex=EVEXDYN > > +#define Space0F OpcodeSpace=SPACE_0F > +#define Space0F38 OpcodeSpace=SPACE_0F38 > +#define Space0F3A OpcodeSpace=SPACE_0F3A > +#define SpaceXOP08 OpcodeSpace=SPACE_XOP08 > +#define SpaceXOP09 OpcodeSpace=SPACE_XOP09 > +#define SpaceXOP0A OpcodeSpace=SPACE_XOP0A > + > +#define EVexMap4 OpcodeSpace=SPACE_EVEXMAP4|EVex128 ... there's no need for this to live after EVex128 was #define-s. > +#define EVexMap5 OpcodeSpace=SPACE_EVEXMAP5 > +#define EVexMap6 OpcodeSpace=SPACE_EVEXMAP6 > + > #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL > > #define Vsz256 Vsz=VSZ256 > #define Vsz512 Vsz=VSZ512 > > +// The template supports VEX format for cpuid and EVEX format for cpuid & apx_f. > +#define APX_F(cpuid) cpuid&(cpuid|APX_F) I think the comment wants to go into further detail. Please can you go back to read what I said when I suggested this construct, in particular regarding the stripping then done? However, with you not having found a need to fiddle with cpu_flags_match(), I wonder if this construct is needed in the first place. The earlier suggestion was entirely based on the assumption that stripping similar to that for other combined VEX/EVEX templates would be needed here, too. > @@ -2049,13 +2059,20 @@ bndldx, 0x0f1a, MPX, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex, RegBND } > > // SHA instructions. > sha1rnds4, 0xf3acc, SHA, Modrm|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM } > +sha1rnds4, 0xd4, SHA&APX_F, Modrm|NoSuf|EVexMap4, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM } > sha1nexte, 0xf38c8, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM } > +sha1nexte, 0xd8, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM } > sha1msg1, 0xf38c9, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM } > +sha1msg1, 0xd9, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM } > sha1msg2, 0xf38ca, SHA, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM } > +sha1msg2, 0xda, SHA&APX_F, Modrm|NoSuf|EVexMap4, { RegXMM|Unspecified|BaseIndex, RegXMM } > sha256rnds2, 0xf38cb, SHA, Modrm|NoSuf, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM } What about this form? It surely also wants to gain an EVEX counterpart. Jan