From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wm1-x32e.google.com (mail-wm1-x32e.google.com [IPv6:2a00:1450:4864:20::32e]) by sourceware.org (Postfix) with ESMTPS id 3032A3858281 for ; Tue, 12 Dec 2023 11:16:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 3032A3858281 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 3032A3858281 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::32e ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702379821; cv=none; b=YksTTwDMK+AfVCUEKEvER86YuZORkJs/cwXECbDXT4DUX+2Ns6WzrvXbKGT3SmcjZxMwg8Tw6Ni82TvOM1ZepNllak+0pKPCX9kG1OsSZzQWufCHYaUa1GvwQYUduuDEv1Q2WwhRsjVqZodNjvFtZsphDx8GkfWVH0QNQdl7plU= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702379821; c=relaxed/simple; bh=ilYuGafGLV8xFF3oD5i2mdpADC/hMXf8gKW/iHAKBVU=; h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:To:From; b=Cmp5zC5TP+2dO8E03xrMRR6E9ImNdrk7ExY14tD2v8oLUXQKQxmqJRN1WPJw9MpC/lmw0FGMSxmereFNg2YxIZQXD+EI0RFmOqn45ix/qfVsVkr3KgZVpM028lHox4YO9AUQ35hLLMP7ZGBg/8P7uu3lbpCWLuUMmeLbLoLmi60= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wm1-x32e.google.com with SMTP id 5b1f17b1804b1-40c236624edso55964055e9.1 for ; Tue, 12 Dec 2023 03:16:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1702379818; x=1702984618; darn=sourceware.org; h=content-transfer-encoding:in-reply-to:autocrypt:from:references:cc :to:content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=emLexNeiMxRrpYH4NoukFA94n/qTOww12Jh35UFMKZQ=; b=XCUZ+XAyQCVBPZhc62GokFbe/z4voRLkRbtUpe/zq/DBeKWAEv+zlFtDzzxkM5sMbf phsx4BtQcqtjaU4Gb/fi4Q4+XFw+EH/1dki68F8lNBv3SCH8iDODmFOd/EAMtApzHzBz dZwVUcUm/k1EiUkPMnmdg/ujdlhK4gPsZShLu+rjXOaVxF38BSpKEbAJzFmtAdDe3EcE u9SpQlhC/Ge9B8tyYIVPJcm6bqwVRwol74Q+YoEKkEkLSJrhcxCgCiZt9nIejBvWLyKh jzWnYuu58yqTqQJGLPca5AMtfTOcjhbZkQcssoTwgN2H3+9VS09hO0CRQoC3o7M9COEc XacA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702379818; x=1702984618; h=content-transfer-encoding:in-reply-to:autocrypt:from:references:cc :to:content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=emLexNeiMxRrpYH4NoukFA94n/qTOww12Jh35UFMKZQ=; b=qBEs4SCOfb33P9BIxmZf3rPc9go1a3dq8WnFhZ4CQBLUq/NxOxJpLVm4A+uy7/cOsj /uh07J42UdPuQvI9afc+1s4jdgBFfSbKpmYnwUIS43wO+XYPNcZUNBRWf7L19NZejz77 ULKr4J3bo766jgA2NNLLt1TB5ulUaPlI7ZC8TWS/wrwkWmc/+kNqPO0CwKaJOX3xZLv/ FJ9roKilQRo3v4m/WtZyEV6cjpx0uWmrYJJ9Hd6QTNKw30X+0vQNQfgJkwI04hM9haCa mP1Uk3NBcs1Pu5IS0xbVL/XKGswBvVWEcVTc8TPo/CYqtKJhTv0roDFDuhmlcE9ZhzL6 FEHw== X-Gm-Message-State: AOJu0Yy37Bm0djxfMk6wrfaj2Ida8OkZNM6Bij8kBoqGRAu4KHNCJZbY sEzboxQaqYpOpSkBNq1ADCyESL/69fUcAvv7FTgu X-Google-Smtp-Source: AGHT+IG7tp32OMORRZFi41+ur1ApUni228OrQZR+MEsgJ8h/olkqdDMZZUD39w5XLg4t+iASNZs6vA== X-Received: by 2002:a7b:ce88:0:b0:40b:5e21:cc26 with SMTP id q8-20020a7bce88000000b0040b5e21cc26mr2859991wmj.81.1702379817785; Tue, 12 Dec 2023 03:16:57 -0800 (PST) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. [37.24.206.209]) by smtp.gmail.com with ESMTPSA id e7-20020a5d5947000000b003335e67e574sm10653336wri.78.2023.12.12.03.16.57 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 12 Dec 2023 03:16:57 -0800 (PST) Message-ID: <61ef66ac-ae1c-4c57-b800-475437e225e6@suse.com> Date: Tue, 12 Dec 2023 12:16:36 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v3 4/9] Support APX GPR32 with extend evex prefix Content-Language: en-US To: "Cui, Lili" Cc: "Lu, Hongjiu" , "binutils@sourceware.org" References: <20231124070213.3886483-1-lili.cui@intel.com> <20231124070213.3886483-4-lili.cui@intel.com> <546c8890-0526-49a3-8310-319358bf55c2@suse.com> <0bb5fbcd-f58e-48ad-a5ee-3413b026f903@suse.com> From: Jan Beulich Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3026.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,KAM_ASCII_DIVIDERS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 12.12.2023 11:44, Cui, Lili wrote: >>>> On 24.11.2023 08:02, Cui, Lili wrote: >>>>> @@ -3670,10 +3673,11 @@ install_template (const insn_template *t) >>>>> >>>>> /* Dual VEX/EVEX templates need stripping one of the possible >> variants. */ >>>>> if (t->opcode_modifier.vex && t->opcode_modifier.evex) >>>>> - { >>>>> - if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2) >>>>> - || maybe_cpu (t, CpuFMA)) >>>>> - && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL))) >>>>> + { >>>>> + if (AVX512F(CpuAVX) || AVX512F(CpuAVX2) || AVX512F(CpuFMA) >>>>> + || AVX512VL(CpuAVX) || AVX512VL(CpuAVX2) || >>>> APX_F(CpuCMPCCXADD) >>>>> + || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F) || >>>> APX_F(CpuAVX512DQ) >>>>> + || APX_F(CpuAVX512BW) || APX_F(CpuBMI) || APX_F(CpuBMI2)) >>>>> { >>>>> if (need_evex_encoding ()) >>>> >>>> There are several issues here: >>>> - Why did you need to change (to the worse) the original code? >>>> - Why did you not model the addition after that original code? >>>> - How come APX_F (CpuAVX512*) constructs appear here, when no >> AVX512 >>>> insn can be VEX-encoded? >>> >>> I don't understand what you mean, we have this combination. >>> >>> kmov, 0x90, AVX512BW&(AVX512BW|APX_F), >>> Modrm|Vex128|EVex128|Space0F|VexW1||NoSuf, { >>> RegMask||Unspecified|BaseIndex, RegMask } >> >> Oh, I'm sorry: I forgot about the mask register insns. >> >>>> - If these new macros are really needed for whatever reason, they >> shouldn't >>>> be added to opcodes/i386-opc.h when they're useful only in the >> assembler. >>>> - Style requires a blank before the opening parenthesis in function >>>> invocations (which also covers function-like macro invocations). >>>> >>>> I think I asked before: How is it that you get away without altering >>>> cpu_flags_match(), containing related and quite similar logic? >>>> >>> >>> For the original logic ( ... || ... ) && ( ... || ...), the content in the first bracket >> and the content in the following brackets can be combined arbitrarily. I think >> it is Inaccurate. >> >> In which way? If there are issues with the existing code, these issues want >> taking care of in separate (prereq) patches. Of course there are assumptions >> made here about the CPU combinations that can (and cannot) occur in any of >> our templates. Similar assumptions are imo fine to make in the APX additions. >> >> Note how I used two nested if()s despite that not having been necessary at >> that time. I did so in anticipation that for APX you'd want to add another >> (separate) inner if(), rather than altering the one that's there. > > Could we remove the CPU check here? it's a bit ugly and has limited effectiveness. > > if (t->opcode_modifier.vex && t->opcode_modifier.evex) > { > if (AVX512F(CpuAVX) || AVX512F(CpuAVX2) || AVX512F(CpuFMA) > || AVX512VL(CpuAVX) || AVX512VL(CpuAVX2) || APX_F(CpuCMPCCXADD) > || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F) || APX_F(CpuAVX512DQ) > || APX_F(CpuAVX512BW) || APX_F(CpuBMI) || APX_F(CpuBMI2)) I agree on the "a bit ugly" part, but taking what's there right now I don't understand "has limited effectiveness". Of course you can remove any code you want, provided you can prove nothing breaks. >>> So I give examples one by one for each identified combination. >> >> Which examples are you talking about? I see none given in your reply. >> > > Sorry, I want to say "I've listed every possible combination". > >>> Just found cpu_flags_match() has similar logic, I think the following is the >> only code related to CPUID alerts, but none of our combinations are related >> to cpuavx. >>> >>> if (all.bitfield.cpuavx) >>> { >>> /* We need to check SSE2AVX with AVX. */ >>> if (!t->opcode_modifier.sse2avx >>> || (sse2avx && !i.prefix[DATA_PREFIX])) >>> match |= CPU_FLAGS_ARCH_MATCH; >>> } >> >> Not sure why you pick out this one. This special case is needed for sse2avx; I >> don't see how it's related here. What I've been pointing you at is the code in >> that function which follows a similar "Dual VEX/EVEX templates ..." >> comment. >> > > I know you're talking about this code, I'm just guessing what it does? Don't know what I missed. You pulled out this sse2avx code. Hence I was expecting you to tell me why you consider it relevant here. > For example > > .arch .nobmi > andn (%eax), %eax, %eax > > --------------------------------------------------------------------------------------------- > if (flag_code != CODE_64BIT) > active = cpu_flags_and_not (cpu_arch_flags, cpu_64_flags); > else > active = cpu_arch_flags; ---> cpubmi = 0; > cpu = cpu_flags_and (all, active); ---> cpuapx =1; cpubmi = 0; > if (cpu_flags_equal (&cpu, &all)) ---> &cpu and &all are not same. > { > ... > } > Return CPU_FLAGS_64BIT_MATCH > ---------------------------------------------------------------------------------------------- > Then we will report an arch error. > > if (supported != CPU_FLAGS_PERFECT_MATCH) > { > as_bad (_("`%s' is not supported on `%s%s'"), > insn_name (current_templates.start), > cpu_arch_name ? cpu_arch_name : default_arch, > cpu_sub_arch_name ? cpu_sub_arch_name : ""); > return NULL; > } Which is what we want, I think (for the particular example you picked)? Yet again, I don't think I can see what you're trying to tell me. I also have to confess I've lost track of whether we're discussing install_template(), cpu_flag_match(), or both. For example in install_template() you may indeed be able to get away with little or no changes, as long as there's no used features tracking for APX (see the early ELF-specific part of output_insn()). Things would be somewhat inconsistent then, but that may be tolerable (as long as properly justified in the patch description). Not getting this into proper shape right with the introduction of APX may bite us later, though. Jan