From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ed1-x52a.google.com (mail-ed1-x52a.google.com [IPv6:2a00:1450:4864:20::52a]) by sourceware.org (Postfix) with ESMTPS id D6A6B38582A0 for ; Wed, 13 Dec 2023 07:48:47 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D6A6B38582A0 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org D6A6B38582A0 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::52a ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702453730; cv=none; b=g17wVffYuaJJ/KtsQMXWZSULVM+HP0i58LcsoNpceZdJLMMOxooafiqeHSLUguutke8N+i+2cCAwdqbaKkL98/o7OsYf6AREKxn/EM+QtbUbrRNZgBQ6Ly0kzB/vfubmJ9wBMcmNNPG2yzsnMlzfMp7bIk9NVjYdOX0Cus7nNMw= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702453730; c=relaxed/simple; bh=B+IY4VQdLCF6/mtc2pUV6DGmErSW9UB7wgKk4AefB/Q=; h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:To:From; b=yDGeUc0+ZQ83AAJCCtR2+LdDkK6GwlwZ+77sp7zB2jmvEpZmzCx6fXOFLASYO0qiOdLCEVhPXRNyNPSx6TNqw3WDunyJREOiJBrp3k5y1cE7sn3r9ajDdtPJtaDq7fIxg4Z2z1ZetruRZHkbjt/4LpBRVz6LMwUpD4kIWXleyfc= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-ed1-x52a.google.com with SMTP id 4fb4d7f45d1cf-54c70c70952so9107728a12.3 for ; Tue, 12 Dec 2023 23:48:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1702453726; x=1703058526; darn=sourceware.org; h=content-transfer-encoding:in-reply-to:autocrypt:from:references:cc :to:content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=kHdmuigOTGYN1oEd6lyin8anC5/heW6uqStpf6kc/L8=; b=D0B9pKVQf9CStDEHLGvbd1sqjVGXQ3mFnskgkfJghCVE/RnvYAe9zVLOF9iBaFT8ja MH5GGBMzqxDm32Uj0pHxVy1muCqp3cTqGGKylNT097jEm0mnDwx3i8IiR+EGxTv5BJI5 5GFn4YzazD2/jaeVCQkVEsJBQBVdZj7uZvUreCwWQp1+Q0in6M/sJQ2IzpfOtPseTC31 r4orCYE0JKrAXdWZ/hNT6E8aIWLgWga+HuBAfmvhhUltmBJuc5jqhhK5ACSLRdJxCv5S PDKNuJwD+g2mgHrT0Jyr8nZW4PhxrzIHL3by79pRm66mav3a+ExiG+h03udfXZ97fn+0 JbaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1702453726; x=1703058526; h=content-transfer-encoding:in-reply-to:autocrypt:from:references:cc :to:content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=kHdmuigOTGYN1oEd6lyin8anC5/heW6uqStpf6kc/L8=; b=ga1nNxTEW7kpKD0Y0PjE4k6rahyySPpDvSYN6nGSb6RQXbKPo6qMrfme8R1DhFjyrY QYXFrgOU1CM39e0nOH+kGVTTykS14Exrp8hsrmperCR5YRY5fko7DvTFMtBBDL9quxrj v+eTh+sbPTvdpbRjNwVuHvS7l3o6w3kSQlHfVTQ4b56pmIL6mTmvjhoYr8ahP8/UFQKs 5NLcEJ97SD17mpsZnDIe/PEGue0Y8TVD8UTrcUwuWoGfhw/EcSdhBfZwjIholYKi/8Fp W/tgiTjt20Jtf3Z5+jqLQu9M2/qFYMndAk1DzyDvl1cZTXFclEOb3HPsmPYssqRq+/5g gPYw== X-Gm-Message-State: AOJu0YywwdOdBlrjXAtzT++ngUAf+uoaS+m/GOBU6OF1ZYIwWAf86cFJ 0bPFxHBJ3LDXgBZCkKk4+jEI X-Google-Smtp-Source: AGHT+IHTlUkO7q8xL96MxbLGo4kJh16oCaybJ8c0H7xQrM01dpi5SnjOGdH9BcrP8dvahjEFNh1wWw== X-Received: by 2002:a17:906:cb:b0:a1b:7609:5d6b with SMTP id 11-20020a17090600cb00b00a1b76095d6bmr2392650eji.5.1702453726407; Tue, 12 Dec 2023 23:48:46 -0800 (PST) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. [37.24.206.209]) by smtp.gmail.com with ESMTPSA id vi8-20020a170907d40800b00a1c7b20e9e6sm7278513ejc.32.2023.12.12.23.48.46 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 12 Dec 2023 23:48:46 -0800 (PST) Message-ID: <4e36724c-fbec-46ce-81bc-ef29a6daaf2d@suse.com> Date: Wed, 13 Dec 2023 08:48:45 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v3 4/9] Support APX GPR32 with extend evex prefix Content-Language: en-US To: "Cui, Lili" Cc: "Lu, Hongjiu" , "binutils@sourceware.org" References: <20231124070213.3886483-1-lili.cui@intel.com> <20231124070213.3886483-4-lili.cui@intel.com> <546c8890-0526-49a3-8310-319358bf55c2@suse.com> <0bb5fbcd-f58e-48ad-a5ee-3413b026f903@suse.com> <61ef66ac-ae1c-4c57-b800-475437e225e6@suse.com> From: Jan Beulich Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3026.3 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 13.12.2023 08:36, Cui, Lili wrote: >>>>>>>>>>> @@ -3670,10 +3673,11 @@ install_template (const >> insn_template >>>>>>>>>>> *t) >>>>>>>>>>> >>>>>>>>>>> /* Dual VEX/EVEX templates need stripping one of the >>>>>>>>>>> possible >>>>>>>> variants. */ >>>>>>>>>>> if (t->opcode_modifier.vex && t->opcode_modifier.evex) >>>>>>>>>>> - { >>>>>>>>>>> - if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2) >>>>>>>>>>> - || maybe_cpu (t, CpuFMA)) >>>>>>>>>>> - && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, >>>> CpuAVX512VL))) >>>>>>>>>>> + { >>>>>>>>>>> + if (AVX512F(CpuAVX) || AVX512F(CpuAVX2) || >>>> AVX512F(CpuFMA) >>>>>>>>>>> + || AVX512VL(CpuAVX) || AVX512VL(CpuAVX2) || >>>>>>>>>> APX_F(CpuCMPCCXADD) >>>>>>>>>>> + || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F) || >>>>>>>>>> APX_F(CpuAVX512DQ) >>>>>>>>>>> + || APX_F(CpuAVX512BW) || APX_F(CpuBMI) || >>>>>> APX_F(CpuBMI2)) >>>>>>>>>>> { >>>>>>>>>>> if (need_evex_encoding ()) >>>>>>>>>> >>>>>>>>>> There are several issues here: >>>>>>>>>> - Why did you need to change (to the worse) the original code? >>>>>>>>>> - Why did you not model the addition after that original code? >>>>>>>>>> - How come APX_F (CpuAVX512*) constructs appear here, when no >>>>>>>> AVX512 >>>>>>>>>> insn can be VEX-encoded? >>>>>>>>> >>>>>>>>> I don't understand what you mean, we have this combination. >>>>>>>>> >>>>>>>>> kmov, 0x90, AVX512BW&(AVX512BW|APX_F), >>>>>>>>> Modrm|Vex128|EVex128|Space0F|VexW1||NoSuf, { >>>>>>>>> RegMask||Unspecified|BaseIndex, RegMask } >>>>>>>> >>>>>>>> Oh, I'm sorry: I forgot about the mask register insns. >>>>>>>> >>>>>>>>>> - If these new macros are really needed for whatever reason, >>>>>>>>>> they >>>>>>>> shouldn't >>>>>>>>>> be added to opcodes/i386-opc.h when they're useful only in >>>>>>>>>> the >>>>>>>> assembler. >>>>>>>>>> - Style requires a blank before the opening parenthesis in function >>>>>>>>>> invocations (which also covers function-like macro invocations). >>>>>>>>>> >>>>>>>>>> I think I asked before: How is it that you get away without >>>>>>>>>> altering cpu_flags_match(), containing related and quite >>>>>>>>>> similar >>>> logic? >>>>>>>>>> >>>>>>>>> >>>>>>>>> For the original logic ( ... || ... ) && ( ... || ...), the >>>>>>>>> content in the first bracket >>>>>>>> and the content in the following brackets can be combined >>>>>>>> arbitrarily. I think it is Inaccurate. >>>>>>>> >>>>>>>> In which way? If there are issues with the existing code, these >>>>>>>> issues want taking care of in separate (prereq) patches. Of >>>>>>>> course there are assumptions made here about the CPU combinations >>>>>>>> that can (and cannot) occur in any of our templates. Similar >>>>>>>> assumptions are imo >>>>>> fine to make in the APX additions. >>>>>>>> >>>>>>>> Note how I used two nested if()s despite that not having been >>>>>>>> necessary at that time. I did so in anticipation that for APX >>>>>>>> you'd want to add another >>>>>>>> (separate) inner if(), rather than altering the one that's there. >>>>>>> >>>>>>> Could we remove the CPU check here? it's a bit ugly and has >>>>>>> limited >>>>>> effectiveness. >>>>>>> >>>>>>> if (t->opcode_modifier.vex && t->opcode_modifier.evex) >>>>>>> { >>>>>>> if (AVX512F(CpuAVX) || AVX512F(CpuAVX2) || AVX512F(CpuFMA) >>>>>>> || AVX512VL(CpuAVX) || AVX512VL(CpuAVX2) || >>>>>> APX_F(CpuCMPCCXADD) >>>>>>> || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F) || >>>>>> APX_F(CpuAVX512DQ) >>>>>>> || APX_F(CpuAVX512BW) || APX_F(CpuBMI) || >>>>>>> APX_F(CpuBMI2)) >>>>>> >>>>>> I agree on the "a bit ugly" part, but taking what's there right now >>>>>> I don't understand "has limited effectiveness". Of course you can >>>>>> remove any code you want, provided you can prove nothing breaks. >>>>>> >>>>> >>>>> Here is install_template(). >>>>> All I can say is that after removing the CPU check, no test cases >>>>> failed. I >>>> know it's hard to convince you to delete this place, or what do you >>>> suggest to do with this? APX requires this, otherwise the test cases will fail. >>>>> >>>>> - if (AVX512F(CpuAVX) || AVX512F(CpuAVX2) || AVX512F(CpuFMA) >>>>> - || AVX512VL(CpuAVX) || AVX512VL(CpuAVX2) || >>>> APX_F(CpuCMPCCXADD) >>>>> - || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F) || >>>> APX_F(CpuAVX512DQ) >>>>> - || APX_F(CpuAVX512BW) || APX_F(CpuBMI) || APX_F(CpuBMI2)) >>>>> - { >>>> >>>> So be it then (assuming you don't delete any pre-existing code >>>> there). As said, I expect this will bite us later. >>> >>> Done. >> >> I can't connect this with ... >> >>> + if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2) >>> + || maybe_cpu (t, CpuFMA)) >>> + && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)) >>> + || APX_F(CpuCMPCCXADD) || APX_F(CpuAMX_TILE) || >> APX_F(CpuAVX512F) >>> + || APX_F(CpuAVX512DQ) || APX_F(CpuAVX512BW) || >> APX_F(CpuBMI) >>> + || APX_F(CpuBMI2)) >> >> ... this: You said you want to remove all the new checks. And now you say >> "done" with the checks all still there? And even if I misunderstood you, I still >> don't see why you'd modify the existing condition: The adjustments made in >> the body of the if() aren't applicable to APX afaict. Plus there are still the odd >> APX_F() uses; I'm sure I commented on that before. If any adjustments need >> making for APX, you want to add a 2nd inner if() inside the enclosing one. >> > > I want to remove all, including your pre-existing code, there is an EVEX testcase failure due to not clean i.tm.opcode_modifier.vex = 0; As you required that don't delete any pre-existing code, so I still need to add my new combination, > > How about this ? > > > I want to remove all code, including your pre-existing code, VEX test case fails because it wasn't cleaned up i.tm.opcode_modifier.evex = 0; As you asked, don't remove any pre-existing code, so I still need to add my new combinations. > > How about this? > > /* Dual VEX/EVEX templates need stripping one of the possible variants. */ > if (t->opcode_modifier.vex && t->opcode_modifier.evex) > { > if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2) > || maybe_cpu (t, CpuFMA)) > && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL))) > { > if (need_evex_encoding ()) > { > i.tm.opcode_modifier.vex = 0; > i.tm.cpu.bitfield.cpuavx512f = i.tm.cpu_any.bitfield.cpuavx512f; > i.tm.cpu.bitfield.cpuavx512vl = i.tm.cpu_any.bitfield.cpuavx512vl; > } > else > { > i.tm.opcode_modifier.evex = 0; > if (i.tm.cpu_any.bitfield.cpuavx) > i.tm.cpu.bitfield.cpuavx = 1; > else if (!i.tm.cpu.bitfield.isa) > i.tm.cpu.bitfield.isa = i.tm.cpu_any.bitfield.isa; > else > gas_assert (i.tm.cpu.bitfield.isa == i.tm.cpu_any.bitfield.isa); > } > } > > if (APX_F(CpuCMPCCXADD) || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F) > || APX_F(CpuAVX512DQ) || APX_F(CpuAVX512BW) || APX_F(CpuBMI) > || APX_F(CpuBMI2)) > if (need_evex_encoding ()) > i.tm.opcode_modifier.vex = 0; > else > i.tm.opcode_modifier.evex = 0; > } Something along these lines, indeed. But without APX_F(). I've just looked it up again: #define APX_F(cpuid) (maybe_cpu (t, CpuAPX_F) && maybe_cpu (t, cpuid)) Why would you test CpuAPX_F over and over again in the conditional? See how the code that has been there for a little while checks each CpuXYZ exactly once. Plus, simply as a style remark, you want to add braces around the if/else, to make entirely clear that the else belongs to the inner if() (iirc some compiler versions warn about code as you have it above). Jan