Date: Mon, 22 Apr 2024 10:48:46 +0200
Subject: Re: [PATCH] x86: Optimize the encoder of the vvvv register
To: "Cui, Lili"
Cc: "hjl.tools@gmail.com", "binutils@sourceware.org"
References: <20240419073657.2418102-1-lili.cui@intel.com> <013ad260-9b28-496e-bbaf-c5a066774e99@suse.com>
From: Jan Beulich

On 22.04.2024 10:42, Cui, Lili wrote:
>> On 19.04.2024 09:36, Cui, Lili wrote:
>>> @@ -1000,13 +1002,13 @@ pause, 0xf390, i186, NoSuf, {}
>>>
>>> // MMX/SSE2 instructions.
>>>
>>> -
>> - $avx:AVX:66:Vex128|VexVVVV|VexW0|SSE2AVX:RegXMM:Xmmword, +
>>> - $sse:SSE2:66::RegXMM:Xmmword, +
>>> - $mmx:MMX:::RegMMX:Qword>
>>> +
>> +
>> $avx:AVX:66:Vex128|VexVVVV_Src1|VexW0|SSE2AVX:Vex128|VexVVVV_Dst
>> |VexW0|SSE2AVX:RegXMM:Xmmword, +
>>> + $sse:SSE2:66:::RegXMM:Xmmword, +
>>> + $mmx:MMX::::RegMMX:Qword>
>>>
>>>
>> - $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV, +
>>> +
>> $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV_Src1,
>>> + +
>>> $sse:SSE2:::>
>>>
>>>
>> pmulhw, 0x0fe5, ,
>> Modrm||C|NoSuf, {
>>> |< pmullw, 0x0fd5, ,
>>> Modrm||C|NoSuf,
>> { ||Unspecified|BaseIndex,
>>> } por, 0x0feb, ,
>>> Modrm||C|NoSuf,
>> { ||Unspecified|BaseIndex,
>>> } psllw, 0x0ff1, ,
>>> Modrm||NoSuf,
>> { ||Unspecified|BaseIndex,
>>> } -psllw, 0x0f71/6, ,
>>> Modrm||NoSuf, { Imm8, }
>>> +psllw, 0x0f71/6, ,
>> Modrm||NoSuf, {
>>> +Imm8, }
>>
>> This is not a scalar instruction, hence "scal" as a template parameter name is
>> misleading. It's not really clear to me anyway why this needs fiddling with -
>> there was no SwapSources here, and none of its siblings are being touched
>> either.
>>
>
> 'psllw' has an extended opcode and two non-immediate operands, to delete the corresponding code below.
>
> -  if (i.tm.extension_opcode != None)
> -    {
> -      if (dest != source)
> -        v = dest;
> -      dest = ~0;
> -    }

Yet how's psllw different from, say, psraw or pslld?

>> To help recognizing such anomalies (possible problems), could I talk you into
>> splitting the patch in two pieces? First a purely mechanical one introducing
>> (perhaps simply as an alias of VexVVVV) / using Src1VVVV wherever it is
>> meant to be used. Then the remaining changes, with a much smaller diff on
>> the actual opcode templates, in the 2nd one.
>>
>
> How about splitting into 3 patches:
> 1. Introduce VexVVVV , Src1VVVV and Src2VVVV.
> 2. Replace SwapSources.
> 3. Replace extension_opcode part.

Fundamentally fine with me, just that it would seem to me that in such a 1st
patch Src2VVVV would end up unused. Hence it would appear more logical to me
to introduce that only when needed, i.e. in patch 2.

Jan
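
As background to the psllw question above, here is a minimal C sketch of how the
immediate form ends up with its destination in VEX.vvvv. It assumes the SSE2AVX
conversion produces the same bytes as vpsllw with an immediate
(VEX.128.66.0F 71 /6 ib): ModRM.reg carries the /6 opcode extension, ModRM.rm
the source register, and the inverted vvvv field the destination. The helper
names vex2_byte1 and encode_vpsllw_imm are hypothetical and not taken from gas,
and only xmm0..xmm7 are handled so that the 2-byte VEX form suffices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Second byte of a 2-byte (0xC5) VEX prefix.  R and vvvv are stored
   inverted (ones' complement); L selects the vector length and pp the
   implied legacy prefix (1 means 0x66).  Hypothetical helper.  */
static uint8_t
vex2_byte1 (unsigned int rex_r, unsigned int vvvv,
            unsigned int l, unsigned int pp)
{
  assert (rex_r < 2 && vvvv < 16 && l < 2 && pp < 4);
  return (uint8_t) (((~rex_r & 1u) << 7)
                    | ((~vvvv & 0xfu) << 3)
                    | (l << 2)
                    | pp);
}

/* Hypothetical helper: encode "vpsllw $imm8, %xmm<src>, %xmm<dst>".
   Assumption: both registers are xmm0..xmm7, so no VEX.B/X bits are
   needed and the 2-byte prefix is enough.  Returns the byte count.  */
static size_t
encode_vpsllw_imm (uint8_t *buf, unsigned int dst, unsigned int src,
                   uint8_t imm8)
{
  assert (dst < 8 && src < 8);
  buf[0] = 0xc5;                         /* 2-byte VEX escape */
  buf[1] = vex2_byte1 (0, dst, 0, 1);    /* destination lives in vvvv */
  buf[2] = 0x71;                         /* opcode within the 0F map */
  buf[3] = (uint8_t) (0xc0 | (6u << 3) | src);  /* mod=11, reg=/6, rm=src */
  buf[4] = imm8;
  return 5;
}
```

Under these assumptions, encode_vpsllw_imm (buf, 1, 2, 3) yields
c5 f1 71 f2 03, i.e. vpsllw $3, %xmm2, %xmm1. The immediate forms of psraw
(0F 71 /4) and pslld (0F 72 /6) use the same operand layout; only the opcode
byte or the /digit differs.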