From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <013ad260-9b28-496e-bbaf-c5a066774e99@suse.com>
Date: Fri, 19 Apr 2024 12:05:11 +0200
Subject: Re: [PATCH] x86: Optimize the encoder of the vvvv register
To: "Cui, Lili"
Cc: hjl.tools@gmail.com, binutils@sourceware.org
References: <20240419073657.2418102-1-lili.cui@intel.com>
From: Jan Beulich
In-Reply-To: <20240419073657.2418102-1-lili.cui@intel.com>

On 19.04.2024 09:36, Cui, Lili wrote:
> This patch wants to optimize the encoder of the vvvv register.
> Previously we used Vexvvvv, SWAP_SOURCES and extension_opcode
> to help encode the vvvv register, this patch simplified the
> logic to only use vexvvvv and added appropriate Vexvvvv values
> for the related instructions.

This looks to be a good move, yet you're not fully leveraging the
potential: Afaict all uses of SwapSources now go away. Hence
SwapSources itself should go away too, together with SWAP_SOURCES.

Beyond that largely (but not only) cosmetic comments:

> @@ -10426,40 +10426,35 @@ build_modrm_byte (void)
>                       || i.encoding == encoding_evex));
>      }
>
> -  if (i.tm.opcode_modifier.vexvvvv == VexVVVV_DST)
> +  switch (i.tm.opcode_modifier.vexvvvv)
>      {
> -      v = dest;
> -      dest-- ;
> -    }
> -  else
> -    {
> -      for (v = source + 1; v < dest; ++v)
> -        if (v != reg_slot)
> +    case VexVVVV_SRC2:
> +      if (source != op)
> +        {
> +          v = source++;
>            break;
> -      if (v >= dest)
> -        v = ~0;
> -    }
> -  if (i.tm.extension_opcode != None)
> -    {
> -      if (dest != source)
> -        v = dest;
> -      dest = ~0;
> +        }
> +      /* For XOP: vpshl* and vpsha*. */
> +      else
> +        /* Fall through. */
> +    case VexVVVV_SRC1:
> +      v = dest - 1;

Indentation here wants to fit the "else", not the "case ...".
Even better would be to avoid the "else", seeing that there already is
a "break" inside the if()'s body.

> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -640,10 +640,13 @@ enum
>    Vex,
>    /* How to encode VEX.vvvv:
>       0: VEX.vvvv must be 1111b.
> -     1: VEX.vvvv encodes one of the src register operands.
> -     2: VEX.vvvv encodes the dest register operand.
> +     1: VEX.vvvv encodes the src1 register operand.
> +     2: VEX.vvvv encodes the src2 register operand.
> +     3: VEX.vvvv encodes the dest register operand.
>     */
> -#define VexVVVV_DST 2
> +#define VexVVVV_SRC1 1
> +#define VexVVVV_SRC2 2
> +#define VexVVVV_DST 3
>    VexVVVV,

While I'm not overly fussed on the names used here, ...

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -141,7 +141,9 @@
>
>  #define Disp8ShiftVL Disp8MemShift=DISP8_SHIFT_VL
>
> -#define DstVVVV VexVVVV=VexVVVV_DST
> +#define VexVVVV_Src1 VexVVVV=VexVVVV_SRC1
> +#define VexVVVV_Src2 VexVVVV=VexVVVV_SRC2
> +#define VexVVVV_Dst VexVVVV=VexVVVV_DST

... I am here, due to the line length issues we already have. Please
can you keep DstVVVV as a name (thus reducing the churn on the table
below) and add Src1VVVV and Src2VVVV, all being 4 characters shorter
than what you presently have?

> @@ -1000,13 +1002,13 @@ pause, 0xf390, i186, NoSuf, {}
>
>  // MMX/SSE2 instructions.
>
> -
> -    $avx:AVX:66:Vex128|VexVVVV|VexW0|SSE2AVX:RegXMM:Xmmword, +
> -    $sse:SSE2:66::RegXMM:Xmmword, +
> -    $mmx:MMX:::RegMMX:Qword>
> +
> +    $avx:AVX:66:Vex128|VexVVVV_Src1|VexW0|SSE2AVX:Vex128|VexVVVV_Dst|VexW0|SSE2AVX:RegXMM:Xmmword, +
> +    $sse:SSE2:66:::RegXMM:Xmmword, +
> +    $mmx:MMX::::RegMMX:Qword>
>
>
> -    $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV, +
> +    $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV_Src1, +
>      $sse:SSE2:::>
>
> @@ -1058,7 +1060,7 @@ pmulhw, 0x0fe5, , Modrm||C|NoSuf, { |<
>  pmullw, 0x0fd5, , Modrm||C|NoSuf, { ||Unspecified|BaseIndex, }
>  por, 0x0feb, , Modrm||C|NoSuf, { ||Unspecified|BaseIndex, }
>  psllw, 0x0ff1, , Modrm||NoSuf, { ||Unspecified|BaseIndex, }
> -psllw, 0x0f71/6, , Modrm||NoSuf, { Imm8, }
> +psllw, 0x0f71/6, , Modrm||NoSuf, { Imm8, }

This is not a scalar instruction, hence "scal" as a template parameter
name is misleading. It's not really clear to me anyway why this needs
fiddling with - there was no SwapSources here, and none of its siblings
are being touched either.

To help recognize such anomalies (possible problems), could I talk
you into splitting the patch in two pieces? First a purely mechanical
one introducing (perhaps simply as an alias of VexVVVV) / using
Src1VVVV wherever it is meant to be used. Then the remaining changes,
with a much smaller diff on the actual opcode templates, in the 2nd
one.

Jan
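
To illustrate the remark about the "else" in the build_modrm_byte()
hunk: below is a minimal, self-contained sketch of the same switch with
the "else" dropped, relying on the "break" inside the if() body. It is
not the actual gas code. pick_vvvv(), NO_VVVV and the pointer
parameters are hypothetical names made up for this example. The
VexVVVV_DST and default arms are assumptions carried over from the
removed lines ("v = dest; dest--;" and "v = ~0;"), since the new code
for them is not part of the quoted hunk.

```c
/* Values as given in the quoted i386-opc.h hunk.  */
#define VexVVVV_SRC1 1
#define VexVVVV_SRC2 2
#define VexVVVV_DST  3

/* Hypothetical marker for "no operand goes into VEX.vvvv".  */
#define NO_VVVV (~0u)

/* Hypothetical stand-in for the vvvv selection in build_modrm_byte().
   SOURCE and DEST are operand indices which may be adjusted, OP is the
   index the second source must differ from.  Returns the index of the
   operand to encode in VEX.vvvv, or NO_VVVV.  */
unsigned int
pick_vvvv (unsigned int vexvvvv, unsigned int *source,
           unsigned int *dest, unsigned int op)
{
  unsigned int v = NO_VVVV;

  switch (vexvvvv)
    {
    case VexVVVV_SRC2:
      if (*source != op)
        {
          v = (*source)++;
          break;
        }
      /* For XOP: vpshl* and vpsha*.  No "else" needed: the "break"
         above already leaves the switch.  */
      /* Fall through.  */
    case VexVVVV_SRC1:
      v = *dest - 1;
      break;

    case VexVVVV_DST:
      /* Assumption, taken from the removed "v = dest; dest--;".  */
      v = (*dest)--;
      break;

    default:
      /* Assumption: value 0, VEX.vvvv must be 1111b.  */
      break;
    }

  return v;
}
```

With the "else" gone, the statement under "case VexVVVV_SRC1:" sits at
the normal case-body indentation, so the indentation question above no
longer arises.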