From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr1-x42b.google.com (mail-wr1-x42b.google.com [IPv6:2a00:1450:4864:20::42b]) by sourceware.org (Postfix) with ESMTPS id 571823858D35 for ; Mon, 29 Apr 2024 13:08:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 571823858D35 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 571823858D35 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::42b ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1714396145; cv=none; b=PIs+EpWPLFbNLkiEvHIt0kzStJZnOfgbHovwwGJy8I+sfoKDAz17l504I4jODw4dRW30iFl5TWQc41hhutz2tvuLjPL+s3JlnpCVwz9LMCZzdls7CS8Lk1cZT7+W0fCB36CzoVg3wo1ycP4lI81RYuoPNJJLboIdAMzY/41h7h0= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1714396145; c=relaxed/simple; bh=o0CXzvqrqfZT0OCveTh1N90QVMygyECAnfWTFgmOGOI=; h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:To:From; b=ZQCQYch3b7PVIoW8fpZDqXoAeGyrBLg0k911wR9qgog8vDzU3suM3ecmVMomWTCAQioIEI4Sxw6X6rgbhZYj4b2KgvLtuPemUyY/wtj+0kOeBY0nDUJR5qnfkZsiEI1vHpXEGQsQNFfDZP92/XJVM9GRwycjm8sN4epbv2aR3xk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-wr1-x42b.google.com with SMTP id ffacd0b85a97d-34cfd924eaeso809848f8f.1 for ; Mon, 29 Apr 2024 06:08:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1714396134; x=1715000934; darn=sourceware.org; h=content-transfer-encoding:in-reply-to:autocrypt:from :content-language:references:cc:to:subject:user-agent:mime-version :date:message-id:from:to:cc:subject:date:message-id:reply-to; bh=bjuyRXtpYLSo8EaN62fY73lNSKFEOgGuaE02dXasVlg=; b=H+Vy9nVEmguSs9ysIscEuKIaCw8oN3eJ+YkT/wldBdryXynveGQKZHR3MWCHId7AjA +3XsEzqgy4NVitxsNKq3wO/R/1/xaAXJC66q2tqjRmroWOU4Dn4c2AY7gBWZ5c6ofYG+ V2M4tTMpgWxiBu7JQvi01BbWU+UC+ed5tUlfKOYfl9pDnLR/EXY/vYIZlBmBCHopmUNf yBykMJvRFhXPj5DAEEhHedXNe9hYYSy4PUjYjv0cf6qZw3ILlpuBgmngZ8/8Z9hPLS3f lEKnOZAoRvFe47HF1ULAdfs9N/uKARqTofUhEBs3n9M2b2vCRUEQ+lHg3ddphbiXw+ah 7qUw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1714396134; x=1715000934; h=content-transfer-encoding:in-reply-to:autocrypt:from :content-language:references:cc:to:subject:user-agent:mime-version :date:message-id:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=bjuyRXtpYLSo8EaN62fY73lNSKFEOgGuaE02dXasVlg=; b=Ak7hVGI48H9NlAiGkSmvJEV7g4RwS7CRrVd02HdtByeh8l4wE3DI6D1TYFkAsPXgwN I4zgvqntzc3aybzL5zV38KD+buPNnmaH2OXv1mUW56tfofmqEKr/Y6nTFlE64+PByJSi 9N3/ZtmRkyUUI78mQzLcpBaLcuPXOAcDtV6iJg6zQTwd3MxdJMPMHLp3WSRtfg5wtu9G st6Z427oXHuAMb0LzYgL80zgx9u9Ow/xMXTTf50fieAEzKFnS3o7HCk8Psu8OLhHb5XJ gNbrmyIcr7rg5DmkfAX11fyUVzn5g/kryZWJxX48MWzgzFvhd6Mk9HsY/pPp43PdCPty ZdAw== X-Forwarded-Encrypted: i=1; AJvYcCU+0siH65D5SlEXf8Qq165okQZvTW4E1GaatqwJK9XLIvUIMICDc0+3doiiaqGPYhte2gO+eF/b6Jm9amKwd6yJ2WTNtyHCDg== X-Gm-Message-State: AOJu0Yx8ucUcl7TW1BYsGn+x3V60/2XIKwCkC8yzt6ESWkUhu2J4ar2x Fdnv2p+lPgbA+hVn59Q9OfhTLaSaovt3ekl/3H29KiRp2liy2WXQrrgYNIVw2cXOLbIMBdrvLk8 = X-Google-Smtp-Source: AGHT+IE1VIH1CGtYK32uvcDpBv4Mw4xUcTVp/uzzgycrgP/Fq+DyFskPXqhlN3AzA3zR1oY8xJlnVg== X-Received: by 2002:adf:f6c3:0:b0:34a:80ed:9930 with SMTP id y3-20020adff6c3000000b0034a80ed9930mr6129693wrp.8.1714396133984; Mon, 29 Apr 2024 06:08:53 -0700 (PDT) Received: from [10.156.60.236] (ip-037-024-206-209.um08.pools.vodafone-ip.de. [37.24.206.209]) by smtp.gmail.com with ESMTPSA id cg3-20020a5d5cc3000000b00349bd105089sm29333484wrb.47.2024.04.29.06.08.53 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Mon, 29 Apr 2024 06:08:53 -0700 (PDT) Message-ID: Date: Mon, 29 Apr 2024 15:08:55 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH 2/3] x86: Drop SwapSources To: "Cui, Lili" Cc: "hjl.tools@gmail.com" , "binutils@sourceware.org" References: <20240424072356.2433122-1-lili.cui@intel.com> <20240424072356.2433122-3-lili.cui@intel.com> <8549a5b2-d8b1-451e-90c1-74ee29dadf3a@suse.com> <11644d0a-1b51-4c31-82ac-92bb39fd59df@suse.com> Content-Language: en-US From: Jan Beulich Autocrypt: addr=jbeulich@suse.com; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3024.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_NONE,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 29.04.2024 14:23, Cui, Lili wrote: >> On 28.04.2024 06:47, Cui, Lili wrote: >>>> On 26.04.2024 10:14, Cui, Lili wrote: >>>>>> On 24.04.2024 09:23, Cui, Lili wrote: >>>>>>> --- a/gas/config/tc-i386.c >>>>>>> +++ b/gas/config/tc-i386.c >>>>>>> @@ -10434,6 +10434,14 @@ build_modrm_byte (void) >>>>>>> >>>>>>> switch (i.tm.opcode_modifier.vexvvvv) >>>>>>> { >>>>>>> + case VexVVVV_SRC2: >>>>>>> + if (source != op) >>>>>>> + { >>>>>>> + v = source++; >>>>>>> + break; >>>>>>> + } >>>>>>> + /* For XOP: vpshl* and vpsha*. */ >>>>>>> + /* Fall through. */ >>>>>>> case VexVVVV_SRC1: >>>>>> >>>>>> This falling-through is odd and hence needs a better comment (then >>>>>> also covering vprot*, which afaict is similarly affected). The >>>>>> reason for this is the XOP.W-controlled operand swapping, if I'm >>>>>> not mistaken? In which case perhaps instead of the fall-through >>>>>> here the logic swapping the operands should replace VexVVVV_SRC2 by >>>> VexVVVV_SRC1? >>>>>> >>>>> >>>>> Yes, vprot* should be included, and it is related to >>>>> XOP.W-controlled >>>> operand swapping, the comments says " /* Only the first two register >>>> operands need reversing, alongside flipping VEX.W. */ ", But there >>>> is actually a memory operand, not two register operands. >>>>> >>>>> I think VexVVVV_SRC2 makes more sense here, it matches the actual >>>> situation, we want to use vvvv to encode the first operand. >>>>> >>>>> Opcode table: >>>>> vprot, 0x90 | , XOP, >>>>> D|Modrm|Vex128|SpaceXOP09|VexVVVV_Src2|VexW0|NoSuf, >> { RegXMM, >>>>> RegXMM|Unspecified|BaseIndex, RegXMM } >>>>> >>>>> testcase: >>>>> vprotb (%rax),%xmm12,%xmm15 >>>>> vprotb %xmm15,(%r12),%xmm0 >>>> >>>> VexVVVV_Src2 is appropriate for the latter, yes, but not for the >>>> former. That uses VexVVVV_Src1 layout. Hence my suggestion to replace >>>> the attribute when swapping operands. >>>> >>> >>> If replace the Src2VVVV| VexW0 with Src1VVVV| VexW1 and swapping >> operands. We can put VexVVVV_SRC1 before VexVVVV_SRC2, but we still >> need to add "(!is_cpu (&i.tm, CpuXOP) || source == op" under >> VexVVVV_SRC1 , and match_template also needs to be adjusted (I made a >> simple modification and it still failed, I think continuing like this may go >> against the original intention). >>> >>> switch (i.tm.opcode_modifier.vexvvvv) >>> { >>> /* VEX.vvvv encodes the first source register operand. */ >>> case VexVVVV_SRC1: >>> if (!is_cpu (&i.tm, CpuXOP) || source == op) >>> { >>> v = dest - 1; >>> break; >>> } >>> /* For XOP: vpshl*, vpsha* and vprot*. */ >>> /* Fall through. */ >>> /* VEX.vvvv encodes the last source register operand. */ >>> case VexVVVV_SRC2: >>> v = source++; >>> break; >>> /* VEX.vvvv encodes the destination register operand. */ >>> case VexVVVV_DST: >>> v = dest--; >>> break; >>> default: >>> v = ~0; >>> break; >>> } >>> >>> Do you think we should add a separate patch 4 for XOP that removes the >> special handling in match_template and completes its template? so we don't >> have to add special handling for src1vvvv or src2vvvv. This might go against >> your desire to reduce template size, but it would help simplify the logic. I'd >> like to know your thoughts. >> >> Indeed. You'd effectively revert earlier folding that I did. And the adjustment I >> suggested earlier ought to be small/simple enough. >> > > So, I continued working on the previous suggestion. With the following modification and it worked. > > @@ -8932,7 +8932,7 @@ match_template (char mnem_suffix) > || is_cpu (t, CpuAPX_F)); > if (!operand_type_match (overlap0, i.types[0]) > || !operand_type_match (overlap1, i.types[j]) > - || (t->operands == 3 > + || (t->operands == 3 && !is_cpu (t, CpuXOP) > && !operand_type_match (overlap2, i.types[1])) > || (check_register > && !operand_type_register_match (i.types[0], Just to mention it - this certainly isn't what I suggested. In fact I seem to vaguely recall that something similar was once proposed during the original APX work as well, where I then objected, too. > But I found that there are 4 test files that failed, I didn't find the doc on how to encode vprotb but I guess that is because I changed the default template from Src2VVVV| VexW0 to Src1VVVV| VexW1, then all the related test cases needed to be modified. Do you have any comments here? > > regexp "^[ ]*[a-f0-9]+: 8f e9 40 90 d8[ ]+vprotb %xmm7,%xmm0,%xmm3$" > line " 11ed: 8f e9 f8 90 df vprotb %xmm7,%xmm0,%xmm3" Well, while changing the templates is in principle possible, and the resulting code would still be correct, changing encodings it usually not a good idea. Thus when it can be avoided, it should be avoided, imo. Hence why I didn't suggest this, but to amend the code doing the operand swapping (for the case where operand order is controlled by XOP.W). Jan