From: Jan Beulich
Date: Tue, 2 Apr 2024 10:48:40 +0200
Subject: Re: [PATCH 4/5] x86/APX: extend SSE2AVX coverage
To: "Cui, Lili"
Cc: "H.J. Lu", Binutils
References: <155929a3-eb8b-4b82-a4ca-84ab6de34b97@suse.com>

On 29.03.2024 10:10, Cui, Lili wrote:
>> Legacy encoded SIMD insns are converted to AVX ones in that mode. When
>> eGPR-s are in use, i.e. with APX, convert to AVX10 insns (where
>> available; there are quite a few which can't be converted).
>>
>> For GFNI alter the gfni template such that VexW would be emitted even
>> for the SSE templates: There the attribute is simply meaningless, but
>> it simplifies the template quite a bit.
>>
>
> For this part, although adding VexW to the SSE template is more concise, it also breaks the rules and creates hidden dangers, it feels a bit unworthy. GFNI SSE does not support eGPR-s, I'm not sure if we should give it an error instead of converting it.

We should convert whatever's possible to convert. I'll re-consider the
VexW part following your comment (without promising that I'll undo it;
in particular I don't see any hidden dangers).

>> Note that LDDQU is represented as VMOVDQU32 (and the prior use of the
>> sse3 template needs dropping, to get the order right).
>
> This conversion is clever, although the mnemonic has changed, but considering it is controlled by -msse2avx, maybe we can mention in the option that it might change the mnemonic. Judging from the option name alone, it is difficult for users to predict that the mnemonic will change (traditionally, it seems to just add V).

I don't think doc adjustment is needed here. We already have at least
one example where the mnemonic also changes: CVTPI2PD -> VCVTDQ2PD.

>> I'm tempted to "convert" legacy encoded insns in maps 2 and 3 even
>> without -msse2avx. Thoughts?
>
> I was a little worried about this conversion, so I asked a few people for their opinions, they found this approach a bit unacceptable, here are some ideas I collected.
>
> 1. The compiler will do this during the backend instruction selection phase. Binutils should only do instruction translation, not instruction selection.
> 2. We can only convert some instructions, not all instructions. When users use eGPR-s illegally, some will report an error, while others will not, which is very confusing.
> 3. Binutils needs to report errors for illegal instructions to ensure the correctness of the compiler.
> 4. I don't know if there are any machines in the future that don't expect to generate EVEX instructions.

Okay, I'll bin this (vague) plan then.

>> What about SHA and KeyLocker insns not using eGPR-s? Their legacy
>> encodings could be replaced by EVEX ones, too, provided that's a gain:
>> Version 003 of the doc doesn't clarify whether, like other VEX/EVEX
>> insns and unlike legacy ones, register bits beyond bit 127 would be
>> cleared. That's the whole purpose of the SSE2AVX insns, after all. Yet
>> of course there's the problem here that then such insns (not using any
>> eGPR in their operands) would suddenly gain a dependency of the
>> resulting code on APX_F (and not AVX512* / AVX10). Perhaps for these
>> we'd really need -msse2apx then.
>
> If the CPU does not support the avx512 instruction, Binutils directly convert it, which will cause segment fault. It does seem like a new option is needed. These two instructions may change in the future, we can wait and see if this is necessary.

Well, this has resolved itself by the insn groups having been removed in
version 4 of the spec.

>> Should we also convert %xmm-only templates (to consistently permit
>> use of {evex})? Or should we reject use of {evex}, but then also that of
>> {vex}/{vex3}?
>
> Do you mean SHA and KeyLocker?

No, I mean templates with all XMM operands and no memory ones. Such
don't use eGPR-s, yet could be converted to their EVEX counterparts, too
(by way of the programmer adding {evex} to the _legacy_ insn). Hence the
question on how to treat {evex} there, and then also {vex} / {vex3}.
Take, for example, MOVHLPS or MOVLHPS.

Jan
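
For readers following the thread, the conversion rule under discussion
(legacy insn -> VEX form under -msse2avx, or -> EVEX form once an eGPR
appears, with an error where no EVEX form exists) can be sketched as a
toy model. This is only an illustration, not gas code: the table, the
regular expression, and the function name sse2avx_mnemonic are all
hypothetical, operand rewriting is not modelled, and the assumption that
LDDQU turns into VMOVDQU32 only when EVEX is required is one reading of
the patch description.

```python
import re

# Hypothetical table: legacy mnemonic -> (VEX mnemonic, EVEX mnemonic or None).
# None means no EVEX form is modelled, so the insn cannot take an eGPR.
SSE2AVX_TABLE = {
    "paddd":  ("vpaddd",  "vpaddd"),
    "lddqu":  ("vlddqu",  "vmovdqu32"),  # assumption: renamed only for the EVEX case
    "phaddw": ("vphaddw", None),         # VPHADDW has no EVEX encoding
}

# APX extended GPRs are %r16..%r31, optionally with a b/w/d size suffix.
EGPR_RE = re.compile(r"%r(1[6-9]|2[0-9]|3[01])[bwd]?\b")


def sse2avx_mnemonic(mnemonic: str, operands: str) -> str:
    """Return the mnemonic a legacy SIMD insn would be converted to.

    Only the mnemonic choice is modelled; turning the two-operand legacy
    form into the three-operand VEX/EVEX form is deliberately left out.
    """
    vex, evex = SSE2AVX_TABLE[mnemonic]
    if not EGPR_RE.search(operands):
        return vex  # no eGPR in use: the plain VEX conversion suffices
    if evex is None:
        raise ValueError(f"{mnemonic}: eGPR used but no EVEX form to convert to")
    return evex


if __name__ == "__main__":
    print(sse2avx_mnemonic("paddd", "(%rax), %xmm0"))
    print(sse2avx_mnemonic("lddqu", "(%r16), %xmm0"))
```

The error branch corresponds to the "quite a few which can't be
converted" remark in the patch description; whether such cases should
be diagnosed or handled differently is what the thread debates.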