Message-ID: <3000f16b-7471-44c4-b0e1-5458c0aba054@suse.com>
Date: Thu, 25 Apr 2024 09:22:24 +0200
Subject: Re: [PATCH v2 2/4] x86/APX: extend SSE2AVX coverage
To: "Cui, Lili"
Cc: "H.J. Lu", Binutils
References: <1f66d44d-4185-48d8-ac74-edb92d372757@suse.com>
From: Jan Beulich

On 25.04.2024 08:09, Cui, Lili wrote:
>> Legacy encoded SIMD insns are converted to AVX ones in that mode. When
>> eGPR-s are in use, i.e. with APX, convert to AVX10 insns (where
>> available; there are quite a few which can't be converted).
>>
>> Note that LDDQU is represented as VMOVDQU32 (and the prior use of the
>> sse3 template there needs dropping, to get the order right).
>>
>> Note further that in a few cases, due to the use of templates, AVX512VL
>> is used when AVX512F would suffice. Since AVX10 is the main reference,
>> this shouldn't be too much of a problem.
>> ---
>> To preempt the question: If we weren't to do this (i.e. leave legacy-
>> encoded SIMD insns using eGPR-s alone), I'd raise the counter question
>> of why these insns are supported by APX then in the first place.
>>
>> By using a mask register (which supposedly shouldn't be used by legacy
>> SIMD code) we could likely convert further insns (by emitting a pair of
>> replacement ones).
>
> Do you mean you want to allow adding "Masking" to legacy SIMD code?

No, as per the explanation in an earlier reply. Just like original SSE2AVX
also doesn't permit use of YMM registers in legacy code.
We only want to replace what is (in principle) legal legacy code ("in
principle" because we deliberately want to cover insns in 0f38 and 0f3a
space, which cannot be expressed by legacy - i.e. REX2 - encodings).

> Like you did with Disp8MemShift?

That's entirely different. AVX512 insns _have_ to be encoded taking this
aspect into account. That's nothing the user can (optionally) ask for.
Instead what this remark puts up as a question is whether we want to
synthesize further legacy insns by expanding them to multiple (perhaps no
more than two) AVX512 / AVX10 ones, using a mask register as an
intermediate (on the assumption that such legacy code shouldn't be using
mask registers at all). E.g.

	cmpps (%r31), %xmm1

can, afaict, be expressed as

	vcmpps (%r31), %xmm1, %k0
	vpmovm2d %k0, %xmm1

In a few other cases an intermediate (mask) register may not even be
needed, as e.g.

	aesimc (%r31), %xmm1

can apparently be decomposed to

	vmovdqu32 (%r31), %xmm1
	vaesimc %xmm1, %xmm1, %xmm1

Yet the only other one this would also be possible for appears to be
aeskeygenassist. Plus of course there are still ample insns which don't
look to be expressible by just two other insns.

Jan
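
To make the shape of such an expansion concrete, here is a rough sketch of
the two examples above as a rewrite table. It is purely illustrative and
not how gas implements SSE2AVX: the names EXPANSIONS, uses_egpr() and
expand() are made up, the operand forms are copied from the examples as
written (not checked against an assembler), and it assumes AT&T syntax
with a memory source, an XMM destination, and %k0 free for use as scratch.

```python
import re

# Hypothetical rewrite table (not gas code): legacy mnemonic -> templates.
# {src} is the memory source, {dst} the XMM destination, {k} a scratch mask
# register that legacy SIMD code is assumed not to use.
EXPANSIONS = {
    "cmpps": ["vcmpps {src}, {dst}, {k}", "vpmovm2d {k}, {dst}"],
    "aesimc": ["vmovdqu32 {src}, {dst}", "vaesimc {dst}, {dst}, {dst}"],
}

# eGPRs are r16..r31 (any width suffix); only those force the expansion.
EGPR = re.compile(r"%r(1[6-9]|2[0-9]|3[01])")


def uses_egpr(operand):
    """Return True if the operand text references an APX eGPR."""
    return EGPR.search(operand) is not None


def expand(insn, scratch_mask="%k0"):
    """Return the replacement insn(s) for one legacy insn, as strings."""
    text = insn.strip()
    parts = text.split(None, 1)
    if len(parts) != 2:
        return [text]
    mnemonic, operands = parts
    # Split at the last comma so memory operands like (%r31,%r30,2) survive.
    src, _, dst = operands.rpartition(",")
    src, dst = src.strip(), dst.strip()
    templates = EXPANSIONS.get(mnemonic)
    if templates is None or not uses_egpr(src):
        return [text]  # left alone in this sketch
    return [t.format(src=src, dst=dst, k=scratch_mask) for t in templates]


for line in expand("cmpps (%r31), %xmm1"):
    print(line)
```

Insns not in the table, or ones not touching an eGPR, are passed through
unchanged here; the latter would take the ordinary SSE2AVX path, which the
sketch doesn't model.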