From: Jan Beulich
Date: Thu, 31 Aug 2023 09:18:08 +0200
Subject: Re: [PATCH 5/5] x86: support AVX10.1 vector size restrictions
To: "Jiang, Haochen", "H.J. Lu"
Cc: Binutils
References: <6f819651-36c0-1c69-8224-fe21f0f96a3f@suse.com> <990c83c3-0776-efdd-e162-5c367f4ebdc2@suse.com> <1b1b6e37-9484-95d8-d63d-c586a064729f@suse.com>

On 31.08.2023 07:56, Jiang, Haochen wrote:
>>>>> I agree it isn't necessary, but as expressed before I view it as desirable.
>>>>> Apart from the sentence you quoted the spec later also says "There are
>>>>> currently no plans to support an Intel AVX10/128 implementation." For my
>>>>> choice of also supporting the 128-bit restriction I'd like to put emphasis
>>>>> on "currently". I think I said before that emulation environments (qemu,
>>>>> sde to name just two well-known examples) are free to implement such
>>>>> further restricted ISAs without then becoming out-of-spec.
>>>>>
>>>>> Plus supporting this mode right away has made me make certain adjustments
>>>>> in what I'd call more clean a way, which I view as desirable as well.
>>>>
>>>> Since AVX10 spec doesn't specify if mask registers should be limited to
>>>> 16 bits for AVX10/128, doing it in assembler is premature.
>>>
>>> It's hard to see why they would remain wider. The more that they were 16
>>> bits only in AVX512F.
>>>
>>> Plus of course nobody needs to use the options to enforce the 128-bit
>>> limit. The way I've coded it, it matches what the specification says.
>>>
>>
>> AVX10 spec only has
>>
>> Quadword opmask instructions will only be supported on processors
>> supporting vector lengths of 512 bits.
>>
>> It doesn't say anything about 32-bit mask. 32-bit mask can be useful
>> even with 16 byte vector.

How's that any different for 64-bit mask with 32-byte vector?

> The concern form my side is if there is an extreme case that overloads
> registers, we might need to spill 32-bit register to 32-bit mask register
> in the compiler.

How's that any different for spilling of 64-bit registers?

> Another minor concern is if there is finally a AVX10/128, although I do
> not see that could happen, if we get a wrong choice here, it will take
> some more time to correct the final assembler on the user side, which
> I mean on the real OS.
>
> However, I suppose both ok for me whether to allow 32-bit mask since
> AVX10/128 is nowhere near in the future and it is a toy code to play with.
> We could be some kind of conservative at first by just allowing 16-bit
> mask register. Also, the code change is quite easy and no much worry on
> changing that.

Exactly: I'd rather be overly restrictive initially (which people can easily
work around by using .arch suitably around individual insns) rather than
being too permissive and then failing to flag mistakes.

Jan