From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <863655db-f202-477f-c638-00773c25886c@suse.com>
Date: Fri, 14 Oct 2022 11:52:34 +0200
Subject: Re: [PATCH 01/10] Support Intel AVX-IFMA
To: Haochen Jiang
Cc: hjl.tools@gmail.com, wwwhhhyyy, binutils@sourceware.org
References: <20221014091248.4920-1-haochen.jiang@intel.com> <20221014091248.4920-2-haochen.jiang@intel.com>
From: Jan Beulich
In-Reply-To: <20221014091248.4920-2-haochen.jiang@intel.com>

On 14.10.2022 11:12, Haochen Jiang wrote:
> From: wwwhhhyyy
>
> x86: Support Intel AVX-IFMA
>
> Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
> cleared by default. Without {vex} pseudo prefix, Intel IFMA instructions
> are encoded with EVEX prefix. {vex} pseudo prefix will turn on VEX
> encoding for Intel IFMA instructions.

I firmly object to the proliferation of this mis-feature. As expressed
before for AVX-VNNI, as long as the user has disabled AVX512 (or respective
sub-features thereof), there should be no need to use {vex} in the source
code. There's also no reason at all to make the disassembler print {vex}
prefixes - we don't do so for any other insns (apart from AVX-VNNI) where
an ambiguity exists between their VEX and EVEX encodings (when none of the
EVEX-specific features is used).
I actually have a patch queued to undo the odd behavior for AVX-VNNI, at
least on the assembler side (which also drops the PseudoVexPrefix
attribute).

> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -1526,6 +1526,8 @@ enum
>    VEX_W_0F385E_X86_64_P_3,
>    VEX_W_0F3878,
>    VEX_W_0F3879,
> +  VEX_W_0F38B4,
> +  VEX_W_0F38B5,
>    VEX_W_0F38CF,
>    VEX_W_0F3A00_L_1,
>    VEX_W_0F3A01_L_1,
> @@ -6293,8 +6295,8 @@ static const struct dis386 vex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { VEX_W_TABLE (VEX_W_0F38B4) },
> +    { VEX_W_TABLE (VEX_W_0F38B5) },
>      { "vfmaddsub231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
>      { "vfmsubadd231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
>      /* b8 */
> @@ -7599,6 +7601,16 @@ static const struct dis386 vex_w_table[][2] = {
>      /* VEX_W_0F3879 */
>      { "vpbroadcastw", { XM, EXw }, PREFIX_DATA },
>    },
> +  {
> +    /* VEX_W_0F38B4 */
> +    { Bad_Opcode },
> +    { "%XV vpmadd52luq", { XM, Vex, EXx }, PREFIX_DATA },
> +  },
> +  {
> +    /* VEX_W_0F38B5 */
> +    { Bad_Opcode },
> +    { "%XV vpmadd52huq", { XM, Vex, EXx }, PREFIX_DATA },
> +  },

Irrespective of the aspect mentioned at the top, I think this is yet another
case where VEX and EVEX table entries can be shared. This would (if the
{vex} printing really needs retaining for whatever obscure reason) merely
require the processing of %XV to do nothing for EVEX-encoded insns, plus of
course the separating blank would then also need to be included in the
processing of %XV. I guess I'll make a patch to fold the AVX-VNNI and
AVX512-VNNI entries, which you could then re-base on top of.
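
To make the %XV suggestion concrete, here is a rough, stand-alone sketch.
It is not the actual i386-dis.c code: expand_template(), enum encoding and
the buffer handling are all made up for illustration. Only the %XV escape
and the "{vex}" output come from the patch under discussion. The point is
that a single template (without an embedded blank) can serve both encodings
if %XV emits "{vex} " for VEX and nothing at all for EVEX.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the disassembler's notion of how the
   current insn is encoded.  */
enum encoding { ENC_VEX, ENC_EVEX };

/* Copy TMPL to OUT (at most OUTSZ bytes including the terminator),
   expanding the %XV escape.  For VEX-encoded insns %XV becomes
   "{vex} " (prefix plus the separating blank); for EVEX-encoded
   insns it expands to nothing.  */
static void
expand_template (const char *tmpl, enum encoding enc,
                 char *out, size_t outsz)
{
  size_t n = 0;

  assert (tmpl != NULL && out != NULL);
  if (outsz == 0)
    return;

  while (*tmpl != '\0' && n + 1 < outsz)
    {
      if (strncmp (tmpl, "%XV", 3) == 0)
        {
          if (enc == ENC_VEX)
            {
              int w = snprintf (out + n, outsz - n, "{vex} ");

              /* Stop if the prefix does not fit.  */
              if (w < 0 || (size_t) w >= outsz - n)
                break;
              n += (size_t) w;
            }
          tmpl += 3;
          continue;
        }
      out[n++] = *tmpl++;
    }
  out[n] = '\0';
}
```

With a shared template of "%XVvpmadd52luq", calling expand_template() with
ENC_VEX yields the {vex}-prefixed mnemonic, while ENC_EVEX yields the bare
mnemonic, so no separate VEX-only table entry would be needed.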

> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -245,6 +245,8 @@ static initializer cpu_flag_init[] =
>      "CPU_AVX512F_FLAGS|CpuAVX512_BF16" },
>    { "CPU_AVX512_FP16_FLAGS",
>      "CPU_AVX512BW_FLAGS|CpuAVX512_FP16" },
> +  { "CPU_AVX_IFMA_FLAGS",
> +    "CPU_AVX2_FLAGS|CpuAVX_IFMA" },
>    { "CPU_IAMCU_FLAGS",
>      "Cpu186|Cpu286|Cpu386|Cpu486|Cpu586|CpuIAMCU" },
>    { "CPU_ADX_FLAGS",
> @@ -439,6 +441,8 @@ static initializer cpu_flag_init[] =
>      "CpuHRESET" },
>    { "CPU_ANY_AVX512_FP16_FLAGS",
>      "CpuAVX512_FP16" },
> +  { "CPU_ANY_AVX_IFMA_FLAGS",
> +    "CpuAVX_IFMA" },

If AVX2 is taken as a prereq feature, then CPU_ANY_AVX2_FLAGS also needs
adjustment, such that disabling of AVX2 also results in disabling of
AVX-IFMA. (The same issue actually exists for AVX-VNNI afaics.)

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3263,3 +3263,10 @@ vrsqrtph, 0x664e, None, CpuAVX512_FP16, Modrm|Masking=3|EVexMap6|VexW0|Broadcast
>  vrsqrtsh, 0x664f, None, CpuAVX512_FP16, Modrm|EVexLIG|Masking=3|EVexMap6|VexVVVV|VexW0|Disp8MemShift=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
>
>  // FP16 (HFNI) instructions end.
> +
> +// AVX_IFMA instructions.

Nit: Perhaps better use AVX-IFMA here, but I see we're having many examples
of the (needless) use of underscores like this.

> +vpmadd52huq, 0x66B5, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> +vpmadd52luq, 0x66B4, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }

Please use plain VexVVVV (without =1) - we want to have as little clutter
as possible on these usually already overlong lines.

Jan