Message-ID: <8e2d8a02-1521-5fa2-d97f-3de4c997818f@suse.com>
Date: Mon, 17 Oct 2022 09:35:05 +0200
Subject: Re: [PATCH 09/10] Support Intel AMX-FP16
To: Haochen Jiang
Cc: hjl.tools@gmail.com, "Cui,Lili", binutils@sourceware.org
References: <20221014091248.4920-1-haochen.jiang@intel.com> <20221014091248.4920-10-haochen.jiang@intel.com>
From: Jan Beulich
In-Reply-To: <20221014091248.4920-10-haochen.jiang@intel.com>

On 14.10.2022 11:12, Haochen Jiang wrote:
> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -933,6 +933,7 @@ enum
>    MOD_VEX_0F384B_X86_64_P_3_W_0,
>    MOD_VEX_0F385A,
>    MOD_VEX_0F385C_X86_64_P_1_W_0,
> +  MOD_VEX_0F385C_X86_64_P_3_W_0,
>    MOD_VEX_0F385E_X86_64_P_0_W_0,
>    MOD_VEX_0F385E_X86_64_P_1_W_0,
>    MOD_VEX_0F385E_X86_64_P_2_W_0,
> @@ -1399,6 +1400,7 @@ enum
>    VEX_LEN_0F384B_X86_64_P_3_W_0_M_0,
>    VEX_LEN_0F385A_M_0,
>    VEX_LEN_0F385C_X86_64_P_1_W_0_M_0,
> +  VEX_LEN_0F385C_X86_64_P_3_W_0_M_0,
>    VEX_LEN_0F385E_X86_64_P_0_W_0_M_0,
>    VEX_LEN_0F385E_X86_64_P_1_W_0_M_0,
>    VEX_LEN_0F385E_X86_64_P_2_W_0_M_0,
> @@ -1565,6 +1567,7 @@ enum
>    VEX_W_0F3859,
>    VEX_W_0F385A_M_0_L_0,
>    VEX_W_0F385C_X86_64_P_1,
> +  VEX_W_0F385C_X86_64_P_3,
>    VEX_W_0F385E_X86_64_P_0,
>    VEX_W_0F385E_X86_64_P_1,
>    VEX_W_0F385E_X86_64_P_2,
> @@ -4088,6 +4091,7 @@ static const struct dis386 prefix_table[][4] = {
>      { Bad_Opcode },
>      { VEX_W_TABLE (VEX_W_0F385C_X86_64_P_1) },
>      { Bad_Opcode },
> +    { VEX_W_TABLE (VEX_W_0F385C_X86_64_P_3) },
>    },
>
>    /* PREFIX_VEX_0F385E_X86_64 */
> @@ -7120,6 +7124,11 @@ static const struct dis386 vex_len_table[][2] = {
>      { "tdpbf16ps", { TMM, EXtmm, VexTmm }, 0 },
>    },
>
> +  /* VEX_LEN_0F385C_X86_64_P_3_W_0_M_0 */
> +  {
> +    { "tdpfp16ps", { TMM, EXtmm, VexTmm }, 0 },
> +  },
> +
>    /* VEX_LEN_0F385E_X86_64_P_0_W_0_M_0 */
>    {
>      { "tdpbuud", {TMM, EXtmm, VexTmm }, 0 },
> @@ -7788,6 +7797,10 @@ static const struct dis386 vex_w_table[][2] = {
>      /* VEX_W_0F385C_X86_64_P_1 */
>      { MOD_TABLE (MOD_VEX_0F385C_X86_64_P_1_W_0) },
>    },
> +  {
> +    /* VEX_W_0F385C_X86_64_P_3 */
> +    { MOD_TABLE (MOD_VEX_0F385C_X86_64_P_3_W_0) },
> +  },
>    {
>      /* VEX_W_0F385E_X86_64_P_0 */
>      { MOD_TABLE (MOD_VEX_0F385E_X86_64_P_0_W_0) },
> @@ -8610,6 +8623,11 @@ static const struct dis386 mod_table[][2] = {
>      { Bad_Opcode },
>      { VEX_LEN_TABLE (VEX_LEN_0F385C_X86_64_P_1_W_0_M_0) },
>    },
> +  {
> +    /* MOD_VEX_0F385C_X86_64_P_3_W_0 */
> +    { Bad_Opcode },
> +    { VEX_LEN_TABLE (VEX_LEN_0F385C_X86_64_P_3_W_0_M_0) },
> +  },
>    {
>      /* MOD_VEX_0F385E_X86_64_P_0_W_0 */
>      { Bad_Opcode },
> diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
> index eac229e54d..d10b462548 100644
> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -259,6 +259,8 @@ static initializer cpu_flag_init[] =
>      "CpuWRMSRNS" },
>    { "CPU_MSRLIST_FLAGS",
>      "CpuMSRLIST" },
> +  { "CPU_AMX_FP16_FLAGS",
> +    "CpuAMX_FP16" },
>    { "CPU_IAMCU_FLAGS",
>      "Cpu186|Cpu286|Cpu386|Cpu486|Cpu586|CpuIAMCU" },
>    { "CPU_ADX_FLAGS",

Can you please insert this next to the other, similar AMX entries? Seeing
the flaw here, I'll be making a patch to address the lack of
CPU_AMX_TILE_FLAGS in the similar pre-existing entries. When you move the
insertion, it'll be easier to keep things in sync.

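To make the dependency behind the CPU_AMX_TILE_FLAGS remark concrete, here is a minimal sketch in C. It assumes the intent is that enabling one of the dependent AMX features also turns on AMX-TILE, while disabling AMX-TILE turns off everything built on top of it. The bit names and helper functions are hypothetical, chosen for illustration only; they are not the identifiers or data structures i386-gen.c actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only.  */
enum
{
  F_AMX_TILE = 1u << 0,
  F_AMX_INT8 = 1u << 1,
  F_AMX_BF16 = 1u << 2,
  F_AMX_FP16 = 1u << 3
};

/* Features that (by assumption) require AMX-TILE.  */
#define F_AMX_DEPENDENTS (F_AMX_INT8 | F_AMX_BF16 | F_AMX_FP16)

/* Bits to set when FEATURE is enabled: the feature plus its prerequisite.  */
static uint32_t
enable_bits (uint32_t feature)
{
  assert (feature != 0);
  return (feature & F_AMX_DEPENDENTS) ? (feature | F_AMX_TILE) : feature;
}

/* Bits to clear when FEATURE is disabled: the feature plus its dependents.  */
static uint32_t
disable_bits (uint32_t feature)
{
  assert (feature != 0);
  return (feature & F_AMX_TILE) ? (feature | F_AMX_DEPENDENTS) : feature;
}

/* Apply an "enable FEATURE" request to the ACTIVE feature set.  */
static uint32_t
arch_enable (uint32_t active, uint32_t feature)
{
  return active | enable_bits (feature);
}

/* Apply a "disable FEATURE" request to the ACTIVE feature set.  */
static uint32_t
arch_disable (uint32_t active, uint32_t feature)
{
  return active & ~disable_bits (feature);
}
```

In this model, enabling only the FP16 bit yields both the TILE and FP16 bits, and disabling the TILE bit clears all four, which mirrors the role of the "ANY" masks in the quoted table.
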
> @@ -426,7 +428,7 @@ static initializer cpu_flag_init[] =
>    { "CPU_ANY_AMX_BF16_FLAGS",
>      "CpuAMX_BF16" },
>    { "CPU_ANY_AMX_TILE_FLAGS",
> -    "CpuAMX_TILE|CpuAMX_INT8|CpuAMX_BF16" },
> +    "CpuAMX_TILE|CpuAMX_INT8|CpuAMX_BF16|CpuAMX_FP16" },
>    { "CPU_ANY_AVX_VNNI_FLAGS",
>      "CpuAVX_VNNI" },
>    { "CPU_ANY_MOVDIRI_FLAGS",
> @@ -467,6 +469,8 @@ static initializer cpu_flag_init[] =
>      "CpuWRMSRNS" },
>    { "CPU_ANY_MSRLIST_FLAGS",
>      "CpuMSRLIST" },
> +  { "CPU_ANY_AMX_FP16_FLAGS",
> +    "CpuAMX_FP16" },
>  };

Same here then.

> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -223,6 +223,8 @@ enum
>    CpuWRMSRNS,
>    /* Intel MSRLIST Instructions support required. */
>    CpuMSRLIST,
> +  /* AMX-FP16 instructions required */
> +  CpuAMX_FP16,

This (and the related stuff) may also benefit from grouping with the
other AMX ones.

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3339,3 +3339,9 @@ rdmsrlist, 0xf20f01c6, None, CpuMSRLIST|Cpu64, No_bSuf|No_wSuf|No_lSuf|No_sSuf|N
>  wrmsrlist, 0xf30f01c6, None, CpuMSRLIST|Cpu64, No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, {}
>
>  // MSRLIST instructions end.
> +
> +// AMX-FP16 instructions.
> +
> +tdpfp16ps, 0xf25c, None, CpuAMX_FP16|Cpu64, Modrm|Vex128|Space0F38|VexVVVV=1|VexW0|SwapSources|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegTMM, RegTMM, RegTMM }

As before - plain VexVVVV preferably (without the =1), irrespective of
the already present AMX entries still using the less preferred form.

> +// AMX-FP16 instructions end.

Nit (again): Perhaps better to use the singular? And as above - perhaps put
this next to the other AMX entries? Note how they are all in a single
group, despite there being 3 separate feature bits. So I guess you will
want to insert exactly one line below tdpbf16ps. That way the similarity
between the two is also going to be easiest to see, check, and maintain.

Jan
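
As an illustration of the similarity between tdpbf16ps and tdpfp16ps noted above, the following C sketch builds the register-only byte sequence for both. It assumes the encodings VEX.128.F3.0F38.W0 5C /r for tdpbf16ps and VEX.128.F2.0F38.W0 5C /r for tdpfp16ps (the latter matching the 0xf25c / Space0F38 / Vex128 / VexW0 fields of the quoted table line), with the destination in ModRM.reg, the first source in ModRM.r/m and the second source in VEX.vvvv. The function and enum names are hypothetical and have no counterpart in the binutils sources.

```c
#include <assert.h>
#include <stdint.h>

/* VEX.pp values selecting the implied mandatory prefix.  */
enum vex_pp { PP_NONE = 0, PP_66 = 1, PP_F3 = 2, PP_F2 = 3 };

/* Encode VEX.128.<pp>.0F38.W0 5C /r for tmm0..tmm7, using Intel operand
   order (dst, src1, src2).  Writes exactly 5 bytes to OUT.  */
static void
encode_tdp (enum vex_pp pp, unsigned dst, unsigned src1, unsigned src2,
            uint8_t out[5])
{
  assert (dst < 8 && src1 < 8 && src2 < 8);
  out[0] = 0xc4;                    /* 3-byte VEX escape */
  out[1] = 0xe0 | 0x02;             /* inverted R/X/B all set, map 0F38 */
  /* W=0, inverted vvvv holds src2, L=0 (128-bit), then pp.  */
  out[2] = (uint8_t) (((~src2 & 0xf) << 3) | (unsigned) pp);
  out[3] = 0x5c;                    /* opcode byte shared by both insns */
  /* ModRM: mod=11 (register form), reg=dst, r/m=src1.  */
  out[4] = (uint8_t) (0xc0 | (dst << 3) | src1);
}
```

Calling encode_tdp with PP_F3 and then PP_F2 for the same operands (for example dst = 3, src1 = 4, src2 = 5) produces two sequences that differ only in the third byte, 0x52 versus 0x53, i.e. only in the pp field.
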