From: Jan Beulich
To: Binutils
Cc: "H.J. Lu"
Date: Tue, 19 Sep 2023 17:45:59 +0200
Message-ID: <4dacf33d-8770-775c-cfee-8741d159e08d@suse.com>
Subject: [PATCH v2 4/4] x86: fold F16C VEX and EVEX templates

Following the folding of some generic AVX/AVX2 templates with their
AVX512F counterpart ones, do this for F16C ones as well, requiring one
further adjustment to cpu_flags_match().

Note that there is a slight asymmetry with the FMA checks, resulting
from the various vector lengths having separate insn templates here, but
being a single combined one (each) for FMA.
---
TBD: Unlike for FMA the gains aren't as big yet the code changes are
     slightly bigger. The change may therefore be deemed to not be worth
     it.
---
v2: Eliminate unwanted side effect.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1926,9 +1926,12 @@ cpu_flags_match (const insn_template *t)
             {
               x.bitfield.cpuavx512f = 0;
               x.bitfield.cpuavx512vl = 0;
-              if (x.bitfield.cpufma && !cpu.bitfield.cpufma)
+              if ((x.bitfield.cpufma && !cpu.bitfield.cpufma)
+                  || (x.bitfield.cpuf16c && !cpu.bitfield.cpuf16c))
                 x.bitfield.cpuavx = 0;
             }
+          else if (cpu.bitfield.cpuf16c)
+            x.bitfield.cpuavx512vl = 0;
         }
     }
 
@@ -1953,6 +1956,8 @@ cpu_flags_match (const insn_template *t)
                    : cpu.bitfield.cpuavx)
                && (!x.bitfield.cpufma || cpu.bitfield.cpufma
                    || cpu_arch_flags.bitfield.cpuavx512f)
+               && (!x.bitfield.cpuf16c || cpu.bitfield.cpuf16c
+                   || cpu_arch_flags.bitfield.cpuavx512f)
                && (!x.bitfield.cpugfni || cpu.bitfield.cpugfni)
                && (!x.bitfield.cpuvaes || cpu.bitfield.cpuvaes)
                && (!x.bitfield.cpuvpclmulqdq || cpu.bitfield.cpuvpclmulqdq))
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1793,10 +1793,10 @@ rdgsbase, 0xf30fae/1, FSGSBase, Modrm|Ig
 rdrand, 0xfc7/6, RdRnd, Modrm|NoSuf, { Reg16|Reg32|Reg64 }
 wrfsbase, 0xf30fae/2, FSGSBase, Modrm|IgnoreSize|NoSuf, { Reg32|Reg64 }
 wrgsbase, 0xf30fae/3, FSGSBase, Modrm|IgnoreSize|NoSuf, { Reg32|Reg64 }
-vcvtph2ps, 0x6613, F16C, Modrm|Vex|Space0F38|VexW0|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vcvtph2ps, 0x6613, F16C, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { Unspecified|BaseIndex|RegXMM, RegYMM }
-vcvtps2ph, 0x661d, F16C, Modrm|Vex|Space0F3A|VexW0|NoSuf, { Imm8, RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
-vcvtps2ph, 0x661d, F16C, Modrm|Vex=2|Space0F3A|VexW=1|NoSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM }
+vcvtph2ps, 0x6613, F16C|AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexW0|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
+vcvtph2ps, 0x6613, F16C|AVX|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38|VexW0|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM }
+vcvtps2ph, 0x661D, F16C|AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F3A|VexW0|Disp8MemShift=3|NoSuf, { Imm8, RegXMM, RegXMM|Qword|Unspecified|BaseIndex }
+vcvtps2ph, 0x661D, F16C|AVX|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F3A|VexW0|Disp8MemShift=4|NoSuf, { Imm8, RegYMM, RegXMM|Unspecified|BaseIndex }
 
 // FMA instructions
 
@@ -2525,15 +2525,9 @@ vcvtdq2pd, 0xF3E6, AVX512F|AVX512VL, Mod
 vcvtudq2pd, 0xF37A, AVX512F|AVX512VL, Modrm|EVex128|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM }
 vcvtudq2pd, 0xF37A, AVX512F|AVX512VL, Modrm|EVex256|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=4|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 
-vcvtph2ps, 0x6613, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexW0|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
-vcvtph2ps, 0x6613, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM }
-
 vcvtps2pd, 0x5A, AVX512F|AVX512VL, Modrm|EVex128|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=3|NoSuf, { RegXMM|Dword|Qword|Unspecified|BaseIndex, RegXMM }
 vcvtps2pd, 0x5A, AVX512F|AVX512VL, Modrm|EVex256|Masking|Space0F|VexW0|Broadcast|Disp8MemShift=4|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegYMM }
 
-vcvtps2ph, 0x661D, AVX512F|AVX512VL, Modrm|EVex128|Masking|Space0F3A|VexW0|Disp8MemShift=3|NoSuf, { Imm8, RegXMM, RegXMM|Qword|Unspecified|BaseIndex }
-vcvtps2ph, 0x661D, AVX512F|AVX512VL, Modrm|EVex256|Masking|Space0F3A|VexW0|Disp8MemShift=4|NoSuf, { Imm8, RegYMM, RegXMM|Unspecified|BaseIndex }
-
 vmovddup, 0xF212, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F|VexW1|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
 
 vpmovdb, 0xF331, AVX512F|AVX512VL, Modrm|EVex=2|Masking|Space0F38|VexW0|Disp8MemShift=2|NoSuf, { RegXMM, RegXMM|Dword|Unspecified|BaseIndex }
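
For readers less familiar with cpu_flags_match(), the conjunct added by the
second tc-i386.c hunk can be read in isolation. The following is only a rough
sketch and not code from gas: struct cpu_bits and both function names are
hypothetical, only two feature bits are modelled, and the AND used to derive
cpu_f16c is an assumption standing in for how the real function builds its
"cpu" flag set. It shows that a folded template asking for F16C passes this
particular check when either F16C or AVX512F is enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for gas's i386_cpu_flags bitfield;
   only the two bits relevant to the new check are modelled.  */
struct cpu_bits
{
  bool avx512f;
  bool f16c;
};

/* Models only the conjunct added by the second tc-i386.c hunk.
   TMPL is what the template asks for, ENABLED is what is enabled.
   The AND below is an assumption standing in for the flag set that
   the real function calls "cpu".  */
bool
f16c_conjunct_ok (struct cpu_bits tmpl, struct cpu_bits enabled)
{
  bool cpu_f16c = tmpl.f16c && enabled.f16c;

  return !tmpl.f16c || cpu_f16c || enabled.avx512f;
}

/* Self-check covering the combinations of interest.  */
void
f16c_conjunct_selftest (void)
{
  struct cpu_bits folded = { .avx512f = true, .f16c = true };
  struct cpu_bits plain = { .avx512f = false, .f16c = false };

  /* F16C enabled: the VEX form is usable.  */
  assert (f16c_conjunct_ok (folded,
			    (struct cpu_bits) { .avx512f = false, .f16c = true }));
  /* Only AVX512F enabled: the check still passes.  */
  assert (f16c_conjunct_ok (folded,
			    (struct cpu_bits) { .avx512f = true, .f16c = false }));
  /* Neither enabled: the check fails.  */
  assert (!f16c_conjunct_ok (folded, plain));
  /* A template not asking for F16C is unaffected.  */
  assert (f16c_conjunct_ok (plain, plain));
}
```

This deliberately leaves out the AVX512VL handling from the first hunk, which
depends on surrounding code not shown in the diff context.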