From: Jan Beulich
To: "H.J. Lu"
Cc: binutils@sourceware.org
Date: Fri, 23 Jun 2023 07:59:14 +0200
Subject: Re: [PATCH] x86: Don't check if AVX512 template requires AVX512VL
References: <20230620164436.432481-1-hjl.tools@gmail.com> <6a45006a-773d-4592-f8b0-b3facc1aef7e@suse.com> <12cdea81-5722-277a-7199-dd2cc2e9285e@suse.com>
In-Reply-To: <12cdea81-5722-277a-7199-dd2cc2e9285e@suse.com>

On 22.06.2023 11:00, Jan Beulich wrote:
> On 21.06.2023 18:06, H.J. Lu wrote:
>> On Tue, Jun 20, 2023 at 11:52 PM Jan Beulich wrote:
>>>
>>> On 20.06.2023 18:44, H.J.
>>> Lu via Binutils wrote:
>>>> --- a/gas/config/tc-i386.c
>>>> +++ b/gas/config/tc-i386.c
>>>> @@ -6288,11 +6288,10 @@ check_VecOperands (const insn_template *t)
>>>>    /* Templates allowing for ZMMword as well as YMMword and/or XMMword for
>>>>       any one operand are implicity requiring AVX512VL support if the actual
>>>>       operand size is YMMword or XMMword.  Since this function runs after
>>>> -     template matching, there's no need to check for YMMword/XMMword in
>>>> -     the template.  */
>>>> +     template matching, there's no need to check for YMMword/XMMword nor
>>>> +     AVX512VL in the template.  */
>>>>    cpu = cpu_flags_and (t->cpu_flags, avx512);
>>>>    if (!cpu_flags_all_zero (&cpu)
>>>> -      && !t->cpu_flags.bitfield.cpuavx512vl
>>>>        && !cpu_arch_flags.bitfield.cpuavx512vl)
>>>>      {
>>>>        for (op = 0; op < t->operands; ++op)
>>>
>>> While the code change is correct afaict, I don't think we want it,
>>> which is related to your imo wrong editing of the comment: For one we
>>> check for the flag to be clear (unlike the xmm/ymm checks). And then
>>> the check is there to skip a pointless (in that case) loop. IOW the
>>> loop is only needed for templates which do not explicitly name
>>> AVX512VL as a required feature; templates which do can't validly have
>>> a mix of zmm and one or both of xmm/ymm in any one operand. In fact if
>>> you grep i386-opc.tbl for "AVX512VL.*ZMM" you won't find any single
>>> match.
>>
>> That is why the check isn't needed.
>
> I'm confused now: You explicitly want to make gas perform more work for
> no gain? The flag being clear is the common case, and hence in the
> common case we can avoid getting into this loop.

I guess I managed to contradict myself here: The flag being clear is the
common case, so the check only rarely lets us skip the loop, and hence the
benefit of keeping it is more limited than I first inferred.

Jan

> So yes, as said, your
> code change is functionally correct (as in: not breaking things), but
> it is at the same time making overall behavior worse. Please revert.
>
> Jan
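
For anyone following along, the trade-off discussed above can be sketched
in isolation. This is only an illustrative model, not the gas code: the
struct toy_template type, the OP_* size bits and both scan_* functions are
hypothetical stand-ins, and the operand scan is reduced to "does any one
operand allow ZMM together with XMM and/or YMM". The visited counter is
there purely to show how many loop iterations each variant performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for gas's insn_template and its
   CPU flag bitfields; the real definitions are considerably richer.  */
enum { OP_XMM = 1, OP_YMM = 2, OP_ZMM = 4 };
enum { MAX_OPERANDS = 3 };

struct toy_template
{
  bool avx512;                          /* requires some AVX512 feature */
  bool avx512vl;                        /* explicitly names AVX512VL */
  unsigned int operands;                /* number of operands in use */
  unsigned int op_sizes[MAX_OPERANDS];  /* allowed vector sizes per operand */
};

/* The operand loop: true if any one operand allows ZMM together with
   XMM and/or YMM.  *visited counts the iterations performed.  */
static bool
mixes_zmm_with_narrower (const struct toy_template *t, unsigned int *visited)
{
  assert (t->operands <= MAX_OPERANDS);
  for (unsigned int op = 0; op < t->operands; ++op)
    {
      ++*visited;
      if ((t->op_sizes[op] & OP_ZMM)
          && (t->op_sizes[op] & (OP_XMM | OP_YMM)))
        return true;
    }
  return false;
}

/* Pre-patch shape: the template-side AVX512VL test skips the loop.  */
static bool
scan_with_guard (const struct toy_template *t, bool arch_has_vl,
                 unsigned int *visited)
{
  if (t->avx512 && !t->avx512vl && !arch_has_vl)
    return mixes_zmm_with_narrower (t, visited);
  return false;
}

/* Post-patch shape: same result, but templates naming AVX512VL now
   also run the (for them fruitless) loop.  */
static bool
scan_without_guard (const struct toy_template *t, bool arch_has_vl,
                    unsigned int *visited)
{
  if (t->avx512 && !arch_has_vl)
    return mixes_zmm_with_narrower (t, visited);
  return false;
}
```

Under the assumption stated in the thread (templates naming AVX512VL never
mix ZMM with XMM/YMM in any one operand), both variants return the same
answer for every template. They differ only in the iterations spent on
templates that do name AVX512VL, which is the less common case, so the
saving from the guard is correspondingly small.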