From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <8ed3b7a2-8cba-6428-1c01-5b6c28ca4a89@suse.com>
Date: Tue, 14 Nov 2023 11:50:35 +0100
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.15.1
Subject: Re: [PATCH 7/8] Support APX NDD optimized encoding.
Content-Language: en-US
To: "Hu, Lin1", "Cui, Lili"
Cc: "Lu, Hongjiu", "ccoutant@gmail.com", "binutils@sourceware.org"
References: <20231102112911.2372810-1-lili.cui@intel.com> <20231102112911.2372810-8-lili.cui@intel.com>
From: Jan Beulich
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
MIME-Version: 1.0
X-Spam-Status: No, score=-3034.3 required=5.0

On 14.11.2023 03:28, Hu, Lin1 wrote:
>> -----Original Message-----
>> From: Jan Beulich
>>
>> On 10.11.2023 06:43, Hu, Lin1 wrote:
>>>> On 02.11.2023 12:29, Cui, Lili wrote:
>>>>> +  unsigned int src2 = (i.operands > 3) ? i.operands - 3 : 0;
>>>>> +
>>>>> +  if (i.types[src1].bitfield.class == Reg
>>>>> +      && i.op[src1].regs == i.op[dest].regs)
>>>>> +    readonly_var = src2;
>>>>
>>>> As can be seen in the testcase, this also results in ADCX/ADOX to be
>>>> converted to non-ND EVEX forms, i.e. even when that's not a win at all.
>>>> We shouldn't change what the user has written when the encoding
>>>> doesn't actually improve. (Or else, but I'd be hesitant to accept
>>>> that, at the very least the effect would need pointing out in the
>>>> description or even a code comment, so that later on it is possible
>>>> to figure out whether that was intentional or an oversight.)
>>>>
>>>> This is where my template ordering remark in reply to patch 5 comes into play:
>>>> Whether invoking re-parse is okay would further need to depend on
>>>> whether an alternative (earlier) template actually allows
>>>> REX2 encoding (same base-opcode could be one of the criteria for how
>>>> far to look back through earlier templates; an option might also be
>>>> to put the 3-operand templates first, so that looking backwards
>>>> wouldn't be necessary in the first place). This would then likely
>>>> also address one of the forward looking concerns I've raised above.
>>>>
>>>
>>> Indeed, adcx's legacy insn can't support rex2.
>>>
>>> For my problem, I prefer to re-order templates order, because, I hadn't
>>> thought of a way to simply move t to the farthest same base_opcode template
>>> for the moment. The following is a tentative scenario: the order will be ndd evex
>>> - rex2 - evex.
>>
>> Yes, this matches my understanding / expectation.
>>
>>> And I will need a tmp_variable to avoid the insn doesn't match the rex2, let me
>>> backtrack the match's result and the value of i.
>>
>> This, however, I'm not convinced of. I'd rather see this vaguely in line with
>> 58bceb182740 ("x86: prefer VEX encodings over EVEX ones when
>> possible"): Do another full matching round with the removed operand, arranging
>> for "internal error" to be raised in case that fails. Your approach would, I think,
>> result in silent bad code generation in case something went wrong. Thing is - you
>> don't even need to advance (or backtrack) t in that case
>>
>
> I tried to reorder the templates and modify the code as follows:
>
> @ -7728,6 +7765,40 @@ match_template (char mnem_suffix)
>            i.memshift = memshift;
>          }
>
> +      /* If we can optimize a NDD insn to non-NDD insn, like
> +         add %r16, %r8, %r8 -> add %r16, %r8,
> +         add %r8, %r16, %r8 -> add %r16, %r8, then rematch template.
> +         Note that the semantics have not been changed.  */
> +      if (optimize
> +          && !i.no_optimize
> +          && i.vec_encoding != vex_encoding_evex
> +          && t + 1 < current_templates->end
> +          && !t[1].opcode_modifier.evex)
> +        {
> +          unsigned int readonly_var = convert_NDD_to_REX2 (t);
> +          if (readonly_var != ~0)
> +            {
> +              if (!check_EgprOperands (t + 1))
> +                {
> +                  specific_error = progress (internal_error);
> +                  continue;
> +                }
> +              ++i.operands;
> +              ++i.reg_operands;

Did you mean to decrement rather than increment these? We're trying to go
from 3 to 2 operands, after all.

> +              ++i.tm.operands;

Why is this needed? Aren't we ahead of filling i.tm here?

> +
> +              if (readonly_var == 1)
> +                swap_2_operands (0, 1);
> +            }
> +        }
>
> convert_NDD_to_REX2 return readonly_var now.
> check_EgprOperands aims to exclude some insns like adcx and adox. Because their opcode_space is legacy-map2 can't support rex2.

Good. Looking forward to seeing the full change.

> And I need some modifications in tc-i386.c after reorder i386-opc.tbl.
>
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index 7a86aff1828..d98950c7dfd 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -14401,7 +14401,9 @@ static bool check_register (const reg_entry *r)
>
>    if (r->reg_flags & RegRex2)
>      {
> -      if (is_evex_encoding (current_templates->start))
> +      if (is_evex_encoding (current_templates->start)
> +          && ((current_templates->start + 1 >= current_templates->end)
> +              || (is_evex_encoding (current_templates->start + 1))))
>          i.vec_encoding = vex_encoding_evex;
>
>        if (!cpu_arch_flags.bitfield.cpuapx_f
>
> What's your opinion?

See my comments to Lili on the original code here (which you modify
further). There cannot be a dependency on current_templates here, imo.

Lili - the fact that Lin needs the modification above actually looks to
support my view on this.

Jan
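
For illustration only, the operand bookkeeping discussed above (going from 3
to 2 operands means decrementing, not incrementing, the counts) can be
sketched with a toy model. This is not gas code: struct toy_insn,
toy_readonly_source() and toy_drop_ndd_dest() are hypothetical names made up
for this sketch, registers are plain strings in AT&T operand order, and
swapping the two sources assumes a commutative operation such as ADD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NO_CONVERSION (~0u)

/* Hypothetical, much simplified stand-in for the assembler's insn state.  */
struct toy_insn
{
  unsigned int operands;      /* Operand count, AT&T order.  */
  unsigned int reg_operands;  /* How many of them are registers.  */
  const char *reg[3];         /* reg[operands - 1] is the destination.  */
};

/* Return the index of the source which is not the destination, or
   NO_CONVERSION if the destination matches neither source.  */
static unsigned int
toy_readonly_source (const struct toy_insn *insn)
{
  if (insn->operands != 3)
    return NO_CONVERSION;

  const char *dest = insn->reg[2];

  if (strcmp (insn->reg[1], dest) == 0)
    return 0;                 /* add %r16, %r8, %r8 */
  if (strcmp (insn->reg[0], dest) == 0)
    return 1;                 /* add %r8, %r16, %r8 */
  return NO_CONVERSION;
}

/* Turn a 3-operand (NDD) form into the 2-operand form when possible.
   Swapping the sources is assumed valid (commutative insn).  */
static bool
toy_drop_ndd_dest (struct toy_insn *insn)
{
  unsigned int readonly_var = toy_readonly_source (insn);

  if (readonly_var == NO_CONVERSION)
    return false;

  if (readonly_var == 1)
    {
      /* Move the read-only source into slot 0.  */
      const char *tmp = insn->reg[0];
      insn->reg[0] = insn->reg[1];
      insn->reg[1] = tmp;
    }

  /* One operand goes away, so both counts shrink.  */
  assert (insn->reg_operands > 0);
  insn->reg[2] = NULL;
  --insn->operands;
  --insn->reg_operands;
  return true;
}
```

In the real assembler a full template re-match would follow the operand
removal, with an internal error raised if that match fails; the sketch leaves
that part out.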