From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <68619a61-a0f4-a851-77c6-e324eca30cb4@suse.com>
Date: Mon, 23 Oct 2023 09:23:57 +0200
Subject: Re: [PATCH 5/8] Support APX NDD optimized encoding.
To: "Hu, Lin1"
Cc: "Lu, Hongjiu", "binutils@sourceware.org", "Cui, Lili"
References: <20230919152527.497773-1-lili.cui@intel.com> <20230919152527.497773-6-lili.cui@intel.com>
From: Jan Beulich
Content-Type: text/plain; charset=UTF-8

On 23.10.2023 04:57, Hu, Lin1 wrote:
> Thanks for your reviews, I have responded to your comments. The new patch will be attached to a subsequent email.

Up-front remark: Your way of replying makes it hard to spot what your
responses actually are. Many mail programs allow you to properly prefix
(by convention using '>') responded-to mail contents, it's usually merely
a matter of enabling that mode of operation.

> -----Original Message-----
> From: Jan Beulich
> Sent: Thursday, September 28, 2023 5:30 PM
>
> On 19.09.2023 17:25, Cui, Lili wrote:
>> --- a/gas/config/tc-i386.c
>> +++ b/gas/config/tc-i386.c
>> @@ -7091,6 +7091,46 @@ check_EgprOperands (const insn_template *t)
>>    return 0;
>>  }
>>
>> +/* Optimize APX NDD insns to non-NDD insns. */
>> +
>> +static int
>
> "bool" please when the function merely returns a yes/no indicator.
>
> * Have modified.
>
>> +optimize_NDD_to_nonNDD (const insn_template *t) {
>> +  if (t->opcode_modifier.vexvvvv
>> +      && t->opcode_space == SPACE_EVEXMAP4
>> +      && i.reg_operands >= 2
>
> See the remark near the bottom of the changes to this file: This condition is likely insufficient, as
> - further insns allowing ND may not be treated this way (CCMPscc,
>   CTESTscc, and one of the CFCMOVcc forms at the very least),
> - {nf} uses will want excluding, as it would be merely a waste of
>   time to try to re-match with fewer operands.
>
> * CCMPSCC and CTESTSCC’s vexvvvv will be false. I think one of the CFCMOVCC forms is same.

By "is same" do you mean "fits the optimization pattern here"?

>> @@ -7562,6 +7602,15 @@ match_template (char mnem_suffix)
>>  	 slip through to break.
>>  	 */
>>  	}
>>
>> +  /* If we can optimize a NDD insn to non-NDD insn, like
>> +     add %r16, %r8, %r8 -> add %r16, %r8, then rematch template. */
>> +  if (optimize_NDD_to_nonNDD (t))
>
> I don't think such an optimization should be done without any form of -O.
>
> As to the function name, maybe better optimize_NDD_to_REX2()?
>
> * Refer to the optimization of VOP, temporarily set to O1 will be optimized. If we use 32bit register, some instructions will be optimized from NDD to rex or legacy. Like cmovg 0x90909090(%eax),%edx,%edx, imul %rdx,%rax,%rdx in our test.

What do you mean by saying "temporarily"?

>> --- /dev/null
>> +++ b/gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
>> @@ -0,0 +1,115 @@
>> +# Check 64bit APX NDD instructions with optimized encoding
>> +
>> + .allow_index_reg
>> + .text
>> +_start:
>> +inc %ax,%ax
>> +inc %eax,%eax
>> +inc %rax,%rax
>> +inc %r16,%r16
>> +add %r31b,%r8b,%r8b
>> +add %r31b,%r8b,%r31b
>> +addb %r31b,%r8b,%r8b
>> +add %r31,%r8,%r8
>> +addq %r31,%r8,%r8
>> +add %r31d,%r8d,%r8d
>> +addl %r31d,%r8d,%r8d
>> +add %r31w,%r8w,%r8w
>> +addw %r31w,%r8w,%r8w
>> +{store} add %r31,%r8,%r8
>> +{load} add %r31,%r8,%r8
>> +add %r31,(%r8),%r31
>> +add (%r31),%r8,%r8
>> +add $0x12344433,%r15,%r15
>> +add $0xfffffffff4332211,%r8,%r8
>> +dec %r17,%r17
>> +not %r17,%r17
>> +neg %r17,%r17
>> +sub %r15,%r17,%r17
>> +sub %r15d,(%r8),%r15d
>> +sub (%r15,%rax,1),%r16,%r16
>> +sub $0x1234,%r30,%r30
>> +sbb %r15,%r17,%r17
>> +sbb %r15,(%r8),%r15
>> +sbb (%r15,%rax,1),%r16,%r16
>> +sbb $0x1234,%r30,%r30
>> +adc %r15,%r17,%r17
>> +adc %r15d,(%r8),%r15d
>> +adc (%r15,%rax,1),%r16,%r16
>> +adc $0x1234,%r30,%r30
>> +or %r15,%r17,%r17
>> +or %r15d,(%r8),%r15d
>> +or (%r15,%rax,1),%r16,%r16
>> +or $0x1234,%r30,%r30
>> +xor %r15,%r17,%r17
>> +xor %r15d,(%r8),%r15d
>> +xor (%r15,%rax,1),%r16,%r16
>> +xor $0x1234,%r30,%r30
>> +and %r15,%r17,%r17
>> +and %r15d,(%r8),%r15d
>> +and (%r15,%rax,1),%r16,%r16
>> +and $0x1234,%r30,%r30
>> +and $0x1234,%r30
>> +ror %r31,%r31
>> +rorb %r31b,%r31b
>
> Please be consistent with omitting (or having) suffixes (further up you simply test both variants, but that's not the case throughout ...
>
>> +ror $0x2,%r12,%r12
>> +rol %r31,%r31
>> +rolb %r31b,%r31b
>> +rol $0x2,%r12,%r12
>> +rcr %r31,%r31
>> +rcrb %r31b,%r31b
>> +rcr $0x2,%r12b,%r12b
>> +rcr $0x2,%r12,%r12
>> +rcl %r31,%r31
>> +rclb %r31b,%r31b
>> +rcl $0x2,%r12b,%r12b
>> +rcl $0x2,%r12,%r12
>> +shl %r31,%r31
>> +shlb %r31b,%r31b
>> +shl $0x2,%r12b,%r12b
>> +shl $0x2,%r12,%r12
>> +sar %r31,%r31
>> +sarb %r31b,%r31b
>> +sar $0x2,%r12b,%r12b
>> +sar $0x2,%r12,%r12
>> +shl %r31,%r31
>> +shlb %r31b,%r31b
>> +shl $0x2,%r12b,%r12b
>> +shl $0x2,%r12,%r12
>> +shr %r31,%r31
>> +shrb %r31b,%r31b
>
> ... here.
>
> * Added suffixes to the command names, I don't understand very well what is suggested inside the parentheses, I just aligned most of the insns' test items.

In the parentheses I merely tried to make clearer which aspects I'd like
considered for the result to be consistent.

Jan