From: Wilco Dijkstra
To: Richard Sandiford
CC: GCC Patches
Subject: Re: [PATCH][AArch64] Improve immediate expansion [PR106583]
Date: Thu, 6 Oct 2022 17:29:03 +0000

Hi Richard,

> Did you consider handling the case where the movks aren't for
> consecutive bitranges?  E.g. the patch handles:

> but it looks like it would be fairly easy to extend it to:
>
>   0x1234cccc5678cccc

Yes, with a more general search loop we can get that case too -
it doesn't trigger much though. The code that checks for this is
now refactored into a new function. Given there are now many
more calls to aarch64_bitmask_imm, I added a streamlined internal
entry point that assumes the input is 64-bit.

Cheers,
Wilco

[PATCH v2][AArch64] Improve immediate expansion [PR106583]

Improve expansion of immediates which can be created from a
bitmask immediate and 2 MOVKs. Simplify, refactor and improve
efficiency of bitmask checks. This reduces the number of 4-instruction
immediates in SPECINT/FP by 10-15%.

Passes regress, OK for commit?

gcc/ChangeLog:

        PR target/106583
        * config/aarch64/aarch64.cc (aarch64_internal_mov_immediate):
        Add support for a bitmask immediate with 2 MOVKs.
        (aarch64_check_bitmask): New function after refactoring.
        (aarch64_replicate_bitmask_imm): Remove function, merge into...
        (aarch64_bitmask_imm): Simplify replication of small modes.
        Split function into 64-bit only version for efficiency.

gcc/testsuite:
        PR target/106583
        * gcc.target/aarch64/pr106583.c: Add new test.

---

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 926e81f028c82aac9a5fecc18f921f84399c24ae..b2d9c7380975028131d0fe731a97b3909874b87b 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -306,6 +306,7 @@ static machine_mode aarch64_simd_container_mode (scalar_mode, poly_int64);
 static bool aarch64_print_address_internal (FILE*, machine_mode, rtx,
                                             aarch64_addr_query_type);
 static HOST_WIDE_INT aarch64_clamp_to_uimm12_shift (HOST_WIDE_INT val);
+static bool aarch64_bitmask_imm (unsigned HOST_WIDE_INT);
 
 /* The processor for which instructions should be scheduled.  */
 enum aarch64_processor aarch64_tune = cortexa53;
@@ -5502,6 +5503,30 @@ aarch64_output_sve_vector_inc_dec (const char *operands, rtx x)
                                              factor, nelts_per_vq);
 }
 
+/* Return true if the immediate VAL can be a bitfield immediate
+   by changing the given MASK bits in VAL to zeroes, ones or bits
+   from the other half of VAL.  Return the new immediate in VAL2.  */
+static inline bool
+aarch64_check_bitmask (unsigned HOST_WIDE_INT val,
+                       unsigned HOST_WIDE_INT &val2,
+                       unsigned HOST_WIDE_INT mask)
+{
+  val2 = val & ~mask;
+  if (val2 != val && aarch64_bitmask_imm (val2))
+    return true;
+  val2 = val | mask;
+  if (val2 != val && aarch64_bitmask_imm (val2))
+    return true;
+  val = val & ~mask;
+  val2 = val | (((val >> 32) | (val << 32)) & mask);
+  if (val2 != val && aarch64_bitmask_imm (val2))
+    return true;
+  val2 = val | (((val >> 16) | (val << 48)) & mask);
+  if (val2 != val && aarch64_bitmask_imm (val2))
+    return true;
+  return false;
+}
+
 static int
 aarch64_internal_mov_immediate (rtx dest, rtx imm, bool generate,
                                 scalar_int_mode mode)
@@ -5568,36 +5593,43 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool generate,
   one_match = ((~val & mask) == 0) + ((~val & (mask << 16)) == 0) +
     ((~val & (mask << 32)) == 0) + ((~val & (mask << 48)) == 0);
 
-  if (zero_match != 2 && one_match != 2)
+  if (zero_match < 2 && one_match < 2)
     {
       /* Try emitting a bitmask immediate with a movk replacing 16 bits.
          For a 64-bit bitmask try whether changing 16 bits to all ones or
          zeroes creates a valid bitmask.  To check any repeated bitmask,
          try using 16 bits from the other 32-bit half of val.  */
 
-      for (i = 0; i < 64; i += 16, mask <<= 16)
-        {
-          val2 = val & ~mask;
-          if (val2 != val && aarch64_bitmask_imm (val2, mode))
-            break;
-          val2 = val | mask;
-          if (val2 != val && aarch64_bitmask_imm (val2, mode))
-            break;
-          val2 = val2 & ~mask;
-          val2 = val2 | (((val2 >> 32) | (val2 << 32)) & mask);
-          if (val2 != val && aarch64_bitmask_imm (val2, mode))
-            break;
-        }
-      if (i != 64)
-        {
-          if (generate)
+      for (i = 0; i < 64; i += 16)
+        if (aarch64_check_bitmask (val, val2, mask << i))
+          {
+            if (generate)
+              {
+                emit_insn (gen_rtx_SET (dest, GEN_INT (val2)));
+                emit_insn (gen_insv_immdi (dest, GEN_INT (i),
+                                           GEN_INT ((val >> i) & 0xffff)));
+              }
+            return 2;
+          }
+    }
+
+  /* Try a bitmask plus 2 movk to generate the immediate in 3 instructions.  */
+  if (zero_match + one_match == 0)
+    {
+      for (i = 0; i < 48; i += 16)
+        for (int j = i + 16; j < 64; j += 16)
+          if (aarch64_check_bitmask (val, val2, (mask << i) | (mask << j)))
             {
-              emit_insn (gen_rtx_SET (dest, GEN_INT (val2)));
-              emit_insn (gen_insv_immdi (dest, GEN_INT (i),
-                                         GEN_INT ((val >> i) & 0xffff)));
+              if (generate)
+                {
+                  emit_insn (gen_rtx_SET (dest, GEN_INT (val2)));
+                  emit_insn (gen_insv_immdi (dest, GEN_INT (i),
+                                             GEN_INT ((val >> i) & 0xffff)));
+                  emit_insn (gen_insv_immdi (dest, GEN_INT (j),
+                                             GEN_INT ((val >> j) & 0xffff)));
+                }
+              return 3;
             }
-          return 2;
-        }
     }
 
   /* Generate 2-4 instructions, skipping 16 bits of all zeroes or ones which
@@ -10168,22 +10200,6 @@ aarch64_movk_shift (const wide_int_ref &and_val,
   return -1;
 }
 
-/* VAL is a value with the inner mode of MODE.  Replicate it to fill a
-   64-bit (DImode) integer.  */
-
-static unsigned HOST_WIDE_INT
-aarch64_replicate_bitmask_imm (unsigned HOST_WIDE_INT val, machine_mode mode)
-{
-  unsigned int size = GET_MODE_UNIT_PRECISION (mode);
-  while (size < 64)
-    {
-      val &= (HOST_WIDE_INT_1U << size) - 1;
-      val |= val << size;
-      size *= 2;
-    }
-  return val;
-}
-
 /* Multipliers for repeating bitmasks of width 32, 16, 8, 4, and 2.  */
 
 static const unsigned HOST_WIDE_INT bitmask_imm_mul[] =
@@ -10196,26 +10212,42 @@ static const unsigned HOST_WIDE_INT bitmask_imm_mul[] =
   };
 
 
-/* Return true if val is a valid bitmask immediate.  */
-
+/* Return true if val is a valid bitmask immediate for any mode.  */
 bool
 aarch64_bitmask_imm (HOST_WIDE_INT val_in, machine_mode mode)
 {
-  unsigned HOST_WIDE_INT val, tmp, mask, first_one, next_one;
+  if (mode == DImode)
+    return aarch64_bitmask_imm (val_in);
+
+  unsigned HOST_WIDE_INT val = val_in;
+
+  if (mode == SImode)
+    return aarch64_bitmask_imm ((val & 0xffffffff) | (val << 32));
+
+  /* Replicate small immediates to fit 64 bits.  */
+  int size = GET_MODE_UNIT_PRECISION (mode);
+  val &= (HOST_WIDE_INT_1U << size) - 1;
+  val *= bitmask_imm_mul[__builtin_clz (size) - 26];
+
+  return aarch64_bitmask_imm (val);
+}
+
+
+/* Return true if 64-bit val is a valid bitmask immediate.  */
+
+static bool
+aarch64_bitmask_imm (unsigned HOST_WIDE_INT val)
+{
+  unsigned HOST_WIDE_INT tmp, mask, first_one, next_one;
   int bits;
 
   /* Check for a single sequence of one bits and return quickly if so.
      The special cases of all ones and all zeroes returns false.  */
-  val = aarch64_replicate_bitmask_imm (val_in, mode);
   tmp = val + (val & -val);
 
   if (tmp == (tmp & -tmp))
     return (val + 1) > 1;
 
-  /* Replicate 32-bit immediates so we can treat them as 64-bit.  */
-  if (mode == SImode)
-    val = (val << 32) | (val & 0xffffffff);
-
   /* Invert if the immediate doesn't start with a zero bit - this means we
      only need to search for sequences of one bits.  */
   if (val & 1)
diff --git a/gcc/testsuite/gcc.target/aarch64/pr106583.c b/gcc/testsuite/gcc.target/aarch64/pr106583.c
new file mode 100644
index 0000000000000000000000000000000000000000..0f931580817d78dc1cc58f03b251bd21bec71f59
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr106583.c
@@ -0,0 +1,41 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 --save-temps" } */
+
+long f1 (void)
+{
+  return 0x7efefefefefefeff;
+}
+
+long f2 (void)
+{
+  return 0x12345678aaaaaaaa;
+}
+
+long f3 (void)
+{
+  return 0x1234cccccccc5678;
+}
+
+long f4 (void)
+{
+  return 0x7777123456787777;
+}
+
+long f5 (void)
+{
+  return 0x5555555512345678;
+}
+
+long f6 (void)
+{
+  return 0x1234bbbb5678bbbb;
+}
+
+long f7 (void)
+{
+  return 0x4444123444445678;
+}
+
+
+/* { dg-final { scan-assembler-times {\tmovk\t} 14 } } */
+/* { dg-final { scan-assembler-times {\tmov\t} 7 } } */
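
For anyone who wants to try the search outside the compiler, here is a
minimal standalone sketch of the bitmask + MOVK search from the patch. It
is an illustration under stated assumptions, not the GCC code:
is_bitmask_imm is a hypothetical brute-force stand-in for
aarch64_bitmask_imm (it checks for a rotated run of ones replicated across
the 64 bits), and expand_sketch and the Expansion struct are made-up names.
The sketch skips the zero_match/one_match guards and every other expansion
path in aarch64_internal_mov_immediate; only check_bitmask mirrors
aarch64_check_bitmask step for step.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Hypothetical brute-force stand-in for aarch64_bitmask_imm (NOT GCC's
// algorithm): VAL qualifies if it is an element of 2..64 bits holding one
// circular run of ones, replicated to fill 64 bits.
static bool is_bitmask_imm (uint64_t val)
{
  if (val == 0 || val == UINT64_MAX)
    return false;
  for (unsigned size = 2; size <= 64; size *= 2)
    {
      uint64_t emask = size == 64 ? UINT64_MAX : (uint64_t (1) << size) - 1;
      uint64_t elt = val & emask;
      bool replicated = true;
      for (unsigned s = size; s < 64; s += size)
        if (((val >> s) & emask) != elt)
          replicated = false;
      if (!replicated)
        continue;
      // A single circular run of ones has exactly two bit transitions.
      uint64_t rot = ((elt >> 1) | (elt << (size - 1))) & emask;
      if (std::bitset<64> (elt ^ rot).count () == 2)
        return true;
    }
  return false;
}

// Mirrors aarch64_check_bitmask from the patch: try replacing the MASK bits
// of VAL with zeroes, ones, or bits rotated in from elsewhere in VAL.
static bool check_bitmask (uint64_t val, uint64_t &val2, uint64_t mask)
{
  val2 = val & ~mask;
  if (val2 != val && is_bitmask_imm (val2))
    return true;
  val2 = val | mask;
  if (val2 != val && is_bitmask_imm (val2))
    return true;
  val = val & ~mask;
  val2 = val | (((val >> 32) | (val << 32)) & mask);
  if (val2 != val && is_bitmask_imm (val2))
    return true;
  val2 = val | (((val >> 16) | (val << 48)) & mask);
  if (val2 != val && is_bitmask_imm (val2))
    return true;
  return false;
}

// Hypothetical result type: instruction count (0 = not found by this
// sketch), the bitmask immediate to MOV, and the MOVK bit offsets (-1 = unused).
struct Expansion { int insns; uint64_t base; int movk1; int movk2; };

// Simplified search: MOV alone, MOV + 1 MOVK, then MOV + 2 MOVKs.
static Expansion expand_sketch (uint64_t val)
{
  const uint64_t mask = 0xffff;
  uint64_t base;
  if (is_bitmask_imm (val))
    return {1, val, -1, -1};
  for (int i = 0; i < 64; i += 16)
    if (check_bitmask (val, base, mask << i))
      return {2, base, i, -1};
  for (int i = 0; i < 48; i += 16)
    for (int j = i + 16; j < 64; j += 16)
      if (check_bitmask (val, base, (mask << i) | (mask << j)))
        return {3, base, i, j};
  return {0, 0, -1, -1};
}
```

With this sketch, expand_sketch (0x1234cccc5678cccc) finds the base
0xcccccccccccccccc with MOVKs at bit offsets 16 and 48, which is the
non-consecutive case discussed above.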