From: Wilco Dijkstra
To: "libc-stable@sourceware.org"
CC: nd
Subject: [2.32 COMMITTED] AArch64: Backport memcpy improvements
Date: Wed, 14 Oct 2020 14:56:38 +0000
List-Id: Libc-stable mailing list

commit 81c5484d93a7768a8acc4cfdc228d925d60cd906
Author: Wilco Dijkstra
Date:   Wed Oct 14 13:56:21 2020 +0100

    AArch64: Use __memcpy_simd on Neoverse N2/V1

    Add CPU detection of Neoverse N2 and Neoverse V1, and select __memcpy_simd as
    the memcpy/memmove ifunc.

    Reviewed-by: Adhemerval Zanella
    (cherry picked from commit e11ed9d2b4558eeacff81557dc9557001af42a6b)

commit 0f8f0ed25c196cfb93edf461aefdad15314ae05c
Author: Wilco Dijkstra
Date:   Fri Aug 28 17:51:40 2020 +0100

    AArch64: Improve backwards memmove performance

    On some microarchitectures performance of the backwards memmove improves if
    the stores use STR with decreasing addresses.  So change the memmove loop
    in memcpy_advsimd.S to use 2x STR rather than STP.

    Reviewed-by: Adhemerval Zanella
    (cherry picked from commit bd394d131c10c9ec22c6424197b79410042eed99)

diff --git a/sysdeps/aarch64/multiarch/memcpy.c b/sysdeps/aarch64/multiarch/memcpy.c
index 7cf5f03..799d60c 100644
--- a/sysdeps/aarch64/multiarch/memcpy.c
+++ b/sysdeps/aarch64/multiarch/memcpy.c
@@ -41,7 +41,8 @@ libc_ifunc (__libc_memcpy,
                      ? __memcpy_falkor
                      : (IS_THUNDERX2 (midr) || IS_THUNDERX2PA (midr)
                        ? __memcpy_thunderx2
-                       : (IS_NEOVERSE_N1 (midr)
+                       : (IS_NEOVERSE_N1 (midr) || IS_NEOVERSE_N2 (midr)
+                          || IS_NEOVERSE_V1 (midr)
                          ? __memcpy_simd
                          : __memcpy_generic)))));
 
diff --git a/sysdeps/aarch64/multiarch/memcpy_advsimd.S b/sysdeps/aarch64/multiarch/memcpy_advsimd.S
index d4ba747..48bb6d7 100644
--- a/sysdeps/aarch64/multiarch/memcpy_advsimd.S
+++ b/sysdeps/aarch64/multiarch/memcpy_advsimd.S
@@ -223,12 +223,13 @@ L(copy_long_backwards):
         b.ls    L(copy64_from_start)
 
 L(loop64_backwards):
-        stp     A_q, B_q, [dstend, -32]
+        str     B_q, [dstend, -16]
+        str     A_q, [dstend, -32]
         ldp     A_q, B_q, [srcend, -96]
-        stp     C_q, D_q, [dstend, -64]
+        str     D_q, [dstend, -48]
+        str     C_q, [dstend, -64]!
         ldp     C_q, D_q, [srcend, -128]
         sub     srcend, srcend, 64
-        sub     dstend, dstend, 64
         subs    count, count, 64
         b.hi    L(loop64_backwards)
 
diff --git a/sysdeps/aarch64/multiarch/memmove.c b/sysdeps/aarch64/multiarch/memmove.c
index ad10aa8..46a4cb3 100644
--- a/sysdeps/aarch64/multiarch/memmove.c
+++ b/sysdeps/aarch64/multiarch/memmove.c
@@ -41,7 +41,8 @@ libc_ifunc (__libc_memmove,
                      ? __memmove_falkor
                      : (IS_THUNDERX2 (midr) || IS_THUNDERX2PA (midr)
                        ? __memmove_thunderx2
-                       : (IS_NEOVERSE_N1 (midr)
+                       : (IS_NEOVERSE_N1 (midr) || IS_NEOVERSE_N2 (midr)
+                          || IS_NEOVERSE_V1 (midr)
                          ? __memmove_simd
                          : __memmove_generic)))));
 
diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.h b/sysdeps/unix/sysv/linux/aarch64/cpu-features.h
index fc68845..00a4d0c 100644
--- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.h
+++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.h
@@ -54,6 +54,10 @@
                               && MIDR_PARTNUM(midr) == 0x000)
 #define IS_NEOVERSE_N1(midr) (MIDR_IMPLEMENTOR(midr) == 'A'           \
                               && MIDR_PARTNUM(midr) == 0xd0c)
+#define IS_NEOVERSE_N2(midr) (MIDR_IMPLEMENTOR(midr) == 'A'           \
+                              && MIDR_PARTNUM(midr) == 0xd49)
+#define IS_NEOVERSE_V1(midr) (MIDR_IMPLEMENTOR(midr) == 'A'           \
+                              && MIDR_PARTNUM(midr) == 0xd40)
 
 #define IS_EMAG(midr) (MIDR_IMPLEMENTOR(midr) == 'P'                  \
                        && MIDR_PARTNUM(midr) == 0x000)
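
For anyone checking the new part numbers by hand, here is a minimal standalone sketch of how the IS_NEOVERSE_* predicates decode a MIDR value. It is not the glibc source. The field extractors below are written from the architectural MIDR_EL1 layout (implementer in bits [31:24], part number in bits [15:4]) and are assumed to be equivalent to the macros in cpu-features.h. The helpers make_midr, selects_memcpy_simd and midr_self_check are hypothetical names used only for illustration, and selects_memcpy_simd models only the last arm of the selector touched by this patch, not the Falkor or ThunderX2 arms.

```c
#include <assert.h>
#include <stdint.h>

/* Field extractors written from the architectural MIDR_EL1 layout.
   Assumed equivalent to the macros in glibc's cpu-features.h.  */
#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 0xffu)
#define MIDR_PARTNUM(midr)     (((midr) >> 4) & 0xfffu)

/* Predicates with the same conditions as in the patch.  */
#define IS_NEOVERSE_N1(midr) (MIDR_IMPLEMENTOR (midr) == 'A' \
                              && MIDR_PARTNUM (midr) == 0xd0c)
#define IS_NEOVERSE_N2(midr) (MIDR_IMPLEMENTOR (midr) == 'A' \
                              && MIDR_PARTNUM (midr) == 0xd49)
#define IS_NEOVERSE_V1(midr) (MIDR_IMPLEMENTOR (midr) == 'A' \
                              && MIDR_PARTNUM (midr) == 0xd40)

/* Hypothetical helper: build a MIDR value from implementer and part
   number, leaving the variant, architecture and revision fields zero.  */
static uint64_t
make_midr (unsigned implementer, unsigned partnum)
{
  return ((uint64_t) (implementer & 0xffu) << 24)
         | ((uint64_t) (partnum & 0xfffu) << 4);
}

/* Hypothetical model of the last arm of the ifunc selector: returns 1
   where the patched code would pick __memcpy_simd, 0 where it would
   fall through to __memcpy_generic.  */
static int
selects_memcpy_simd (uint64_t midr)
{
  return (IS_NEOVERSE_N1 (midr) || IS_NEOVERSE_N2 (midr)
          || IS_NEOVERSE_V1 (midr)) ? 1 : 0;
}

/* Hypothetical self-check covering the three part numbers above.  */
static void
midr_self_check (void)
{
  assert (selects_memcpy_simd (make_midr ('A', 0xd0c)));  /* Neoverse N1 */
  assert (selects_memcpy_simd (make_midr ('A', 0xd49)));  /* Neoverse N2 */
  assert (selects_memcpy_simd (make_midr ('A', 0xd40)));  /* Neoverse V1 */
  assert (!selects_memcpy_simd (make_midr ('P', 0x000))); /* other vendor */
}
```

Calling midr_self_check () from a test program exercises all three part numbers. The variant and revision fields are ignored by these predicates, so any revision of a listed core should take the same path.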