Date: Sat, 25 Jul 2020 09:05:35 +0100
From: Szabolcs Nagy
To: "wangshuo (AF)"
Cc: Wilco.Dijkstra@arm.com, Hushiyuan, "libc-alpha@sourceware.org"
Subject: Re: what is the application scenes of adding optimized Q-register for memcpy
Message-ID: <20200725080509.GM7127@arm.com>
List-Id: Libc-alpha mailing list

The 07/25/2020 09:49, wangshuo (AF) wrote:
> this commit 4a733bf375238a6a595033b5785cea7f27d61307 adds optimized
> Q-register memcpy.

please add "aarch64" to the email subject if it's for aarch64 only.

that commit should not alter the kunpeng920 memcpy. did you change
the ifunc logic to use the new one?

the previous commit increases the entry alignment and this one may
move the memcpy to a slightly different location in libc.so, but
that's about it.

> However, I can not get an ideal results in my environment.
> This is my test:
>
> test suite: libMicro-0.4.0
>
> ./memcpy -E -C 200 -L -S -W -N "memcpy_10" -s 10 -I 10
> ./memcpy -E -C 200 -L -S -W -N "memcpy_1k" -s 1k -I 50
> ./memcpy -E -C 200 -L -S -W -N "memcpy_10k" -s 10k -I 800
> ./memcpy -E -C 200 -L -S -W -N "memcpy_1m" -s 1m -I 500000
> ./memcpy -E -C 200 -L -S -W -N "memcpy_10m" -s 10m -I 5000000
>
>
> hardware platform:
> Kunpeng-920 @ 2600.0000MHz
> L1d cache: 6 MiB
> L1i cache: 6 MiB
> L2 cache: 48 MiB
> L3 cache: 192 MiB
>
>              before this commit(usecs)   after this commit(usecs)
> memcpy_10    0.0065                      0.0065
> memcpy_1k    0.0299                      0.0294
> memcpy_10k   0.2642                      0.2642
> memcpy_1m    27.9040                     27.6480
> memcpy_10m   265.9840                    274.6880
> strlen_10    0.0039                      0.0039
> strlen_1k    0.0571                      0.0450
>
> I was wondering if you could give me some advices about my test results.

3% regression on large copies may be explained by uarch
implementation internals. you can verify that by keeping
the code the same and just adding some nop padding around memcpy.
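
for reference, the relative changes in the quoted table can be
recomputed with a few lines of python. this is only a minimal sketch:
the timings are copied from the table above, and the names `results`
and `percent_change` are made up for illustration, they are not part
of libMicro or glibc.

```python
# Timings in usecs, copied from the quoted libMicro results:
# name -> (before the commit, after the commit).
results = {
    "memcpy_10": (0.0065, 0.0065),
    "memcpy_1k": (0.0299, 0.0294),
    "memcpy_10k": (0.2642, 0.2642),
    "memcpy_1m": (27.9040, 27.6480),
    "memcpy_10m": (265.9840, 274.6880),
    "strlen_10": (0.0039, 0.0039),
    "strlen_1k": (0.0571, 0.0450),
}


def percent_change(before, after):
    """Relative change in percent; positive means slower after the commit."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for name, (before, after) in results.items():
        print(f"{name:<11} {percent_change(before, after):+7.2f}%")
```

on these numbers only memcpy_10m gets slower (about 3.3%), which is
the 3% regression on large copies discussed above. the other memcpy
sizes are unchanged or slightly faster.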