From: Wilco Dijkstra
To: Noah Goldstein
CC: 'GNU C Library', Adhemerval Zanella, "H.J. Lu"
Subject: [PATCH v2] x86-64: Optimize bzero
Date: Tue, 15 Feb 2022 13:38:29 +0000
List-Id: Libc-alpha mailing list

Hi,

> Is there any way it can be setup so that one C impl can cover all the
> arch that want to just leave `__memsetzero` as an alias to `memset`?
> I know they have incompatible interfaces that make it hard but would
> a weak static inline in string.h work?

No that won't work.
A C implementation similar to current string/bzero.c
adds unacceptable overhead (since most targets just implement memset and
will continue to do so). An inline function in string.h would introduce target
hacks in our headers, something we've been working hard to remove over the
years.

The only reasonable option is a target specific optimization in GCC and LLVM
so that memsetzero is only emitted when it is known an optimized GLIBC
implementation exists (similar to mempcpy).

> It's worth noting that between the two `memset` is the cold function
> and `__memsetzero` is the hot one. Based on profiles of GCC11 and
> Python3.7.7 setting zero covers 99%+ cases.

There is no doubt memset of zero is by far the most common. What is in doubt
is whether micro-optimizing is worth it on modern cores. Does Python speed up
by a measurable amount if you use memsetzero?

Cheers,
Wilco
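
For reference, the kind of generic C fallback discussed above could look roughly like the sketch below. It assumes the fallback simply forwards to memset, in the style of string/bzero.c. The name `memsetzero_generic`, its `(void *, size_t)` signature, and the `all_zero` helper are hypothetical and chosen only for illustration; the thread does not fix the real interface of `__memsetzero`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical generic fallback (name and signature are assumptions).
   All it does is forward to memset with a zero fill byte, so on a
   target without a dedicated implementation the caller goes through
   this wrapper in addition to memset itself, unless the compiler
   inlines it or turns the call into a tail call.  */
static void
memsetzero_generic (void *s, size_t n)
{
  memset (s, 0, n);
}

/* Illustrative helper: return 1 if the N bytes at P are all zero.  */
static int
all_zero (const unsigned char *p, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (p[i] != 0)
      return 0;
  return 1;
}
```

A caller would use it exactly like `memset (buf, 0, len)`. The wrapper adds no functionality, which is why having the compiler emit the specialised entry point only on targets known to provide an optimized version avoids the extra layer everywhere else.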