From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR02-AM0-obe.outbound.protection.outlook.com (mail-am0eur02on2057.outbound.protection.outlook.com [40.107.247.57]) by sourceware.org (Postfix) with ESMTPS id AED413858D1E for ; Tue, 19 Mar 2024 15:13:03 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AED413858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org AED413858D1E Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=40.107.247.57 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1710861185; cv=pass; b=T4dlFmPqg8/LHfuqZ+PC0H11qPL1Q5g4xVFRebu6SvJ7iT2daE58+Levy0vhUSIgecQQ/R5MxXqaCQFx4K3G0Iu0oo8m0MuCqYz4ntkviWSHPydN5ZIWs7sLP0HFm9/cZIWlYIbnHr+zx+wH2ZGlh8h9g8urcEG/u3O+HSdyEhw= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1710861185; c=relaxed/simple; bh=XQAque0wJbDyX9fyqIxInSpNOaoJAFc8A6d4pdNF+mM=; h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID: MIME-Version; b=uH4JNKNmVlYG2+IHE92x5w4hABDS6fbCnLtW4py4hdq0Lw3BwnxBQ4m/v50e50E38TFdi+zMNXP8IlI1nWE6wCL1DkvON19qvWHolt3xoCA1a8sGEMJSs3WYymiJ1tbnVAFWeHhv4jBkwm+uZC9wox9R2AKOK4hjI6lHqGErT68= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=YStIHbejMjY8zVIAc8ZQ5MHH1tsqsyZyaTZLx409qYvdFUjR/aE0J806uKFLrOWtoynte/AxlygiVPp6OtYGe2tUIGIgT48l75vb0AyRYHJ3L2e8uF/BApH11P/0pMH2VYRg/8v7bZLQT50yKL1kIUCNf6BiWzBqurLEL8z3l52rfqvyUMnw6P0B1EuKJMxnTyLVeEIBHec/oeXXpGY2sjKcNJGUk0op+HtTUPMEgzOSa5loqE6ZVOmQmMAs1hwpkDl+qcEw0hX+SIevD11t0lAQubkjaF17eSyNxRwLonsUru8uJ5eHU3AeghOwTQlBbkb8hRVzZRHJZh/BDi5pdg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=IPOyHzRHmaAq/lzhNIbS473ehYzklUMEwxTdZbeCC08=; b=b37NWE5BfWGbcMMUZlV4xqClV5WAXVIE21+Rfy/Yc0wpxhDa79SM9NBkoS8uRgX/wapgKyJXBZKLtJ4PPO7thydF4bokpIL5r/Q1tE9Eoiv9EC1vWwgu2SpF0IrFYS0FyASee9ucfAY2T2rdEPyh3Nn59j8CeOBxbZzO+UFWHM6Zb8q4ViB3+qAoJK5MG4bw+DfH5RwRbJZEF2Xp/IERtRzBfYCIRd/TTSzApxIXnxTcxjjMiqOzvSstPJT8EB9d90M2j3lRdeBn7azpmWonTDwUk1RgZX4xDR0qNpMeV+W1jLXn7N6dRORbTykr8Pg9pdGAWxWSwFbIChtnrIKu9w== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=sourceware.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=IPOyHzRHmaAq/lzhNIbS473ehYzklUMEwxTdZbeCC08=; b=B8KlOOJi+NFv4mPvRAU3k1sLzcOC+auMoYZOfS3z5qe6f84aEjEvRAlcMER7ENCwWoOxz5WKZqIpw/RP25Blh3MZbJfLAg19dFbNDNz64rgqf7zoRlkGzVitwhPOPdWkJ/f3Yp0hfg7kSWhqL9m0XJ7L1OacGPPMQ1EcD4fBBq0= Received: from AM6PR08CA0039.eurprd08.prod.outlook.com (2603:10a6:20b:c0::27) by VI0PR08MB10582.eurprd08.prod.outlook.com (2603:10a6:800:20f::18) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7386.25; Tue, 19 Mar 2024 15:12:59 +0000 Received: from AM4PEPF00027A60.eurprd04.prod.outlook.com (2603:10a6:20b:c0:cafe::af) by AM6PR08CA0039.outlook.office365.com (2603:10a6:20b:c0::27) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7386.26 via Frontend Transport; Tue, 19 Mar 2024 15:12:59 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AM4PEPF00027A60.mail.protection.outlook.com (10.167.16.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7409.10 via Frontend Transport; Tue, 19 Mar 2024 15:12:58 +0000 Received: ("Tessian outbound 9b7417e2a8eb:v300"); Tue, 19 Mar 2024 15:12:58 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 0283ece90e62ba0d X-CR-MTA-TID: 64aa7808 Received: from c5ac0c28be9b.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 428D3548-B2AA-4A3E-91F5-959C948F1666.1; Tue, 19 Mar 2024 15:07:41 +0000 Received: from EUR05-AM6-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id c5ac0c28be9b.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Tue, 19 Mar 2024 15:07:41 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=BYopSN/eVeKBfwqGU7HyqWcBdqQWBR8IJlPHlLUagVjyPF/TDXSwKv2mHBufXEDoofbNjhVzoeNA6cc16RoeAdWEOHz0qITEe8r9DLrzF7EEbLiTlFKqGfENz8fc7NssKT4knBjVKTujWlDlyaMGALia5PxXZxXHc5VNqfT7RerALs+QEdiwKU/vp9+DcfZqVuXb4GdKUi+dURTRUSoaFJCYOVglDRi97fxVPv07IqM5TJtNJ5a3Hh+kYFo0D4urq5U5SuKSlQvgOhiPqoOGG9zGU42xset5/dPkXEQlUCLu0jTV9L/AP9JIs0PDDkK0XCNIQafXlXb5+UEZAeS0/A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=IPOyHzRHmaAq/lzhNIbS473ehYzklUMEwxTdZbeCC08=; b=l7JGaZ1GjYxfZbD7KlfzuWbrokKA4vmV1Jf5o95tz06fF90l0+dD7ErTwrURXf0Y2ZHT2iB7a5Ix7PigT9/BACugDlxhfbOF9YS8h6eYc7K7Kse6ZAW8pqcEC17TDmHr3XYHMeBBZ7oaGdlQ7DviGqKWESEzauOTZUdV0JwWMkRLKWmalBzTv0MADr6wY7tEdlolhj+CcV6Wi/JOCrPaCeelXYEDyE0mWfftLZ9sFvhelr1HKuzEZA/Do+WvGpG/pKx3EWiyMz6mUB3B+PMZvZBwlnHg5sbRz205XloJFjY9uHZ+pP0F2OHxN+CiH931Ur5F2dJe1FOJ3Rf4xPnqOg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=IPOyHzRHmaAq/lzhNIbS473ehYzklUMEwxTdZbeCC08=; b=B8KlOOJi+NFv4mPvRAU3k1sLzcOC+auMoYZOfS3z5qe6f84aEjEvRAlcMER7ENCwWoOxz5WKZqIpw/RP25Blh3MZbJfLAg19dFbNDNz64rgqf7zoRlkGzVitwhPOPdWkJ/f3Yp0hfg7kSWhqL9m0XJ7L1OacGPPMQ1EcD4fBBq0= Received: from PAWPR08MB8982.eurprd08.prod.outlook.com (2603:10a6:102:33f::20) by PA6PR08MB10527.eurprd08.prod.outlook.com (2603:10a6:102:3d0::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7386.25; Tue, 19 Mar 2024 15:07:37 +0000 Received: from PAWPR08MB8982.eurprd08.prod.outlook.com ([fe80::8b1b:5f28:5006:ac18]) by PAWPR08MB8982.eurprd08.prod.outlook.com ([fe80::8b1b:5f28:5006:ac18%4]) with mapi id 15.20.7386.023; Tue, 19 Mar 2024 15:07:37 +0000 From: Wilco Dijkstra To: Adhemerval Zanella , Noah Goldstein CC: 'GNU C Library' Subject: [PATCH v4 0/2] Improve wcsstr Thread-Topic: [PATCH v4 0/2] Improve wcsstr Thread-Index: AQHaegO6a4vdSFLjvEGFzVEtuMCLUA== Date: Tue, 19 Mar 2024 15:07:37 +0000 Message-ID: Accept-Language: en-GB, en-US Content-Language: en-GB X-MS-Has-Attach: X-MS-TNEF-Correlator: msip_labels: Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; x-ms-traffictypediagnostic: PAWPR08MB8982:EE_|PA6PR08MB10527:EE_|AM4PEPF00027A60:EE_|VI0PR08MB10582:EE_ X-MS-Office365-Filtering-Correlation-Id: 980e3612-ecd0-4a38-5bfe-08dc48270ff8 x-checkrecipientrouted: true nodisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: 30Z15oKsQsjkwTvphLOi7UZ8vPM9cGvLhPW+H5a5wT/U/RklvWC4fQuXSDnsJw+6ud0rFaqkim4twmtRIQJK/YxUXMxyV+v79bM4rOLWMntwK6MZt8EVhGWvBPU+pTwXwIupxqHA10Hd21XNTeACTqgsq9/1BSlUAQX+1UuuZnLAUfPHZyVt9Lmu/uSdQoR+80GV5v7YFg4cYrjf3NOz7I+7oHmMkb5Ge7hWHaRPFtJ4Gipfwh9oPFLLxoME9stB9BkkVajR33hIWavNJPD4yUvPYwPxFiuS6dTFYSju9OLNjXXb4GXb2oA7f/mjyNDDicjcJUp6Vz4QXu2F2bDN8UBmW/9Lkk15rcenItkSUt1KqCp9Xekvm9rMegtzP/tJc3leOvEhh2AP6KjWjeUyWwqSoUoTgqUrnEY1hqzPFUUx5Tshoa7kc1Ntz69C2ZTGdHuSIZYFqCdiM8yUxGLWxLDLk1lwKZRlcwC4CLpg1Ws4Aiqf1Zia3HjkLVoY/Y7Cgft3XBAbux7gr8qlHx/a5cWWafVMjZ9A5X644Rt/EC3+6y0ECWHQwzgYSYHqt7TENGcw9Tbo02gYwWLgfvjGl8hbkZ1krj7EXKEgRix8NJlHhzOSguK6xebvnWPJRFIYuILdobiCfipdh5am3zMS3SfDIEftpzyR8qtnvnG5/LcxVlrFb6fHCnNY9oWSMCLcbBPqVoSI9mXxxpnC4hBG/Q== X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:PAWPR08MB8982.eurprd08.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230031)(376005)(366007)(1800799015)(38070700009);DIR:OUT;SFP:1101; Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-MS-Exchange-Transport-CrossTenantHeadersStamped: PA6PR08MB10527 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: AM4PEPF00027A60.eurprd04.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 1df08c4f-b32a-44ad-041f-08dc48265097 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: uWmZJ9gZ+XPEgQzdCSvlWDK6Orowwb8UalDab3yhww7n17lxM9YM3ZM/LkYVCKsBZQ1IvlI6QyffM7FMssxZEYR7SOU7RJ96m6SF2/NvprZQFlo3Vt322e0K9nh794+2e6RDF11yUtH2AxNIXMOnI35SPRL6xN9VzDecW0NxkCIVAIqjv5f1oA14zRF7syK5PVMYyRfPEiLDjxUq7SiLvH57A46CTAMHyeZ86hOGflqMs7tK1IZuzbVELslXzZ0tCjGWmY8DzfJUZ44rZrCi/roUPC11M82oMAr38wzeqsiRy0fq4UOCr6RM+ZbS7a/gC3y0HWwt1OdHg8SjG33w1R34Izf9Bn0Y+zdyA/Ru3h6ljv38ibBJmLyiClCXRj2Q5GnhOG1SB9/vN+ZM6rjY0RBPJBkQ7NS2hotqJ5bfWB8XZVtN1/dCK5fbdgtaoJGj2RTgFTb7vNfVDWrRUfFETu/MS8QN7hxdnUSGNUQHgd3yxxrMBmUXEjaVkthdpVzv4CkFMxZeqy54Q+nLzeZWLLwRdr/lHQN06/1Mw5Gl948X2aHSpsZ1t61WihiUwZZjjrYX0DCA/P1RUHJ0oK5VuaPYusQbJlNefeIvT/pe8dL4euA5YHShGBkOT0bXsitY70vjYpwmhAv8Q+4PcBcMJeiAMsXHGrGTHF1rZ67C+idiGnGaBsvGstgCvao8yckLLNItaPU5+arb7qauhC6rbM+sGsN43lWiLIqT/4oCRZ90RX2bO7Vlm1mAOxP4u/Hb X-Forefront-Antispam-Report: CIP:63.35.35.123;CTRY:IE;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:64aa7808-outbound-1.mta.getcheckrecipient.com;PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com;CAT:NONE;SFS:(13230031)(376005)(82310400014)(1800799015)(36860700004);DIR:OUT;SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 19 Mar 2024 15:12:58.8001 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 980e3612-ecd0-4a38-5bfe-08dc48270ff8 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d;Ip=[63.35.35.123];Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: AM4PEPF00027A60.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI0PR08MB10582 X-Spam-Status: No, score=-4.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,FORGED_SPF_HELO,KAM_DMARC_NONE,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE,UNPARSEABLE_RELAY autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Adhemerval,=0A= =0A= > With this fix, and with the removal of the powerpc strcasestr=0A= > optimization [2], it seems that only x86_64 still provides a non=0A= > O(m*n) implementation [3]. Noah already gave a +1, so it would be=0A= > good to have some confirmation that this implementation can really=0A= > show some quadradic behaviour before propose a removal.=0A= =0A= Yes it is a simple brute-force algorithm that checks the whole needle=0A= at a matching character pair (and does so 1 byte at a time after the=0A= first 64 bytes of a needle). Also it never skips ahead and thus can match= =0A= at every haystack position after trying to match all of the needle.=0A= =0A= I added a quick test for this (every different implementation requires=0A= a unique test for its worst-case), and I got:=0A= =0A= "ifuncs": ["basic_strstr", "twoway_strstr", "__strstr_avx512", "__strstr_= sse2_unaligned", "__strstr_generic"],=0A= =0A= {=0A= "len_haystack": 65536,=0A= "len_needle": 1024,=0A= "align_haystack": 0,=0A= "align_needle": 0,=0A= "fail": 1,=0A= "desc": "Difficult bruteforce needle",=0A= "timings": [4.0948e+07, 15094.5, 3.20818e+07, 108558, 10839.2]=0A= },=0A= {=0A= "len_haystack": 1048576,=0A= "len_needle": 4096,=0A= "align_haystack": 0,=0A= "align_needle": 0,=0A= "fail": 1,=0A= "desc": "Difficult bruteforce needle",=0A= "timings": [2.69767e+09, 100797, 2.08535e+09, 495706, 82666.9]=0A= }=0A= =0A= So a 4x larger needle and 16x larger haystack gives a clear 65x slowdown on= =0A= both basic_strstr and __strstr_avx512.=0A= =0A= Cheers,=0A= Wilco=0A=