From: Wilco Dijkstra
To: 'GNU C Library'
CC: Adhemerval Zanella
Subject: [PATCH 04/17] Add string vectorized find and detection functions
Date: Sat, 3 Sep 2022 13:13:19 +0000

Hi Adhemerval,

+static inline unsigned int
+__clz (op_t x)
+{
+#if !HAVE_BUILTIN_CLZ
+  unsigned r;
+  op_t i;
+
+  x |= x >> 1;
+  x |= x >> 2;
+  x |= x >> 4;
+  x |= x >> 8;
+  x |= x >> 16;
+# if __WORDSIZE == 64
+  x |= x >> 32;
+  i = x * 0x03F79D71B4CB0A89ull >> 58;
+# else
+  i = x * 0x07C4ACDDU >> 27;
+# endif
+  r = index_access (i);
+  return r ^ (sizeof (op_t) * CHAR_BIT - 1);
+#else
+  if (sizeof (op_t) == sizeof (long int))
+    return __builtin_clzl (x);
+  else
+    return __builtin_clzll (x);
+#endif
+}

This is a really bad idea. Firstly it is incorrect: the width of op_t need
not equal __WORDSIZE due to the odd way it is defined (it can be 64 bits on
32-bit targets). That in itself is problematic since it isn't clear that
using 64-bit operations extensively is efficient on 32-bit targets (using
64-bit multiplies in GMP is different from using 64-bit load/store in
memcpy/memset, which is different from 64-bit logical operations and shifts,
so all of these should be decoupled rather than forced together).

Secondly, there are already several ways to count leading zeros in GLIBC.
One is to use the builtin unconditionally (done in lots of places, e.g. by
math code), another is count_leading_zeros defined in longlong.h. This would
add a third way.
It's not clear how much gain inlining gives over using the libgcc
implementation, but if it is significant then we could provide a generic
inline clzl/clzll that can be used throughout GLIBC (replacing the existing
uses of builtin_clz and count_leading_zeros).

Finally, emulating a full clz is inefficient. If you have already called
find_zero_low then there are at most 4 bits set on a 32-bit LE target, so
you can trivially get the index of the first zero byte via:

  x = x & -x;
  x = (x >> 15) + (x >> 22) + 3 * (x >> 31);

This is many times faster. There may be similar sequences for big-endian,
but you could just do a multiply with a magic word that gives the correct
result without needing a lookup table.

Cheers,
Wilco
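
For reference, a minimal standalone sketch of the shift sequence above for a
32-bit little-endian word. The names zero_byte_mask and
first_zero_byte_index are hypothetical and not part of the patch;
zero_byte_mask is only an assumed stand-in for find_zero_low, using the
usual "has zero byte" expression. Evaluating the sum for the four possible
isolated bits gives 0, 1, 258 and 66051, so the byte index sits in the low
two bits; the final "& 3" is an addition in this sketch rather than
something stated in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for find_zero_low: sets bit 7 of each byte of X
   that is zero.  Bytes above the first zero byte may be flagged
   spuriously, which does not matter because only the lowest set bit is
   used below.  */
static inline uint32_t
zero_byte_mask (uint32_t x)
{
  return (x - 0x01010101u) & ~x & 0x80808080u;
}

/* Index (0..3) of the first zero byte of a little-endian 32-bit word,
   given a nonzero mask as produced by zero_byte_mask.  */
static inline unsigned int
first_zero_byte_index (uint32_t mask)
{
  assert (mask != 0);
  /* Isolate the lowest set bit: one of 0x80, 0x8000, 0x800000,
     0x80000000.  */
  mask &= 0u - mask;
  /* The shift sequence from the message; the low two bits of the sum hold
     the byte index (assumption of this sketch: mask them off).  */
  return ((mask >> 15) + (mask >> 22) + 3 * (mask >> 31)) & 3;
}
```

For example, the word 0x11002233 is stored as bytes 33 22 00 11 on a
little-endian target, so first_zero_byte_index (zero_byte_mask (0x11002233))
yields 2.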