Date: Wed, 1 Feb 2023 17:02:57 -0300
Subject: Re: [PATCH v11 05/29] string: Improve generic strnlen with memchr
To: Noah Goldstein
Cc: libc-alpha@sourceware.org, Richard Henderson, Jeff Law, Xi Ruoyao
References: <20230201170406.303978-1-adhemerval.zanella@linaro.org> <20230201170406.303978-6-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto
Organization: Linaro

On 01/02/23 16:39, Noah Goldstein wrote:
> On Wed, Feb 1, 2023 at 11:04 AM Adhemerval Zanella wrote:
>>
>> It also cleanups the multiple inclusion by leaving the ifunc
>> implementation to undef the weak_alias and libc_hidden_def.
>>
>> Co-authored-by: Richard Henderson
>> ---
>>  string/strnlen.c                              | 137 +-----------------
>>  sysdeps/i386/i686/multiarch/strnlen-c.c       |  14 +-
>>  .../power4/multiarch/strnlen-ppc32.c          |  14 +-
>>  sysdeps/s390/strnlen-c.c                      |  14 +-
>>  4 files changed, 27 insertions(+), 152 deletions(-)
>>
>> diff --git a/string/strnlen.c b/string/strnlen.c
>> index 6ff294eab1..dc23354ec8 100644
>> --- a/string/strnlen.c
>> +++ b/string/strnlen.c
>> @@ -1,10 +1,6 @@
>>  /* Find the length of STRING, but scan at most MAXLEN characters.
>>     Copyright (C) 1991-2023 Free Software Foundation, Inc.
>>
>> -   Based on strlen written by Torbjorn Granlund (tege@sics.se),
>> -   with help from Dan Sahlin (dan@sics.se);
>> -   commentary by Jim Blandy (jimb@ai.mit.edu).
>> -
>>     The GNU C Library is free software; you can redistribute it and/or
>>     modify it under the terms of the GNU Lesser General Public License as
>>     published by the Free Software Foundation; either version 2.1 of the
>> @@ -20,7 +16,6 @@
>>     not, see . */
>>
>>  #include
>> -#include
>>
>>  /* Find the length of S, but scan at most MAXLEN characters.  If no
>>     '\0' terminator is found in that many characters, return MAXLEN.  */
>> @@ -32,134 +27,12 @@
>>  size_t
>>  __strnlen (const char *str, size_t maxlen)
>>  {
>> -  const char *char_ptr, *end_ptr = str + maxlen;
>> -  const unsigned long int *longword_ptr;
>> -  unsigned long int longword, himagic, lomagic;
>> -
>> -  if (maxlen == 0)
>> -    return 0;
>> -
>> -  if (__glibc_unlikely (end_ptr < str))
>> -    end_ptr = (const char *) ~0UL;
>> -
>> -  /* Handle the first few characters by reading one character at a time.
>> -     Do this until CHAR_PTR is aligned on a longword boundary.  */
>> -  for (char_ptr = str; ((unsigned long int) char_ptr
>> -                        & (sizeof (longword) - 1)) != 0;
>> -       ++char_ptr)
>> -    if (*char_ptr == '\0')
>> -      {
>> -        if (char_ptr > end_ptr)
>> -          char_ptr = end_ptr;
>> -        return char_ptr - str;
>> -      }
>> -
>> -  /* All these elucidatory comments refer to 4-byte longwords,
>> -     but the theory applies equally well to 8-byte longwords.  */
>> -
>> -  longword_ptr = (unsigned long int *) char_ptr;
>> -
>> -  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
>> -     the "holes."  Note that there is a hole just to the left of
>> -     each byte, with an extra at the end:
>> -
>> -     bits:  01111110 11111110 11111110 11111111
>> -     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD
>> -
>> -     The 1-bits make sure that carries propagate to the next 0-bit.
>> -     The 0-bits provide holes for carries to fall into.  */
>> -  himagic = 0x80808080L;
>> -  lomagic = 0x01010101L;
>> -  if (sizeof (longword) > 4)
>> -    {
>> -      /* 64-bit version of the magic.  */
>> -      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
>> -      himagic = ((himagic << 16) << 16) | himagic;
>> -      lomagic = ((lomagic << 16) << 16) | lomagic;
>> -    }
>> -  if (sizeof (longword) > 8)
>> -    abort ();
>> -
>> -  /* Instead of the traditional loop which tests each character,
>> -     we will test a longword at a time.  The tricky part is testing
>> -     if *any of the four* bytes in the longword in question are zero.  */
>> -  while (longword_ptr < (unsigned long int *) end_ptr)
>> -    {
>> -      /* We tentatively exit the loop if adding MAGIC_BITS to
>> -         LONGWORD fails to change any of the hole bits of LONGWORD.
>> -
>> -         1) Is this safe?  Will it catch all the zero bytes?
>> -         Suppose there is a byte with all zeros.  Any carry bits
>> -         propagating from its left will fall into the hole at its
>> -         least significant bit and stop.  Since there will be no
>> -         carry from its most significant bit, the LSB of the
>> -         byte to the left will be unchanged, and the zero will be
>> -         detected.
>> -
>> -         2) Is this worthwhile?  Will it ignore everything except
>> -         zero bytes?  Suppose every byte of LONGWORD has a bit set
>> -         somewhere.  There will be a carry into bit 8.  If bit 8
>> -         is set, this will carry into bit 16.  If bit 8 is clear,
>> -         one of bits 9-15 must be set, so there will be a carry
>> -         into bit 16.  Similarly, there will be a carry into bit
>> -         24.  If one of bits 24-30 is set, there will be a carry
>> -         into bit 31, so all of the hole bits will be changed.
>> -
>> -         The one misfire occurs when bits 24-30 are clear and bit
>> -         31 is set; in this case, the hole at bit 31 is not
>> -         changed.  If we had access to the processor carry flag,
>> -         we could close this loophole by putting the fourth hole
>> -         at bit 32!
>> -
>> -         So it ignores everything except 128's, when they're aligned
>> -         properly.  */
>> -
>> -      longword = *longword_ptr++;
>> -
>> -      if ((longword - lomagic) & himagic)
>> -        {
>> -          /* Which of the bytes was the zero?  If none of them were, it was
>> -             a misfire; continue the search.  */
>> -
>> -          const char *cp = (const char *) (longword_ptr - 1);
>> -
>> -          char_ptr = cp;
>> -          if (cp[0] == 0)
>> -            break;
>> -          char_ptr = cp + 1;
>> -          if (cp[1] == 0)
>> -            break;
>> -          char_ptr = cp + 2;
>> -          if (cp[2] == 0)
>> -            break;
>> -          char_ptr = cp + 3;
>> -          if (cp[3] == 0)
>> -            break;
>> -          if (sizeof (longword) > 4)
>> -            {
>> -              char_ptr = cp + 4;
>> -              if (cp[4] == 0)
>> -                break;
>> -              char_ptr = cp + 5;
>> -              if (cp[5] == 0)
>> -                break;
>> -              char_ptr = cp + 6;
>> -              if (cp[6] == 0)
>> -                break;
>> -              char_ptr = cp + 7;
>> -              if (cp[7] == 0)
>> -                break;
>> -            }
>> -        }
>> -      char_ptr = end_ptr;
>> -    }
>> -
>> -  if (char_ptr > end_ptr)
>> -    char_ptr = end_ptr;
>> -  return char_ptr - str;
>> +  const char *found = memchr (str, '\0', maxlen);
> Can be __memchr to skip PLT no?

The internal memchr alias is defined with libc_hidden_builtin_def (memchr),
so a call to 'memchr' already binds to the internal alias and does not go
through the PLT.
For instance, on loongarch (which uses generic strnlen):

$ objdump -t string/strnlen.os | grep memchr
0000000000000000         *UND*  0000000000000000 __GI_memchr
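
For reference, the quoted hunk shows only the first added line of the new
function body. A minimal standalone sketch of a memchr-based strnlen might
look like the code below. The return expression is an assumption based on the
comment kept in the file ("If no '\0' terminator is found in that many
characters, return MAXLEN"), not a copy of the patch, and the names
sketch_strnlen and sketch_strnlen_selfcheck are hypothetical, chosen so the
example does not clash with the libc symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name; the function in the patch is __strnlen.  */
size_t
sketch_strnlen (const char *str, size_t maxlen)
{
  /* memchr examines at most MAXLEN bytes and stops at the first match.  */
  const char *found = memchr (str, '\0', maxlen);

  /* Assumption: with no terminator in range, the result is MAXLEN.  */
  return found != NULL ? (size_t) (found - str) : maxlen;
}

/* Hypothetical helper showing the expected behaviour on small inputs.  */
void
sketch_strnlen_selfcheck (void)
{
  assert (sketch_strnlen ("abc", 10) == 3);    /* terminator found */
  assert (sketch_strnlen ("abcdef", 4) == 4);  /* limit reached first */
  assert (sketch_strnlen ("", 0) == 0);        /* nothing to scan */
}
```

Outside libc this is an ordinary external call to memchr. Inside libc, the
hidden alias described above is what makes the same call bind to __GI_memchr.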