From: Noah Goldstein
Date: Wed, 1 Feb 2023 13:51:41 -0600
Subject: Re: [PATCH v11 04/29] string: Improve generic strlen
To: Adhemerval Zanella
Cc: libc-alpha@sourceware.org, Richard Henderson, Jeff Law, Xi Ruoyao
In-Reply-To: <20230201170406.303978-5-adhemerval.zanella@linaro.org>

On Wed, Feb 1, 2023 at 11:04 AM Adhemerval Zanella wrote:
>
> The new algorithm reads the first aligned address and masks off the
> unwanted bytes (this strategy is similar to arch-specific
> implementations used on powerpc, sparc, and sh).
>
> The loop now reads word-aligned addresses and checks them using the
> has_zero macro.
>
> Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
> and powerpc64-linux-gnu by removing the arch-specific assembly
> implementation and disabling multi-arch (it covers both LE and BE
> for 64 and 32 bits).
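
As a rough illustration of the strategy the commit message describes (read
the first aligned word, neutralize the bytes that precede the string, then
loop one word at a time), here is a minimal portable C sketch. It is not
the glibc code: word_t, sketch_has_zero, and sketch_strlen are hypothetical
names, the leading bytes are masked by OR-ing in 0xff rather than by
shifting the result as shift_find does in the patch, and the terminator is
located with a plain byte scan instead of index_first_zero. It also assumes
the string lives inside a word-aligned buffer whose size is a multiple of
the word size, so that every whole-word read stays in bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for glibc's op_t (assumption for this sketch).  */
typedef unsigned long word_t;

/* Non-zero iff at least one byte of X is zero.  */
static int
sketch_has_zero (word_t x)
{
  const word_t lo = (word_t) -1 / 0xff;   /* 0x01 in every byte */
  const word_t hi = lo << 7;              /* 0x80 in every byte */
  return ((x - lo) & ~x & hi) != 0;
}

/* Length of STR, reading one aligned word at a time.  Assumes the
   enclosing buffer is word-aligned and padded to a whole word.  */
static size_t
sketch_strlen (const char *str)
{
  assert (str != NULL);

  size_t offset = (uintptr_t) str % sizeof (word_t);
  const char *p = str - offset;           /* aligned-down address */
  word_t word;

  memcpy (&word, p, sizeof word);
  if (offset != 0)
    {
      /* Force the bytes that precede STR to be non-zero.  Filling the
         mask in memory order keeps this endian-neutral.  */
      word_t mask = 0;
      memset (&mask, 0xff, offset);
      word |= mask;
    }

  /* Skip whole words until one contains a zero byte.  */
  while (!sketch_has_zero (word))
    {
      p += sizeof word;
      memcpy (&word, p, sizeof word);
    }

  /* Locate the terminator inside that word with a byte scan.  */
  const char *q = p < str ? str : p;
  while (*q != '\0')
    q++;
  return (size_t) (q - str);
}
```

The memcpy loads avoid aliasing problems in the sketch; the patch itself
dereferences an op_t pointer directly.
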
>
> Co-authored-by: Richard Henderson
> ---
>  string/strlen.c         | 92 ++++++++++-------------------------------
>  sysdeps/s390/strlen-c.c | 10 +++--
>  2 files changed, 28 insertions(+), 74 deletions(-)
>
> diff --git a/string/strlen.c b/string/strlen.c
> index ee1aae0fff..5a4424f9a5 100644
> --- a/string/strlen.c
> +++ b/string/strlen.c
> @@ -15,86 +15,38 @@
>     License along with the GNU C Library; if not, see
>     .  */
>
> +#include
> +#include
> +#include
> +#include
> +#include
>  #include
> -#include
>
> -#undef strlen
> -
> -#ifndef STRLEN
> -# define STRLEN strlen
> +#ifdef STRLEN
> +# define __strlen STRLEN
>  #endif
>
>  /* Return the length of the null-terminated string STR.  Scan for
>     the null terminator quickly by testing four bytes at a time.  */
>  size_t
> -STRLEN (const char *str)
> +__strlen (const char *str)
>  {
> -  const char *char_ptr;
> -  const unsigned long int *longword_ptr;
> -  unsigned long int longword, himagic, lomagic;
> -
> -  /* Handle the first few characters by reading one character at a time.
> -     Do this until CHAR_PTR is aligned on a longword boundary.  */
> -  for (char_ptr = str; ((unsigned long int) char_ptr
> -                        & (sizeof (longword) - 1)) != 0;
> -       ++char_ptr)
> -    if (*char_ptr == '\0')
> -      return char_ptr - str;
> -
> -  /* All these elucidatory comments refer to 4-byte longwords,
> -     but the theory applies equally well to 8-byte longwords.  */
> -
> -  longword_ptr = (unsigned long int *) char_ptr;
> +  /* Align pointer to sizeof op_t.  */
> +  const uintptr_t s_int = (uintptr_t) str;
> +  const op_t *word_ptr = (const op_t*) PTR_ALIGN_DOWN (str, sizeof (op_t));
>
> -  /* Computing (longword - lomagic) sets the high bit of any corresponding
> -     byte that is either zero or greater than 0x80.  The latter case can be
> -     filtered out by computing (~longword & himagic).  The final result
> -     will always be non-zero if one of the bytes of longword is zero.  */
> -  himagic = 0x80808080L;
> -  lomagic = 0x01010101L;
> -  if (sizeof (longword) > 4)
> -    {
> -      /* 64-bit version of the magic.  */
> -      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
> -      himagic = ((himagic << 16) << 16) | himagic;
> -      lomagic = ((lomagic << 16) << 16) | lomagic;
> -    }
> -  if (sizeof (longword) > 8)
> -    abort ();
> +  op_t word = *word_ptr;
> +  find_t mask = shift_find (find_zero_all (word), s_int);
> +  if (mask != 0)
> +    return index_first (mask);
>
> -  /* Instead of the traditional loop which tests each character,
> -     we will test a longword at a time.  The tricky part is testing
> -     if *any of the four* bytes in the longword in question are zero.  */
> -  for (;;)
> -    {
> -      longword = *longword_ptr++;
> +  do
> +    word = *++word_ptr;
> +  while (! has_zero (word));
>
> -      if (((longword - lomagic) & ~longword & himagic) != 0)
> -        {
> -          /* Which of the bytes was the zero?  */
> -
> -          const char *cp = (const char *) (longword_ptr - 1);
> -
> -          if (cp[0] == 0)
> -            return cp - str;
> -          if (cp[1] == 0)
> -            return cp - str + 1;
> -          if (cp[2] == 0)
> -            return cp - str + 2;
> -          if (cp[3] == 0)
> -            return cp - str + 3;
> -          if (sizeof (longword) > 4)
> -            {
> -              if (cp[4] == 0)
> -                return cp - str + 4;
> -              if (cp[5] == 0)
> -                return cp - str + 5;
> -              if (cp[6] == 0)
> -                return cp - str + 6;
> -              if (cp[7] == 0)
> -                return cp - str + 7;
> -            }
> -        }
> -    }
> +  return ((const char *) word_ptr) + index_first_zero (word) - str;
>  }
> +#ifndef STRLEN
> +weak_alias (__strlen, strlen)
>  libc_hidden_builtin_def (strlen)
> +#endif
> diff --git a/sysdeps/s390/strlen-c.c b/sysdeps/s390/strlen-c.c
> index b829ef2452..0a33a6f8e5 100644
> --- a/sysdeps/s390/strlen-c.c
> +++ b/sysdeps/s390/strlen-c.c
> @@ -21,12 +21,14 @@
>  #if HAVE_STRLEN_C
>  # if HAVE_STRLEN_IFUNC
>  #  define STRLEN STRLEN_C
> +# endif
> +
> +# include
> +
> +# if HAVE_STRLEN_IFUNC
>  #  if defined SHARED && IS_IN (libc)
> -#   undef libc_hidden_builtin_def
> -#   define libc_hidden_builtin_def(name) \
> -  __hidden_ver1 (__strlen_c, __GI_strlen, __strlen_c);
> +__hidden_ver1 (__strlen_c, __GI_strlen, __strlen_c);
>  #  endif
>  # endif
>
> -# include
>  #endif
> --
> 2.34.1
>

LGTM.

Reviewed-by: Noah Goldstein
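
For readers following the diff: the removed cp[0] ... cp[7] cascade is
replaced by a call to index_first_zero. That helper is not part of this
patch, so the following is only a sketch of the general idea under stated
assumptions, with hypothetical names and a fixed 32-bit word: build a mask
holding 0x80 in every byte that is zero, then pick the lowest such byte. On
a little-endian machine the lowest-order byte is the lowest address; a
big-endian version would search from the other end.

```c
#include <assert.h>
#include <stdint.h>

/* Exact per-byte mask: 0x80 in every byte of X that is zero, and 0 in
   every other byte.  No carry can cross a byte boundary because
   0x7f + 0x7f = 0xfe.  */
static uint32_t
sketch_zero_mask (uint32_t x)
{
  const uint32_t m = 0x7f7f7f7fu;
  return ~(((x & m) + m) | x | m);
}

/* Index (0..3) of the lowest-order zero byte of X, or 4 if X has no
   zero byte.  A real implementation would likely use a count-trailing-
   zeros instruction instead of this loop (assumption).  */
static unsigned int
sketch_first_zero_index (uint32_t x)
{
  uint32_t mask = sketch_zero_mask (x);
  unsigned int i = 0;
  while (i < 4 && (mask & 0x80u) == 0)
    {
      mask >>= 8;
      i++;
    }
  return i;
}

/* Small usage example.  */
static void
sketch_self_check (void)
{
  assert (sketch_zero_mask (0x61626364u) == 0);
  assert (sketch_zero_mask (0x61006364u) == 0x00800000u);
  assert (sketch_first_zero_index (0x61620064u) == 1);
  assert (sketch_first_zero_index (0x61626364u) == 4);
}
```

Unlike the cheaper (x - lo) & ~x & hi test, which only answers whether some
byte is zero, this mask is exact per byte, which is what makes the index
computation safe.
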