From mboxrd@z Thu Jan 1 00:00:00 1970
References: <20230202181149.2181553-1-adhemerval.zanella@linaro.org> <20230202181149.2181553-5-adhemerval.zanella@linaro.org>
In-Reply-To: <20230202181149.2181553-5-adhemerval.zanella@linaro.org>
From: Noah Goldstein
Date: Fri, 3 Feb 2023 17:23:42 -0600
Subject: Re: [PATCH v12 04/31] string: Improve generic strlen
To: Adhemerval Zanella
Cc: libc-alpha@sourceware.org, Richard Henderson, Jeff Law, Xi Ruoyao
Content-Type: text/plain; charset="UTF-8"

On Thu, Feb 2, 2023 at 12:12 PM Adhemerval Zanella wrote:
>
> The new algorithm reads the first aligned address and masks off the
> unwanted bytes (this strategy is similar to arch-specific
> implementations used on powerpc, sparc, and sh).
>
> The loop now reads word-aligned addresses and checks them using the
> has_zero macro.
>
> Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
> and powerpc64-linux-gnu by removing the arch-specific assembly
> implementation and disabling multi-arch (it covers both LE and BE
> for 64 and 32 bits).
>
> Co-authored-by: Richard Henderson
> Reviewed-by: Noah Goldstein
> ---
>  string/strlen.c         | 92 ++++++++++-------------------------------
>  sysdeps/s390/strlen-c.c | 10 +++--
>  2 files changed, 28 insertions(+), 74 deletions(-)
>
> diff --git a/string/strlen.c b/string/strlen.c
> index ee1aae0fff..5a4424f9a5 100644
> --- a/string/strlen.c
> +++ b/string/strlen.c
> @@ -15,86 +15,38 @@
>     License along with the GNU C Library; if not, see
>     . */
>
> +#include
> +#include
> +#include
> +#include
> +#include
>  #include
> -#include
>
> -#undef strlen
> -
> -#ifndef STRLEN
> -# define STRLEN strlen
> +#ifdef STRLEN
> +# define __strlen STRLEN
>  #endif
>
>  /* Return the length of the null-terminated string STR. Scan for
>     the null terminator quickly by testing four bytes at a time. */
>  size_t
> -STRLEN (const char *str)
> +__strlen (const char *str)
>  {
> -  const char *char_ptr;
> -  const unsigned long int *longword_ptr;
> -  unsigned long int longword, himagic, lomagic;
> -
> -  /* Handle the first few characters by reading one character at a time.
> -     Do this until CHAR_PTR is aligned on a longword boundary. */
> -  for (char_ptr = str; ((unsigned long int) char_ptr
> -                        & (sizeof (longword) - 1)) != 0;
> -       ++char_ptr)
> -    if (*char_ptr == '\0')
> -      return char_ptr - str;
> -
> -  /* All these elucidatory comments refer to 4-byte longwords,
> -     but the theory applies equally well to 8-byte longwords. */
> -
> -  longword_ptr = (unsigned long int *) char_ptr;
> +  /* Align pointer to sizeof op_t. */
> +  const uintptr_t s_int = (uintptr_t) str;
> +  const op_t *word_ptr = (const op_t*) PTR_ALIGN_DOWN (str, sizeof (op_t));
>
> -  /* Computing (longword - lomagic) sets the high bit of any corresponding
> -     byte that is either zero or greater than 0x80. The latter case can be
> -     filtered out by computing (~longword & himagic). The final result
> -     will always be non-zero if one of the bytes of longword is zero. */
> -  himagic = 0x80808080L;
> -  lomagic = 0x01010101L;
> -  if (sizeof (longword) > 4)
> -    {
> -      /* 64-bit version of the magic. */
> -      /* Do the shift in two steps to avoid a warning if long has 32 bits. */
> -      himagic = ((himagic << 16) << 16) | himagic;
> -      lomagic = ((lomagic << 16) << 16) | lomagic;
> -    }
> -  if (sizeof (longword) > 8)
> -    abort ();
> +  op_t word = *word_ptr;
> +  find_t mask = shift_find (find_zero_all (word), s_int);
> +  if (mask != 0)
> +    return index_first (mask);
>
> -  /* Instead of the traditional loop which tests each character,
> -     we will test a longword at a time. The tricky part is testing
> -     if *any of the four* bytes in the longword in question are zero. */
> -  for (;;)
> -    {
> -      longword = *longword_ptr++;
> +  do
> +    word = *++word_ptr;
> +  while (! has_zero (word));
>
> -      if (((longword - lomagic) & ~longword & himagic) != 0)
> -        {
> -          /* Which of the bytes was the zero? */
> -
> -          const char *cp = (const char *) (longword_ptr - 1);
> -
> -          if (cp[0] == 0)
> -            return cp - str;
> -          if (cp[1] == 0)
> -            return cp - str + 1;
> -          if (cp[2] == 0)
> -            return cp - str + 2;
> -          if (cp[3] == 0)
> -            return cp - str + 3;
> -          if (sizeof (longword) > 4)
> -            {
> -              if (cp[4] == 0)
> -                return cp - str + 4;
> -              if (cp[5] == 0)
> -                return cp - str + 5;
> -              if (cp[6] == 0)
> -                return cp - str + 6;
> -              if (cp[7] == 0)
> -                return cp - str + 7;
> -            }
> -        }
> -    }
> +  return ((const char *) word_ptr) + index_first_zero (word) - str;
>  }
> +#ifndef STRLEN
> +weak_alias (__strlen, strlen)
>  libc_hidden_builtin_def (strlen)
> +#endif
> diff --git a/sysdeps/s390/strlen-c.c b/sysdeps/s390/strlen-c.c
> index b829ef2452..0a33a6f8e5 100644
> --- a/sysdeps/s390/strlen-c.c
> +++ b/sysdeps/s390/strlen-c.c
> @@ -21,12 +21,14 @@
>  #if HAVE_STRLEN_C
>  # if HAVE_STRLEN_IFUNC
>  # define STRLEN STRLEN_C
> +# endif
> +
> +# include
> +
> +# if HAVE_STRLEN_IFUNC
>  # if defined SHARED && IS_IN (libc)
> -# undef libc_hidden_builtin_def
> -# define libc_hidden_builtin_def(name) \
> - __hidden_ver1 (__strlen_c, __GI_strlen, __strlen_c);
> +__hidden_ver1 (__strlen_c, __GI_strlen, __strlen_c);
>  # endif
>  # endif
>
> -# include
>  #endif
> --
> 2.34.1
>

LGTM.

Reviewed-by: Noah Goldstein
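
For anyone reading along without the glibc tree at hand, here is a minimal
standalone sketch of the strategy the commit message describes: read the
first aligned word, neutralise the bytes that precede the string, then
loop over aligned words until one contains a zero byte. This is not the
patch's code. Every sketch_* name is hypothetical, the word type is fixed
to uint64_t rather than op_t, and the helpers are portable byte-wise
stand-ins for has_zero, shift_find and index_first_zero, whose real
definitions live in headers that are not part of this diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants: 0x01 and 0x80 repeated in every byte.  */
#define SKETCH_ONES  ((uint64_t) 0x0101010101010101ULL)
#define SKETCH_HIGHS ((uint64_t) 0x8080808080808080ULL)

/* Nonzero iff at least one byte of W is zero.  This is the same bit
   trick as the (longword - lomagic) & ~longword & himagic test in the
   removed code.  */
static int
sketch_has_zero (uint64_t w)
{
  return ((w - SKETCH_ONES) & ~w & SKETCH_HIGHS) != 0;
}

/* Load eight bytes starting at P.  memcpy sidesteps aliasing issues.  */
static uint64_t
sketch_load (const unsigned char *p)
{
  uint64_t w;
  memcpy (&w, p, sizeof w);
  return w;
}

/* Index, in memory order, of the first zero byte at or after FROM in
   the word at P.  The caller guarantees that such a byte exists.  A
   byte loop keeps the sketch independent of endianness.  */
static size_t
sketch_first_zero (const unsigned char *p, size_t from)
{
  assert (from < sizeof (uint64_t));
  size_t i = from;
  while (p[i] != 0)
    i++;
  return i;
}

size_t
sketch_strlen (const char *str)
{
  const unsigned char *s = (const unsigned char *) str;
  size_t skew = (uintptr_t) s % sizeof (uint64_t);

  /* Round down to the enclosing aligned word.  Like the patch, this
     deliberately reads bytes that sit before STR in that word.  */
  const unsigned char *wp = s - skew;

  /* First word: overwrite the SKEW leading bytes with a nonzero value
     so that only bytes belonging to the string can match.  */
  unsigned char first[sizeof (uint64_t)];
  memcpy (first, wp, sizeof first);
  memset (first, 0xff, skew);
  if (sketch_has_zero (sketch_load (first)))
    return sketch_first_zero (wp, skew) - skew;

  /* Remaining words are aligned and belong entirely to the scan.  */
  do
    wp += sizeof (uint64_t);
  while (!sketch_has_zero (sketch_load (wp)));

  return (size_t) (wp - s) + sketch_first_zero (wp, 0);
}
```

The aligned reads at both ends may touch bytes outside the string. They
never cross into another aligned word, which is the property the
word-at-a-time approach relies on, but strictly speaking that is outside
what ISO C guarantees, so treat this as an illustration rather than a
drop-in replacement.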