Message-ID: <0ef36fa3-c9f6-963c-0dc7-49227c22f322@linaro.org>
Date: Mon, 19 Sep 2022 11:04:09 -0300
Subject: Re: [PATCH 09/17] string: Improve generic strcmp
To: Noah Goldstein
Cc: GNU C Library, Richard Henderson, Joseph Myers, caiyinyu
References: <20220902203940.2385967-1-adhemerval.zanella@linaro.org>
 <20220902203940.2385967-10-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto
Organization: Linaro

On 03/09/22 00:31, Noah Goldstein wrote:
> On Fri, Sep 2, 2022 at 1:41 PM Adhemerval Zanella via Libc-alpha
> wrote:
>>
>> New generic implementation tries to use word operations along with
>> the new string-fz{b,i} functions even for inputs with different
>> alignments (with still uses aligned access plus merge operation
>> to get a correct word by word comparison).
>>
>> Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
>> and powerpc-linux-gnu by removing the arch-specific assembly
>> implementation and disabling multi-arch (it covers both LE and BE
>> for 64 and 32 bits).
>>
>> Co-authored-by: Richard Henderson
>> ---
>>  string/strcmp.c | 117 +++++++++++++++++++++++++++++++++++++++++-------
>>  1 file changed, 101 insertions(+), 16 deletions(-)
>>
>> diff --git a/string/strcmp.c b/string/strcmp.c
>> index d4962be4ec..c8acc5c0b5 100644
>> --- a/string/strcmp.c
>> +++ b/string/strcmp.c
>> @@ -15,33 +15,118 @@
>>     License along with the GNU C Library; if not, see
>>     . */
>>
>> +#include
>> +#include
>> +#include
>> +#include
>>  #include
>> +#include
>>
>> -#undef strcmp
>> -
>> -#ifndef STRCMP
>> -# define STRCMP strcmp
>> +#ifdef STRCMP
>> +# define strcmp STRCMP
>>  #endif
>>
>> +static inline int
>> +final_cmp (const op_t w1, const op_t w2)
>> +{
>> +  unsigned char c1, c2;
>> +  for (size_t i = 0; i < sizeof (op_t); i++)
>> +    {
>> +      c1 = extractbyte (w1, i);
>> +      c2 = extractbyte (w2, i);
>
> Is using extractbyte here better than just reloading indicices from memory?
>
> As well, maybe (for 64 bit atleast)
> maybe worth cutting in half with a 32-bit xor on the lower half then
> maybe skipping forward
> 4-bytes.

I am not sure, in fact.  I tried replacing the loop with
'i = index_first_zero_ne (w1, w2);' as Richard suggested, but the issue
is that the words might contain uninitialized bytes, which prevents us
from using it.

I will check if I can simplify this a bit.
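
For reference, here is a minimal standalone sketch of the byte-wise loop
being discussed.  It is not the code from the patch: the quoted hunk stops
before the loop body ends, so the exit condition below (stop at the first
differing byte or at a NUL) is an assumption.  The 64-bit op_t typedef and
the extractbyte_sketch, final_cmp_sketch and load_word_sketch helpers are
hypothetical stand-ins and do not reflect glibc's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: a 64-bit word type; glibc's op_t may be defined differently.  */
typedef uint64_t op_t;

/* Hypothetical stand-in for extractbyte: return the byte stored at memory
   offset IDX of W.  Going through memcpy keeps it endian-independent.  */
static inline unsigned char
extractbyte_sketch (op_t w, size_t idx)
{
  unsigned char bytes[sizeof (op_t)];
  memcpy (bytes, &w, sizeof bytes);
  return bytes[idx];
}

/* Compare two words byte by byte in memory order.  Stop at the first
   differing byte or at the first NUL, so bytes after the terminator are
   never inspected.  */
static inline int
final_cmp_sketch (op_t w1, op_t w2)
{
  for (size_t i = 0; i < sizeof (op_t); i++)
    {
      unsigned char c1 = extractbyte_sketch (w1, i);
      unsigned char c2 = extractbyte_sketch (w2, i);
      if (c1 != c2 || c1 == '\0')
        return c1 - c2;
    }
  return 0;
}

/* Test helper: load sizeof (op_t) bytes starting at P into a word.  The
   caller must make sure that many bytes are readable.  */
static inline op_t
load_word_sketch (const void *p)
{
  op_t w;
  memcpy (&w, p, sizeof w);
  return w;
}
```

Because the loop returns as soon as it sees the terminating NUL, whatever
follows the terminator inside the word does not affect the result in this
sketch.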