Message-ID: <2710d66f-2a13-be75-8692-6ffbd5302ccf@linaro.org>
Date: Thu, 22 Sep 2022 14:51:28 -0300
Subject: Re: [PATCH 10/17] string: Improve generic memchr
To: Noah Goldstein
Cc: GNU C Library, Richard Henderson, Joseph Myers, caiyinyu
References: <20220902203940.2385967-1-adhemerval.zanella@linaro.org> <20220902203940.2385967-11-adhemerval.zanella@linaro.org> <0358c009-eda0-ae01-5d63-69003e3fe375@linaro.org>
From: Adhemerval Zanella Netto
Organization: Linaro

On 19/09/22 18:59, Noah Goldstein wrote:
> On Mon, Sep 19, 2022 at 12:17 PM Adhemerval Zanella Netto
> wrote:
>>
>>
>>
>> On 03/09/22 00:47, Noah Goldstein wrote:
>>
>>>>
>>>> -  longword_ptr = (const longword *) char_ptr;
>>>> +  /* Compute the address of the word containing the last byte.  */
>>>> +  const op_t *lword = word_containing (lbyte);
>>>>
>>>> -  /* All these elucidatory comments refer to 4-byte longwords,
>>>> -     but the theory applies equally well to any size longwords.  */
>>>> +  /* Read the first word, but munge it so that bytes before the array
>>>> +     will not match goal.  */
>>>> +  const op_t * word_ptr = word_containing (s);
>>>> +  op_t word = (*word_ptr | before_mask) ^ (repeated_c & before_mask);
>>>
>>> Why do you xor with repeated_c & before_mask here?
>>>
>>> Doesn't the has_eq(word, repeated_c) do that?
>>
>> For the case of c_in being 0xff, since for this case or with before_mask
>> will make has_eq to return early.  The test-memchr does not trigger it,
>> but test-memccpy does fail without the XOR.
>
> I see. Since a match in the first several bytes is fairly common
> maybe it would be better to special case the first iteration and just do
>
> has_eq(word, repeated_c) >> (CHAR_BIT * (addr % sizeof(addr)).
> The result can just be added to `s` if there is a match.

I think you mean something like:

  has_eq (word >> (CHAR_BIT * ((uintptr_t) s % sizeof (op_t))), repeated_c)

since has_eq returns _Bool.  However, in this case we would need to shift
repeated_c as well, and it would leak the endianness definition (the shift
direction) into the generic implementation.  In both cases I am not sure
this would be a gain.

Maybe we can also parametrize the first check:

  static inline _Bool
  has_eq_first (op_t *word, const op_t *word_ptr, op_t repeated_c,
                op_t before_mask)
  {
    *word = (*word_ptr | before_mask) ^ (repeated_c & before_mask);
    return has_eq (*word, repeated_c);
  }

  [...]

  op_t word;
  if (!has_eq_first (&word, word_ptr, repeated_c, before_mask))
    {
      do
        {
          if (word_ptr == lword)
            return NULL;
          word = *++word_ptr;
        }
      while (!has_eq (word, repeated_c));
    }

That way an architecture with a better strategy for the first check could
override it.  But I am also not sure whether this would yield any
improvement in the end.
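
To make the 0xff case above concrete, here is a minimal standalone sketch.
It is not the patch code: op_t is assumed to be a 64-bit word, and
repeat_byte, has_zero, make_before_mask and first_word_matches are
hypothetical stand-ins written only for this illustration (has_eq here is a
simplified model of the helper discussed above).  The mask is built through
memcpy so the sketch does not depend on endianness.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t op_t;          /* Assumption: 64-bit word.  */

/* Replicate C into every byte of a word.  */
static op_t
repeat_byte (unsigned char c)
{
  return (op_t) c * UINT64_C (0x0101010101010101);
}

/* True if any byte of X is zero (classic bit trick).  */
static bool
has_zero (op_t x)
{
  return ((x - UINT64_C (0x0101010101010101)) & ~x
          & UINT64_C (0x8080808080808080)) != 0;
}

/* Simplified model of has_eq: true if any byte of X equals the
   repeated byte in REPEATED_C.  */
static bool
has_eq (op_t x, op_t repeated_c)
{
  return has_zero (x ^ repeated_c);
}

/* Mask with 0xff in the first OFFSET bytes (in memory order), i.e. the
   bytes that lie before the start of the array.  */
static op_t
make_before_mask (size_t offset)
{
  unsigned char bytes[sizeof (op_t)] = { 0 };
  op_t mask;
  memset (bytes, 0xff, offset);
  memcpy (&mask, bytes, sizeof mask);
  return mask;
}

/* Check the first word at MEM, where the array starts OFFSET bytes in.
   With USE_XOR false, only the OR with before_mask is applied.  */
static bool
first_word_matches (const unsigned char *mem, size_t offset,
                    unsigned char c, bool use_xor)
{
  op_t raw, repeated_c = repeat_byte (c);
  op_t before_mask = make_before_mask (offset);
  memcpy (&raw, mem, sizeof raw);

  op_t word = raw | before_mask;
  if (use_xor)
    word ^= repeated_c & before_mask;
  return has_eq (word, repeated_c);
}

/* Small self-check of the cases discussed in the thread.  */
static void
check_examples (void)
{
  /* Three bytes before the array, then "abcde" (no 0xff in the array).  */
  const unsigned char mem[sizeof (op_t)]
    = { 0x11, 0x22, 0x33, 'a', 'b', 'c', 'd', 'e' };

  /* OR alone turns the leading bytes into 0xff: false match for 0xff.  */
  assert (first_word_matches (mem, 3, 0xff, false));
  /* With the XOR the leading bytes become 0xff ^ c, which never equals c.  */
  assert (!first_word_matches (mem, 3, 0xff, true));
  /* A real match inside the array is still found.  */
  assert (first_word_matches (mem, 3, 'c', true));
}
```

With only the OR, the bytes before the array become 0xff and match a search
for 0xff; with the extra XOR they become 0xff ^ c, which differs from c for
every c, so the first-word check only fires on bytes inside the array.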