From: Adhemerval Zanella Netto
Organization: Linaro
Date: Fri, 3 Feb 2023 09:39:13 -0300
Subject: Re: [PATCH v12 03/31] Add string vectorized find and detection functions
To: Richard Henderson, libc-alpha@sourceware.org, Jeff Law, Xi Ruoyao, Noah Goldstein
References: <20230202181149.2181553-1-adhemerval.zanella@linaro.org> <20230202181149.2181553-4-adhemerval.zanella@linaro.org>

On 02/02/23 21:24, Richard Henderson wrote:
> On 2/2/23 08:11, Adhemerval Zanella wrote:
>> +/* With similar caveats, identify zero bytes in X1 and bytes that are
>> +   not equal between in X1 and X2.  */
>> +static __always_inline find_t
>> +find_zero_ne_low (op_t x1, op_t x2)
>> +{
>> +  return (~find_zero_eq_low (x1, x2)) + 1;
>> +}
>
> This is no longer used, and suspected buggy.
> Duh, I now see why it's buggy -- it's inverting both zero comparison and eq
> comparison -- we wanted to invert only eq.
>
> Let's remove this rather than attempting to fix.  I suspect it'll work out
> very similar to our existing find_zero_ne_all().

Ack, we haven't seen any issue because it is not really used anywhere.

>
>
>> +/* Similarly, but perform the search for byte equality between X1 and X2.  */
>> +static __always_inline unsigned int
>> +index_first_zero (op_t x1, op_t x2)
>> +{
>> +  if (__BYTE_ORDER == __LITTLE_ENDIAN)
>> +    x1 = find_zero_low (x1, x2);
>> +  else
>> +    x1 = find_zero_all (x1, x2);
>> +  return index_first (x1);
>> +}
> ...
>> +/* Similarly, but search for the last zero within X.  */
>> +static __always_inline unsigned int
>> +index_last_zero (op_t x)
>> +{
>> +  return index_last (find_zero_all (x));
>> +}
>
> Why did this lose the __BYTE_ORDER test?  It should be the inverse of index_first_zero.

I think because find_zero_all works for both LE and BE.  But we can optimize
it slightly for BE:

static __always_inline unsigned int
index_last_zero (op_t x)
{
  if (__BYTE_ORDER == __LITTLE_ENDIAN)
    x = find_zero_all (x);
  else
    x = find_zero_low (x);
  return index_last (x);
}

>
>
>> +static __always_inline unsigned int
>> +index_last_eq (op_t x1, op_t x2)
>> +{
>> +  return index_last_zero (x1 ^ x2);
>> +}

I think it should not be required with index_last_zero using the __BYTE_ORDER
test as above.
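
To illustrate why the cheaper test is enough in one direction only, here is a
minimal sketch; it is not code from the patch.  The names word_t, zero_low,
zero_all, lowest_flagged, highest_flagged and check_examples are hypothetical
stand-ins, the word size is fixed at 64 bits, and the GCC/Clang builtins
__builtin_ctzll and __builtin_clzll are assumed to be available.  Byte
positions are counted by significance rather than memory order, so the
example does not depend on the host endianness.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for op_t/find_t, fixed at 64 bits for illustration.  */
typedef uint64_t word_t;

#define LSB_MASK  UINT64_C (0x0101010101010101)
#define MSB_MASK  UINT64_C (0x8080808080808080)
#define LOW7_MASK UINT64_C (0x7f7f7f7f7f7f7f7f)

/* Borrow-based test: 0x80 is set in the least significant zero byte, but
   the borrow may also flag more significant bytes that are not zero.  */
static inline word_t
zero_low (word_t x)
{
  return (x - LSB_MASK) & ~x & MSB_MASK;
}

/* Carry-free test: 0x80 is set in a byte if and only if that byte is zero.  */
static inline word_t
zero_all (word_t x)
{
  return ~(((x & LOW7_MASK) + LOW7_MASK) | x | LOW7_MASK);
}

/* Position of the least significant flagged byte (0 = least significant).
   The mask must be nonzero.  */
static inline unsigned int
lowest_flagged (word_t m)
{
  return (unsigned int) __builtin_ctzll (m) / 8;
}

/* Position of the most significant flagged byte.  The mask must be nonzero.  */
static inline unsigned int
highest_flagged (word_t m)
{
  return 7 - (unsigned int) __builtin_clzll (m) / 8;
}

void
check_examples (void)
{
  /* Byte 0 is zero, byte 1 is 0x01, every other byte is 0xff.  */
  word_t x = UINT64_C (0xffffffffffff0100);

  /* Only byte 0 is really zero.  */
  assert (zero_all (x) == UINT64_C (0x80));
  /* The borrow out of byte 0 also flags byte 1.  */
  assert (zero_low (x) == UINT64_C (0x8080));

  /* Searching from the least significant end, both masks agree.  */
  assert (lowest_flagged (zero_low (x)) == 0);
  assert (lowest_flagged (zero_all (x)) == 0);

  /* Searching from the most significant end, only the exact mask is right.  */
  assert (highest_flagged (zero_all (x)) == 0);
  assert (highest_flagged (zero_low (x)) == 1);
}
```

In this sketch the least significant zero byte is the first byte in memory on
LE and the last byte in memory on BE, which is why the borrow-based test would
suffice for index_first on LE and for index_last on BE, while the other
direction needs the exact test.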