Date: Mon, 5 Sep 2022 16:50:03 +0100
Subject: Re: [PATCH 15/17] arm: Add string-fza.h
To: Adhemerval Zanella, libc-alpha@sourceware.org
Cc: caiyinyu, Joseph Myers, Richard Henderson
References: <20220902203940.2385967-1-adhemerval.zanella@linaro.org> <20220902203940.2385967-16-adhemerval.zanella@linaro.org>
From: Richard Earnshaw

On 05/09/2022 16:40, Richard Earnshaw via Libc-alpha wrote:
>
>
> On 02/09/2022 21:39, Adhemerval Zanella via Libc-alpha wrote:
>> From: Richard Henderson
>>
>> While arm has the more important string functions in assembly,
>> there are still a few generic routines used.
>>
>> Use the UQSUB8 insn for testing of zeros.
>
> UQSUB8 requires ARMv6 or above.  While that's pretty likely these days,
> you might want to consider a fall-back for Armv5 or earlier if you still
> want to support those.
>

Hmm, nevermind, I've just noticed this is in the armv6t2 directory, so
ARMv6 is a given.  Sorry for the noise.

R.

> R.
>
>>
>> Checked on armv7-linux-gnueabihf
>> ---
>>   sysdeps/arm/armv6t2/string-fza.h | 70 ++++++++++++++++++++++++++++++++
>>   1 file changed, 70 insertions(+)
>>   create mode 100644 sysdeps/arm/armv6t2/string-fza.h
>>
>> diff --git a/sysdeps/arm/armv6t2/string-fza.h
>> b/sysdeps/arm/armv6t2/string-fza.h
>> new file mode 100644
>> index 0000000000..4fe2e8383f
>> --- /dev/null
>> +++ b/sysdeps/arm/armv6t2/string-fza.h
>> @@ -0,0 +1,70 @@
>> +/* Zero byte detection; basics.  ARM version.
>> +   Copyright (C) 2022 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   .  */
>> +
>> +#ifndef _STRING_FZA_H
>> +#define _STRING_FZA_H 1
>> +
>> +#include
>> +#include
>> +
>> +/* This function returns at least one bit set within every byte
>> +   of X that is zero.  */
>> +
>> +static inline op_t
>> +find_zero_all (op_t x)
>> +{
>> +  /* Use unsigned saturated subtraction from 1 in each byte.
>> +     That leaves 1 for every byte that was zero.  */
>> +  op_t ret, ones = repeat_bytes (0x01);
>> +  asm ("uqsub8 %0,%1,%2" : "=r"(ret) : "r"(ones), "r"(x));
>> +  return ret;
>> +}
>> +
>> +/* Identify bytes that are equal between X1 and X2.  */
>> +
>> +static inline op_t
>> +find_eq_all (op_t x1, op_t x2)
>> +{
>> +  return find_zero_all (x1 ^ x2);
>> +}
>> +
>> +/* Identify zero bytes in X1 or equality between X1 and X2.  */
>> +
>> +static inline op_t
>> +find_zero_eq_all (op_t x1, op_t x2)
>> +{
>> +  return find_zero_all (x1) | find_zero_all (x1 ^ x2);
>> +}
>> +
>> +/* Identify zero bytes in X1 or inequality between X1 and X2.  */
>> +
>> +static inline op_t
>> +find_zero_ne_all (op_t x1, op_t x2)
>> +{
>> +  /* Make use of the fact that we'll already have ONES in a
>> register.  */
>> +  op_t ones = repeat_bytes (0x01);
>> +  return find_zero_all (x1) | (find_zero_all (x1 ^ x2) ^ ones);
>> +}
>> +
>> +/* Define the "inexact" versions in terms of the exact versions.  */
>> +#define find_zero_low        find_zero_all
>> +#define find_eq_low        find_eq_all
>> +#define find_zero_eq_low    find_zero_eq_all
>> +#define find_zero_ne_low    find_zero_ne_all
>> +
>> +#endif /* _STRING_FZA_H */
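
For readers without an ARMv6 target to hand, the per-byte behaviour the quoted patch relies on can be emulated in portable C. The following is only a sketch, not part of the patch: it assumes op_t is a 32-bit word, and the names repeat_bytes_sketch, uqsub8_sketch, find_zero_all_sketch and find_zero_ne_all_sketch are hypothetical stand-ins for the glibc helpers and the UQSUB8 instruction.

```c
#include <stdint.h>

/* Assumption: op_t is a 32-bit word, as on 32-bit ARM.  */
typedef uint32_t op_t;

/* Hypothetical stand-in for repeat_bytes: copy C into every byte.  */
static inline op_t
repeat_bytes_sketch (unsigned char c)
{
  return (op_t) c * 0x01010101u;
}

/* Portable emulation of UQSUB8: for each byte lane compute A - B,
   clamping the result to 0 instead of wrapping around.  */
static inline op_t
uqsub8_sketch (op_t a, op_t b)
{
  op_t ret = 0;
  for (int i = 0; i < 4; i++)
    {
      unsigned int ab = (a >> (8 * i)) & 0xff;
      unsigned int bb = (b >> (8 * i)) & 0xff;
      unsigned int d = ab > bb ? ab - bb : 0;
      ret |= (op_t) d << (8 * i);
    }
  return ret;
}

/* 1 - byte saturates to 0 for any nonzero byte and leaves 1 for a
   zero byte, so the result has 0x01 exactly in the zero lanes.  */
static inline op_t
find_zero_all_sketch (op_t x)
{
  return uqsub8_sketch (repeat_bytes_sketch (0x01), x);
}

/* Zero bytes in X1, or bytes where X1 and X2 differ.  XOR with ONES
   flips the "equal" mask (0x01 per equal lane) into a "not equal" mask.  */
static inline op_t
find_zero_ne_all_sketch (op_t x1, op_t x2)
{
  op_t ones = repeat_bytes_sketch (0x01);
  return find_zero_all_sketch (x1) | (find_zero_all_sketch (x1 ^ x2) ^ ones);
}
```

With these definitions, find_zero_all_sketch (0x41004200) yields 0x00010001, marking the two zero bytes, and find_zero_ne_all_sketch (0x41004200, 0x41004300) yields 0x00010101, which additionally marks the byte where the two words differ.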