From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Mon, 5 Sep 2022 16:40:04 +0100
Subject: Re: [PATCH 15/17] arm: Add string-fza.h
To: Adhemerval Zanella, libc-alpha@sourceware.org
Cc: Richard Henderson, Joseph Myers, caiyinyu
References: <20220902203940.2385967-1-adhemerval.zanella@linaro.org> <20220902203940.2385967-16-adhemerval.zanella@linaro.org>
From: Richard Earnshaw
In-Reply-To: <20220902203940.2385967-16-adhemerval.zanella@linaro.org>

On 02/09/2022 21:39, Adhemerval Zanella via Libc-alpha wrote:
> From: Richard Henderson
>
> While arm has the more important string functions in assembly,
> there are still a few generic routines used.
>
> Use the UQSUB8 insn for testing of zeros.

UQSUB8 requires ARMv6 or above. While that's pretty likely these days,
you might want to consider a fall-back for Armv5 or earlier if you still
want to support those.

R.

>
> Checked on armv7-linux-gnueabihf
> ---
>  sysdeps/arm/armv6t2/string-fza.h | 70 ++++++++++++++++++++++++++++++++
>  1 file changed, 70 insertions(+)
>  create mode 100644 sysdeps/arm/armv6t2/string-fza.h
>
> diff --git a/sysdeps/arm/armv6t2/string-fza.h b/sysdeps/arm/armv6t2/string-fza.h
> new file mode 100644
> index 0000000000..4fe2e8383f
> --- /dev/null
> +++ b/sysdeps/arm/armv6t2/string-fza.h
> @@ -0,0 +1,70 @@
> +/* Zero byte detection; basics.  ARM version.
> +   Copyright (C) 2022 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   .  */
> +
> +#ifndef _STRING_FZA_H
> +#define _STRING_FZA_H 1
> +
> +#include
> +#include
> +
> +/* This function returns at least one bit set within every byte
> +   of X that is zero.  */
> +
> +static inline op_t
> +find_zero_all (op_t x)
> +{
> +  /* Use unsigned saturated subtraction from 1 in each byte.
> +     That leaves 1 for every byte that was zero.  */
> +  op_t ret, ones = repeat_bytes (0x01);
> +  asm ("uqsub8 %0,%1,%2" : "=r"(ret) : "r"(ones), "r"(x));
> +  return ret;
> +}
> +
> +/* Identify bytes that are equal between X1 and X2.  */
> +
> +static inline op_t
> +find_eq_all (op_t x1, op_t x2)
> +{
> +  return find_zero_all (x1 ^ x2);
> +}
> +
> +/* Identify zero bytes in X1 or equality between X1 and X2.  */
> +
> +static inline op_t
> +find_zero_eq_all (op_t x1, op_t x2)
> +{
> +  return find_zero_all (x1) | find_zero_all (x1 ^ x2);
> +}
> +
> +/* Identify zero bytes in X1 or inequality between X1 and X2.  */
> +
> +static inline op_t
> +find_zero_ne_all (op_t x1, op_t x2)
> +{
> +  /* Make use of the fact that we'll already have ONES in a register.  */
> +  op_t ones = repeat_bytes (0x01);
> +  return find_zero_all (x1) | (find_zero_all (x1 ^ x2) ^ ones);
> +}
> +
> +/* Define the "inexact" versions in terms of the exact versions.  */
> +#define find_zero_low		find_zero_all
> +#define find_eq_low		find_eq_all
> +#define find_zero_eq_low	find_zero_eq_all
> +#define find_zero_ne_low	find_zero_ne_all
> +
> +#endif /* _STRING_FZA_H */
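
To illustrate the fall-back mentioned above: on cores without UQSUB8, the same
exact zero-byte test can probably be done with plain ALU operations. What
follows is only a sketch under assumptions that are not taken from the patch:
op_t is modelled as a 32-bit uint32_t, repeat_bytes is reimplemented locally,
and the names find_zero_all_model, find_zero_all_fallback, zero_byte_mask and
check_agreement are hypothetical. The model function mimics what UQSUB8
computes per byte; the fall-back uses the well-known carry-free bit trick and
flags bit 7 rather than bit 0 of each zero byte, which still satisfies "at
least one bit set within every byte of X that is zero".

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 32-bit word, as on 32-bit Arm.  */
typedef uint32_t op_t;

/* Local stand-in: replicate byte C into every byte of a word.  */
static inline op_t
repeat_bytes (unsigned char c)
{
  return (op_t) 0x01010101u * c;
}

/* Portable model of "uqsub8 ret, ones, x": per-byte unsigned
   saturating 1 - x, which is 1 for a zero byte and 0 otherwise.  */
static inline op_t
find_zero_all_model (op_t x)
{
  op_t ret = 0;
  for (unsigned int i = 0; i < sizeof (op_t); i++)
    {
      op_t byte = (x >> (8 * i)) & 0xff;
      op_t sat = byte == 0 ? 1 : 0;
      ret |= sat << (8 * i);
    }
  return ret;
}

/* Hypothetical pre-ARMv6 fall-back using only AND, ADD, OR and NOT.
   Masking with 0x7f prevents carries between bytes; adding 0x7f then
   carries into bit 7 for any non-zero low part; OR-ing X back in
   covers bytes that only had bit 7 set.  After inversion, bit 7 is
   set exactly in the bytes of X that were zero.  */
static inline op_t
find_zero_all_fallback (op_t x)
{
  op_t m = repeat_bytes (0x7f);
  return ~(((x & m) + m) | x | m);
}

/* Normalise a result to 0xff in every flagged byte, so the two
   encodings (bit 0 versus bit 7) can be compared.  */
static inline op_t
zero_byte_mask (op_t r)
{
  op_t mask = 0;
  for (unsigned int i = 0; i < sizeof (op_t); i++)
    if ((r >> (8 * i)) & 0xff)
      mask |= (op_t) 0xff << (8 * i);
  return mask;
}

/* Check that both variants flag the same bytes of X.  */
static inline void
check_agreement (op_t x)
{
  assert (zero_byte_mask (find_zero_all_model (x))
          == zero_byte_mask (find_zero_all_fallback (x)));
}
```

Because the two variants set different bits, any code that consumes the result
(for example to locate the first zero byte) would need to tolerate either
encoding; whether that holds for the rest of the series is not something this
sketch verifies.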