From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-oa1-x2e.google.com (mail-oa1-x2e.google.com [IPv6:2001:4860:4864:20::2e]) by sourceware.org (Postfix) with ESMTPS id 2994B3858C83 for ; Wed, 11 Jan 2023 17:28:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2994B3858C83 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org Received: by mail-oa1-x2e.google.com with SMTP id 586e51a60fabf-15027746720so16214419fac.13 for ; Wed, 11 Jan 2023 09:28:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=content-transfer-encoding:in-reply-to:organization:from:references :to:content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=eXGOokpm8nmcDPSuaR0gq4NIacGDNNDBk5TqPkuNO00=; b=QYhEqoRx/isWt2quWox6fpp0kO0h6wx4AqO3/6nhnPUymL2BBH4NrZrTYGYJG6QZSC iicRvp9nfHNZ6yuV1T7+YELN5nYM9GFb62huTh/wcpq3SMtWh+ohotV26yTNNbq9tAtp Z/FwzGN73s5HRIIAjMp47xSpOQI4jTgU0LkmD5wcB4t9RkI3toDsGKjFMjhiZ1Jhu6E9 FTr1gLsgAaBlcEG9h95vG6kXg8ISD1Inm8teMzjC3tQaARqsQu9chgffw5s6CP8b/vUA FbV1d2wuM0i057GA4i0qWJgfGWQdK+nol+zKo84/ipBC7J4Ry7ZS9mbN4q9Exr/rIQPi wUFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:organization:from:references :to:content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=eXGOokpm8nmcDPSuaR0gq4NIacGDNNDBk5TqPkuNO00=; b=nAlTm0sx0bhmyAlYRJa8yx68HCUBFpHgE/BiwMNA2SyPmHC2+AckpFmwVFFkK87YZl LCbBQsYLKoJnI8jmCSRPgP7SEjn6HEb/4bQRzVC/A6ipbD7fytHQ+odkEjR1MC+ECSOs q48OBsusVKVGfmwE/gDn2pH6SY29uWBdYqEMUA2WFD2vGXCIgVisbI9R6hwrjnZ7rUNw x7V/H0kA1r1MhDjm7zu1pJQdZBj7J5XSAKBfVx/p0jRTmB7+zEyVIxeN1qj7E8TVkDmo C/qyKSv/fJZloDMyS7YOAL0yRxIECjsiV6Q0xJJi7nI67Z7VpPI9B1fnIB80n+VjbUCu gl7g== X-Gm-Message-State: AFqh2kr0qsSf6b2QFXF2lB1P77V7VxgpzDmPEEq6W7hF5lGoz1+KAI0p 8VTrgr/DSw3eTgvOQg81kXIsEKNTSYasekGEwoM= X-Google-Smtp-Source: AMrXdXvcbE+tDmvcIsfpPkk7yaNXP29uhmpdaCB6LpoigIb3oSPzV3w/H37VxbsSEIkIf1SW2tlo+w== X-Received: by 2002:a05:6870:88f:b0:150:21db:43e9 with SMTP id fx15-20020a056870088f00b0015021db43e9mr26216064oab.21.1673458079257; Wed, 11 Jan 2023 09:27:59 -0800 (PST) Received: from ?IPV6:2804:1b3:a7c0:a93a:9160:47f0:6805:e35d? ([2804:1b3:a7c0:a93a:9160:47f0:6805:e35d]) by smtp.gmail.com with ESMTPSA id k19-20020a056870571300b00150aca072e8sm7554221oap.38.2023.01.11.09.27.57 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 11 Jan 2023 09:27:58 -0800 (PST) Message-ID: Date: Wed, 11 Jan 2023 14:27:56 -0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.6.1 Subject: Re: [PATCH v6 14/17] alpha: Add string-fzb.h and string-fzi.h Content-Language: en-US To: Richard Henderson , libc-alpha@sourceware.org, Noah Goldstein References: <20230110210106.1457686-1-adhemerval.zanella@linaro.org> <20230110210106.1457686-15-adhemerval.zanella@linaro.org> From: Adhemerval Zanella Netto Organization: Linaro In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 10/01/23 23:45, Richard Henderson wrote: > On 1/10/23 13:01, Adhemerval Zanella wrote: >> From: Richard Henderson >> >> While alpha has the more important string functions in assembly, >> there are still a few for find the generic routines are used. >> >> Use the CMPBGE insn, via the builtin, for testing of zeros.  Use a >> simplified expansion of __builtin_ctz when the insn isn't available. >> >> Checked on alpha-linux-gnu. >> --- >>   sysdeps/alpha/string-fzb.h |  52 +++++++++++++++++ >>   sysdeps/alpha/string-fzi.h | 113 +++++++++++++++++++++++++++++++++++++ >>   2 files changed, 165 insertions(+) >>   create mode 100644 sysdeps/alpha/string-fzb.h >>   create mode 100644 sysdeps/alpha/string-fzi.h >> >> diff --git a/sysdeps/alpha/string-fzb.h b/sysdeps/alpha/string-fzb.h >> new file mode 100644 >> index 0000000000..e3934ba413 >> --- /dev/null >> +++ b/sysdeps/alpha/string-fzb.h >> @@ -0,0 +1,52 @@ >> +/* Zero byte detection; boolean.  Alpha version. >> +   Copyright (C) 2023 Free Software Foundation, Inc. >> +   This file is part of the GNU C Library. >> + >> +   The GNU C Library is free software; you can redistribute it and/or >> +   modify it under the terms of the GNU Lesser General Public >> +   License as published by the Free Software Foundation; either >> +   version 2.1 of the License, or (at your option) any later version. >> + >> +   The GNU C Library is distributed in the hope that it will be useful, >> +   but WITHOUT ANY WARRANTY; without even the implied warranty of >> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU >> +   Lesser General Public License for more details. >> + >> +   You should have received a copy of the GNU Lesser General Public >> +   License along with the GNU C Library; if not, see >> +   .  */ >> + >> +#ifndef _STRING_FZB_H >> +#define _STRING_FZB_H 1 >> + >> +#include >> +#include >> + >> +/* Note that since CMPBGE creates a bit mask rather than a byte mask, >> +   we cannot simply provide a target-specific string-fza.h.  */ >> + >> +/* Determine if any byte within X is zero.  This is a pure boolean test.  */ >> + >> +static __always_inline _Bool >> +has_zero (op_t x) >> +{ >> +  return __builtin_alpha_cmpbge (0, x) != 0; >> +} >> + >> +/* Likewise, but for byte equality between X1 and X2.  */ >> + >> +static __always_inline _Bool >> +has_eq (op_t x1, op_t x2) >> +{ >> +  return has_zero (x1 ^ x2); >> +} >> + >> +/* Likewise, but for zeros in X1 and equal bytes between X1 and X2.  */ >> + >> +static __always_inline _Bool >> +has_zero_eq (op_t x1, op_t x2) >> +{ >> +  return has_zero (x1) | has_eq (x1, x2); >> +} >> + >> +#endif /* _STRING_FZB_H */ >> diff --git a/sysdeps/alpha/string-fzi.h b/sysdeps/alpha/string-fzi.h >> new file mode 100644 >> index 0000000000..bc2f0bdc91 >> --- /dev/null >> +++ b/sysdeps/alpha/string-fzi.h >> @@ -0,0 +1,113 @@ >> +/* string-fzi.h -- zero byte detection; indices.  Alpha version. >> +   Copyright (C) 2022 Free Software Foundation, Inc. >> +   This file is part of the GNU C Library. >> + >> +   The GNU C Library is free software; you can redistribute it and/or >> +   modify it under the terms of the GNU Lesser General Public >> +   License as published by the Free Software Foundation; either >> +   version 2.1 of the License, or (at your option) any later version. >> + >> +   The GNU C Library is distributed in the hope that it will be useful, >> +   but WITHOUT ANY WARRANTY; without even the implied warranty of >> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU >> +   Lesser General Public License for more details. >> + >> +   You should have received a copy of the GNU Lesser General Public >> +   License along with the GNU C Library; if not, see >> +   .  */ >> + >> +#ifndef _STRING_FZI_H >> +#define _STRING_FZI_H >> + >> +#include >> +#include >> + >> +/* Note that since CMPBGE creates a bit mask rather than a byte mask, >> +   we cannot simply provide a target-specific string-fza.h.  */ >> + >> +/* A subroutine for the index_zero functions.  Given a bitmask C, >> +   return the index of the first bit set in memory order.  */ >> + >> +static __always_inline unsigned int >> +index_first (unsigned long int c) >> +{ >> +#ifdef __alpha_cix__ >> +  return __builtin_ctzl (c); >> +#else >> +  c = c & -c; >> +  return (c & 0xf0 ? 4 : 0) + (c & 0xcc ? 2 : 0) + (c & 0xaa ? 1 : 0); >> +#endif >> +} >> + >> +/* Similarly, but return the (memory order) index of the last bit >> +   that is non-zero.  Note that only the least 8 bits may be nonzero.  */ >> + >> +static __always_inline unsigned int >> +index_last (unsigned long int x) >> +{ >> +#ifdef __alpha_cix__ >> +  return __builtin_clzl (x) ^ 63; >> +#else >> +  unsigned r = 0; >> +  if (x & 0xf0) >> +    r += 4; >> +  if (x & (0xc << r)) >> +    r += 2; >> +  if (x & (0x2 << r)) >> +    r += 1; >> +  return r; >> +#endif >> +} >> + >> +/* Given a word X that is known to contain a zero byte, return the >> +   index of the first such within the word in memory order.  */ >> + >> +static __always_inline unsigned int >> +index_first_zero (op_t x) >> +{ >> +  return index_first (__builtin_alpha_cmpbge (0, x)); > > The header split has drifted somewhat since the original: string-fza.h is missing and check_mask() seems misplaced. > > In particular, from strchrnul, > > > +  op_t mask = check_mask (find_zero_eq_all (word, repeated_c), s_int); > +  if (mask != 0) > +    return (char *) str + index_first (mask); > > > We're using the generic find_zero_eq_all, then (correctly, for the moment) the generic check_mask, and then incorrectly the alpha-specific index_first. > > We want to use an alpha-specific find_zero_eq_all (using cmpbge), an alpha-specific check_mask (shifting by bits, not bytes), and an alpha-specific index_first (searching only 8 low bits, not a full ctz). Right, I had to consult the alpha manual to remind that cmpbge sets the lower 8 bits of the target register. And indeed running on qemu-users triggers a lot of issues. > > I wonder if it makes sense to introduce a target-specific find_t, making it clear that it's not the same type as op_t (for alpha, int instead of unsigned long (uint8_t having it's own issues on ancient alpha ev5)). I think it makes sense, I will add this change. > > I think check_mask (possibly renamed shift_find?) should belong in string-fza.h to match the implementation therein, or within a new header (possibly along with find_t?). It also make sense. > > Anyway, even the current set of headers would work, so long as Alpha implements all of them. A alpha-specific string-fza.h is in fact quite simple, since we do not need to implement all the function. I will update the patch with your suggestions.