From: Christoph Müllner
Date: Wed, 6 Sep 2023 12:34:41 +0200
Subject: Re: [PATCH] riscv: Add support for XTheadBb in string-fz[a,i].h
To: Adhemerval Zanella Netto
Cc: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, Philipp Tomsich
References: <20230823054628.1318615-1-christoph.muellner@vrull.eu>

On Thu, Aug 24, 2023 at 7:21 PM Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> wrote:
>
> On 23/08/23 02:46, Christoph Muellner wrote:
> > From: Christoph Müllner
> >
> > XTheadBb has similar instructions like Zbb, which allow optimized
> > string processing:
> > * th.ff0: find-first zero is a CLZ instruction.
> > * th.tstnbz: Similar like orc.b, but with a bit-inverted result.
> >
> > The instructions are documented here:
> > https://github.com/T-head-Semi/thead-extension-spec/tree/master/xtheadbb
> >
> > These instructions can be found in the T-Head C906 and the C910.
> >
> > Tested with the string tests.
> >
> > Signed-off-by: Christoph Müllner
>
> LGTM, thanks. Is th.tstnbz available as builtin by chance?
>
> Reviewed-by: Adhemerval Zanella

Is anything blocking this from getting merged?

Thanks,
Christoph

> > ---
> >  sysdeps/riscv/string-fza.h | 7 ++++++-
> >  sysdeps/riscv/string-fzi.h | 2 +-
> >  2 files changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/sysdeps/riscv/string-fza.h b/sysdeps/riscv/string-fza.h
> > index 4429653a00..4958d5d151 100644
> > --- a/sysdeps/riscv/string-fza.h
> > +++ b/sysdeps/riscv/string-fza.h
> > @@ -19,7 +19,7 @@
> >  #ifndef _RISCV_STRING_FZA_H
> >  #define _RISCV_STRING_FZA_H 1
> >
> > -#ifdef __riscv_zbb
> > +#if defined __riscv_zbb || defined __riscv_xtheadbb
> >  /* With bitmap extension we can use orc.b to find all zero bytes. */
> >  # include
> >  # include
> > @@ -32,8 +32,13 @@ static __always_inline find_t
> >  find_zero_all (op_t x)
> >  {
> >    find_t r;
> > +#ifdef __riscv_xtheadbb
> > +  asm ("th.tstnbz %0, %1" : "=r" (r) : "r" (x));
> > +  return r;
> > +#else
> >    asm ("orc.b %0, %1" : "=r" (r) : "r" (x));
> >    return ~r;
> > +#endif
> >  }
> >
> >  /* This function returns 0xff for each byte that is equal between X1 and
> > diff --git a/sysdeps/riscv/string-fzi.h b/sysdeps/riscv/string-fzi.h
> > index 8f56c378ff..45d6367a10 100644
> > --- a/sysdeps/riscv/string-fzi.h
> > +++ b/sysdeps/riscv/string-fzi.h
> > @@ -19,7 +19,7 @@
> >  #ifndef _STRING_RISCV_FZI_H
> >  #define _STRING_RISCV_FZI_H 1
> >
> > -#ifdef __riscv_zbb
> > +#if defined __riscv_zbb || defined __riscv_xtheadbb
> >  # include
> >  #else
> >  /* Without bitmap clz/ctz extensions, it is faster to direct test the bits
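
For readers following the find_zero_all change in the quoted patch, here is a
rough portable sketch of why the XTheadBb branch can return r directly while
the Zbb branch returns ~r. It is not part of the patch. The names model_orc_b,
model_th_tstnbz and check_models are hypothetical, the loops are plain-C
stand-ins for the instructions rather than anything glibc uses, and a 64-bit
op_t (RV64) is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of Zbb's orc.b (assumption: 64-bit operand):
   every non-zero byte of X becomes 0xff, every zero byte stays 0x00.  */
uint64_t
model_orc_b (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    if ((x >> (8 * i)) & 0xff)
      r |= (uint64_t) 0xff << (8 * i);
  return r;
}

/* Hypothetical software model of XTheadBb's th.tstnbz: every zero byte of X
   becomes 0xff, every non-zero byte becomes 0x00.  */
uint64_t
model_th_tstnbz (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    if (((x >> (8 * i)) & 0xff) == 0)
      r |= (uint64_t) 0xff << (8 * i);
  return r;
}

/* Checks that both branches of find_zero_all agree on one sample word:
   inverting the orc.b result gives the th.tstnbz result.  */
void
check_models (void)
{
  uint64_t x = 0x6162006364000065ULL;	/* zero bytes at positions 1, 2, 5 */
  assert (~model_orc_b (x) == model_th_tstnbz (x));
}
```

Under these assumptions, the sample word 0x6162006364000065 yields
0x0000ff0000ffff00 from both ~model_orc_b (x) and model_th_tstnbz (x), which is
the mask find_zero_all is expected to produce.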