From: Christoph Muellner
To: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, Philipp Tomsich
Cc: Christoph Müllner
Subject: [PATCH] riscv: Add support for XTheadBb in string-fz[a,i].h
Date: Wed, 23 Aug 2023 07:46:28 +0200
Message-ID: <20230823054628.1318615-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

XTheadBb provides instructions similar to those of Zbb, which allow
optimized string processing:
* th.ff0: find-first zero is a CLZ instruction.
* th.tstnbz: Similar to orc.b, but with a bit-inverted result.

The instructions are documented here:
  https://github.com/T-head-Semi/thead-extension-spec/tree/master/xtheadbb

These instructions are available in the T-Head C906 and C910 cores.

Tested with the string tests.

Signed-off-by: Christoph Müllner
---
 sysdeps/riscv/string-fza.h | 7 ++++++-
 sysdeps/riscv/string-fzi.h | 2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/sysdeps/riscv/string-fza.h b/sysdeps/riscv/string-fza.h
index 4429653a00..4958d5d151 100644
--- a/sysdeps/riscv/string-fza.h
+++ b/sysdeps/riscv/string-fza.h
@@ -19,7 +19,7 @@
 #ifndef _RISCV_STRING_FZA_H
 #define _RISCV_STRING_FZA_H 1
 
-#ifdef __riscv_zbb
+#if defined __riscv_zbb || defined __riscv_xtheadbb
 /* With bitmap extension we can use orc.b to find all zero bytes. */
 # include
 # include
@@ -32,8 +32,13 @@ static __always_inline find_t
 find_zero_all (op_t x)
 {
   find_t r;
+#ifdef __riscv_xtheadbb
+  asm ("th.tstnbz %0, %1" : "=r" (r) : "r" (x));
+  return r;
+#else
   asm ("orc.b %0, %1" : "=r" (r) : "r" (x));
   return ~r;
+#endif
 }
 
 /* This function returns 0xff for each byte that is equal between X1 and
diff --git a/sysdeps/riscv/string-fzi.h b/sysdeps/riscv/string-fzi.h
index 8f56c378ff..45d6367a10 100644
--- a/sysdeps/riscv/string-fzi.h
+++ b/sysdeps/riscv/string-fzi.h
@@ -19,7 +19,7 @@
 #ifndef _STRING_RISCV_FZI_H
 #define _STRING_RISCV_FZI_H 1
 
-#ifdef __riscv_zbb
+#if defined __riscv_zbb || defined __riscv_xtheadbb
 # include
 #else
 /* Without bitmap clz/ctz extensions, it is faster to direct test the bits
-- 
2.41.0
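
For reviewers without XTheadBb hardware at hand, the relationship between
the two code paths in find_zero_all can be sketched in portable C. This is
only an illustration and not part of the patch: the model_* and
find_zero_all_* names are hypothetical, the word size is assumed to be
64 bits, and the byte-wise loops merely model what orc.b and th.tstnbz are
described to compute (0xff per non-zero byte, and 0xff per zero byte,
respectively).

```c
#include <stdint.h>

/* Hypothetical software model of Zbb's orc.b on a 64-bit word:
   each result byte is 0xff if the source byte is non-zero, else 0x00.  */
static uint64_t
model_orc_b (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    if ((x >> (8 * i)) & 0xff)
      r |= (uint64_t) 0xff << (8 * i);
  return r;
}

/* Hypothetical software model of XTheadBb's th.tstnbz:
   each result byte is 0xff if the source byte is zero, else 0x00.  */
static uint64_t
model_th_tstnbz (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    if (((x >> (8 * i)) & 0xff) == 0)
      r |= (uint64_t) 0xff << (8 * i);
  return r;
}

/* Zbb path: orc.b marks non-zero bytes, so the result is inverted.  */
static uint64_t
find_zero_all_zbb (uint64_t x)
{
  return ~model_orc_b (x);
}

/* XTheadBb path: th.tstnbz already marks zero bytes, no inversion.  */
static uint64_t
find_zero_all_xtheadbb (uint64_t x)
{
  return model_th_tstnbz (x);
}
```

Under this model both variants agree for every input; for example, the
word 0x1122003344005566 yields 0x0000ff0000ff0000 on either path, which is
why the XTheadBb branch can return r directly while the Zbb branch returns
~r.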