From: Adhemerval Zanella Netto
Organization: Linaro
Date: Wed, 1 Feb 2023 14:17:57 -0300
Subject: Re: [PATCH 2/2] riscv: vectorised mem* and str* functions
To: Jeff Law, Sergei Lewis, libc-alpha@sourceware.org, Florian Weimer
References: <20230201095232.15942-1-slewis@rivosinc.com> <20230201095232.15942-2-slewis@rivosinc.com> <972db14d-390f-f79a-bc56-41afce041257@gmail.com>
In-Reply-To: <972db14d-390f-f79a-bc56-41afce041257@gmail.com>

On 01/02/23 12:33, Jeff Law via Libc-alpha wrote:
>
>
> On 2/1/23 02:52, Sergei Lewis wrote:
>> Initial implementations of memchr, memcmp, memcpy, memmove, memset, strchr,
>> strcmp, strcpy, strlen, strncmp, strncpy, strnlen, strrchr, strspn
>> targeting the riscv "V" extension, version 1.0
>>
>> The vectorised implementations assume VLENB
>> of at least 128 and at least 32
>> registers (as mandated by the "V" extension spec). They also assume that
>> VLENB is a power of two which is no larger than the page size, and (as
>> vectorised code in glibc for other platforms does) that it is safe to read
>> past null terminators / buffer ends provided one does not cross a page
>> boundary.
>>
>> Signed-off-by: Sergei Lewis
>> ---
>>   sysdeps/riscv/rv64/rvv/Implies     |   2 +
>>   sysdeps/riscv/rv64/rvv/memchr.S    | 127 +++++++++++++++++++
>>   sysdeps/riscv/rv64/rvv/memcmp.S    |  93 ++++++++++++++
>>   sysdeps/riscv/rv64/rvv/memcpy.S    | 154 +++++++++++++++++++++++
>>   sysdeps/riscv/rv64/rvv/memmove.c   |  22 ++++
>>   sysdeps/riscv/rv64/rvv/memset.S    |  89 ++++++++++++++
>>   sysdeps/riscv/rv64/rvv/strchr.S    |  92 ++++++++++++++
>>   sysdeps/riscv/rv64/rvv/strchrnul.c |  22 ++++
>>   sysdeps/riscv/rv64/rvv/strcmp.S    | 108 +++++++++++++++++
>>   sysdeps/riscv/rv64/rvv/strcpy.S    |  72 +++++++++++
>>   sysdeps/riscv/rv64/rvv/strcspn.c   |  22 ++++
>>   sysdeps/riscv/rv64/rvv/strlen.S    |  67 ++++++++++
>>   sysdeps/riscv/rv64/rvv/strncmp.S   | 104 ++++++++++++++++
>>   sysdeps/riscv/rv64/rvv/strncpy.S   |  96 +++++++++++++++
>>   sysdeps/riscv/rv64/rvv/strnlen.S   |  81 +++++++++++++
>>   sysdeps/riscv/rv64/rvv/strrchr.S   |  88 ++++++++++++++
>>   sysdeps/riscv/rv64/rvv/strspn.S    | 189 +++++++++++++++++++++++++++++
> Does this need to be revamped given the recent push to do more with generic code and target specific hooks for mem* and str*?
>

It should be doable, although I think it might require some more hooks, since afaiu the RISC-V vector instructions do not have the concept of address update in their vector approach (as my generic string routines do by using the 'vector' type op_t). So I am not sure it is a good fit for the RISC-V vector approach.

> Shouldn't the implementations be in a multiarch directory?
> I would fully expect we're going to need both a vector and scalar implementation selected by an ifunc.

If so, we will need both IFUNC support on riscv (which, as Florian pointed out, is not the default) and a way to discover the processor/kernel capabilities at runtime. It does not seem that RISC-V has the expected hwcap support: the UAPI only defines COMPAT_HWCAP_ISA_*, which also does not seem to cover the vector or bitmanip extensions. Does it have an instruction to query such information, something like cpuid (x86) or midr_elX (aarch64)?

It would be interesting to add an ifunc variant for bitmanip as well.

>
> I'm happy to pass along the current bits from VRULL which put that infrastructure in place. I just haven't had the time to look at revamping their assembly implementations for the new generic+hooks scheme.

I just sent an updated version [1] where I added a bitmanip optimization [2]. So now, with a recent gcc (I tested with gcc 13 and upstream qemu), the string routines should use ctz/clz/orc.b. I did not add support for XThead [3], but it should be doable.

Once I get this upstream, I have a WIP to revamp memmove/memcpy/memset/memcmp as well. At least memcmp should use bitmanip as well.

[1] https://patchwork.sourceware.org/project/glibc/list/?series=16622
[2] https://patchwork.sourceware.org/project/glibc/patch/20230201170406.303978-21-adhemerval.zanella@linaro.org/
[3] https://lore.kernel.org/all/20220906122243.1243354-1-christoph.muellner@vrull.eu/
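
To make the ifunc point concrete, here is a rough sketch of the selection step a resolver would perform. Everything named here is hypothetical: SKETCH_HWCAP_ISA_V only mirrors the single-letter convention of COMPAT_HWCAP_ISA_* and is not something the UAPI is known to provide for "V", and the two memcpy variants are plain C stand-ins, not the assembly from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical capability bit for "V", by analogy with the single-letter
   COMPAT_HWCAP_ISA_* bits. An assumption, not a confirmed kernel interface. */
#define SKETCH_HWCAP_ISA_V (1UL << ('V' - 'A'))

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Stand-in for the scalar variant: a plain byte loop. */
static void *memcpy_scalar(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* Stand-in for the vector variant; it only forwards to memcpy here. */
static void *memcpy_rvv(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* The decision an ifunc resolver would make once, at relocation time:
   choose a variant from the capability bits it is handed. */
static memcpy_fn select_memcpy(unsigned long hwcap)
{
    return (hwcap & SKETCH_HWCAP_ISA_V) ? memcpy_rvv : memcpy_scalar;
}
```

On Linux the hwcap argument would presumably come from getauxval(AT_HWCAP), but whether the kernel reports "V" (or bitmanip) that way is exactly the open question above.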
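
For readers not familiar with why ctz helps the string routines, a minimal portable sketch of the idea follows. It is not the code from [1] or [2]: it emulates the "mark the zero bytes, then count trailing zeros" step with the classic has-zero-byte trick instead of orc.b, assumes a little-endian target and the GCC/Clang builtin __builtin_ctzll, and the names first_zero_byte and sketch_strlen are made up for illustration. Like the quoted patch description, it relies on aligned reads past the terminator being safe because an aligned word never crosses a page boundary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/* Index (0..7) of the first zero byte of w in memory order on a
   little-endian machine, or 8 if w has no zero byte. */
static unsigned first_zero_byte(uint64_t w)
{
    /* The lowest set bit of the mask always marks the first zero byte;
       higher bits may be false positives, which ctz never looks at. */
    uint64_t mask = (w - ONES) & ~w & HIGHS;
    return mask ? (unsigned)__builtin_ctzll(mask) / 8 : 8;
}

/* Word-at-a-time strlen sketch. It may read up to 7 bytes past the
   terminator, but only inside the same aligned 8-byte word. */
static size_t sketch_strlen(const char *s)
{
    const char *p = s;

    /* Byte loop until p is 8-byte aligned. */
    while ((uintptr_t)p % sizeof(uint64_t) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    for (;;) {
        uint64_t w;
        memcpy(&w, p, sizeof w);          /* aligned 8-byte load */
        unsigned i = first_zero_byte(w);
        if (i < 8)
            return (size_t)(p - s) + i;
        p += sizeof w;
    }
}
```

With Zbb available, the mask computation would presumably collapse to orc.b plus a not, which is where the benefit comes from.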