From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20230201095232.15942-1-slewis@rivosinc.com> <20230201095232.15942-2-slewis@rivosinc.com> <972db14d-390f-f79a-bc56-41afce041257@gmail.com> <877cx1wd5c.fsf@oldenburg.str.redhat.com>
From: Sergei Lewis
Date: Thu, 2 Feb 2023 09:34:59 +0000
Subject: Re: [PATCH 2/2] riscv: vectorised mem* and str* functions
To: Jeff Law
Cc: Florian Weimer, Jeff Law via Libc-alpha
Content-Type: multipart/alternative; boundary="000000000000f3b06305f3b4474e"

--000000000000f3b06305f3b4474e
Content-Type: text/plain; charset="UTF-8"

Thank you for the feedback!

The idea of this changeset is to provide vectorised implementations for the
specific case where someone opts in by explicitly describing the build target
as supporting the "V" extension via the build environment when building
glibc; where this is not done, existing behaviour is unchanged.

I agree that multiarch/ifuncs absolutely has to be the way going forward.
Most people in the wild will be using prebuilt binaries, and generally these
will be built for the lowest common denominator; we do want to use optimal
code where we can detect the target supports it even if this was not known at
build time. Moreover, we can reasonably support much more specific
optimisation (e.g. specialise on VLENB, determine whether high LMUL or low
LMUL / more unrolls is preferable etc) via ifuncs than at compile time.
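
To make the runtime-selection idea concrete, here is a minimal sketch in
portable C. It is not the ifunc mechanism itself and not code from the patch:
it only illustrates picking an implementation once, at run time, behind a
stable entry point. All names in it (cpu_has_vector, strlen_scalar,
strlen_vector, select_strlen, my_strlen) are hypothetical, and the probe is a
stub because the real kernel/user interface is still undecided.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical probe: stands in for whatever interface eventually reports
   the "V" extension.  It always answers "no" in this sketch.  */
static int cpu_has_vector(void)
{
    return 0;
}

/* Plain byte-at-a-time implementation, usable on any target.  */
static size_t strlen_scalar(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Placeholder for a vectorised routine.  A real one would be written in
   assembly; here it simply defers to the scalar loop.  */
static size_t strlen_vector(const char *s)
{
    return strlen_scalar(s);
}

typedef size_t (*strlen_fn)(const char *);

/* Plays the role of a resolver: choose an implementation from what the
   probe reports.  */
static strlen_fn select_strlen(void)
{
    return cpu_has_vector() ? strlen_vector : strlen_scalar;
}

/* Stable entry point: resolves on first use, then reuses the choice.  */
static size_t my_strlen(const char *s)
{
    static strlen_fn impl;
    if (impl == NULL)
        impl = select_strlen();
    assert(impl != NULL);
    return impl(s);
}
```

With a real ifunc the equivalent of select_strlen would run once in the
dynamic loader when the symbol is resolved, so callers would not pay for the
check on each call; the same resolver could in principle also choose between
variants specialised for different VLENB or LMUL settings.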

Even once the compiler issues are sorted out, coming up with a robust
mechanism for probing the relevant properties of the environment for ifuncs
is a significant engineering challenge on RISC-V, and one that I expect to be
addressing going forward. Once that is in place, it will be possible via
ifuncs to make use of the vectorised implementations even in binaries where
the V extension was not explicitly targeted at compile time, if it is
available at runtime.

Getting the optimised implementations out early for anyone opting in to them
will help anyone else working in this space and anyone looking at benchmarks
on specific targets; it would also get the code in front of more eyeballs
than just mine while I crack on with ifuncs.

Certainly, though, if the ifuncs infrastructure has already been created by
someone else and can be shared, that would be super helpful, and if the
community would prefer I keep these private until that point we can do that.

On Wed, Feb 1, 2023 at 5:07 PM Jeff Law wrote:

>
>
> On 2/1/23 09:42, Florian Weimer wrote:
> > * Jeff Law via Libc-alpha:
> >
> >> On 2/1/23 02:52, Sergei Lewis wrote:
> >>> Initial implementations of memchr, memcmp, memcpy, memmove, memset, strchr,
> >>> strcmp, strcpy, strlen, strncmp, strncpy, strnlen, strrchr, strspn
> >>> targeting the riscv "V" extension, version 1.0
> >>> The vectorised implementations assume VLENB of at least 128 and at
> >>> least 32
> >>> registers (as mandated by the "V" extension spec). They also assume that
> >>> VLENB is a power of two which is no larger than the page size, and (as
> >>> vectorised code in glibc for other platforms does) that it is safe to read
> >>> past null terminators / buffer ends provided one does not cross a page
> >>> boundary.
> >>> Signed-off-by: Sergei Lewis
> >>> ---
> >>>  sysdeps/riscv/rv64/rvv/Implies     |   2 +
> >>>  sysdeps/riscv/rv64/rvv/memchr.S    | 127 +++++++++++++++++++
> >>>  sysdeps/riscv/rv64/rvv/memcmp.S    |  93 ++++++++++++++
> >>>  sysdeps/riscv/rv64/rvv/memcpy.S    | 154 +++++++++++++++++++++++
> >>>  sysdeps/riscv/rv64/rvv/memmove.c   |  22 ++++
> >>>  sysdeps/riscv/rv64/rvv/memset.S    |  89 ++++++++++++++
> >>>  sysdeps/riscv/rv64/rvv/strchr.S    |  92 ++++++++++++++
> >>>  sysdeps/riscv/rv64/rvv/strchrnul.c |  22 ++++
> >>>  sysdeps/riscv/rv64/rvv/strcmp.S    | 108 +++++++++++++++++
> >>>  sysdeps/riscv/rv64/rvv/strcpy.S    |  72 +++++++++++
> >>>  sysdeps/riscv/rv64/rvv/strcspn.c   |  22 ++++
> >>>  sysdeps/riscv/rv64/rvv/strlen.S    |  67 ++++++++++
> >>>  sysdeps/riscv/rv64/rvv/strncmp.S   | 104 ++++++++++++++++
> >>>  sysdeps/riscv/rv64/rvv/strncpy.S   |  96 +++++++++++++++
> >>>  sysdeps/riscv/rv64/rvv/strnlen.S   |  81 +++++++++++++
> >>>  sysdeps/riscv/rv64/rvv/strrchr.S   |  88 ++++++++++++++
> >>>  sysdeps/riscv/rv64/rvv/strspn.S    | 189 +++++++++++++++++++++++++++++
> >> Does this need to be revamped given the recent push to do more with
> >> generic code and target specific hooks for mem* and str*?
> >>
> >> Shouldn't the implementations be in a multiarch directory? I would
> >> fully expect we're going to need both a vector and scalar
> >> implementation selected by an ifunc.
> >
> > I think most RISC-V GCC compilers won't have enabled IFUNC support?
> > Looking at gcc/config.gcc in GCC 12, I see this:
> >
> > *-*-linux* | *-*-gnu*)
> >   case ${target} in
> >   aarch64*-* | arm*-* | i[34567]86-* | powerpc*-* | s390*-* | sparc*-* | x86_64-* | loongarch*-*)
> >     default_gnu_indirect_function=yes
> >     ;;
> >   esac
> >
> > But maybe that's not the right place to look at?
> Clearly something we need to fix.
>
> I'd hesitate to turn on the gcc bits without having the kernel/user
> interface settled.
> There was a proposal that added a syscall to get the
> processor capabilities -- I'd asked the authors to reach out to you and
> Carlos on whether or not that was acceptable for glibc. I'm not sure if
> that happened or not.
>
> >
> > We have an assembler hack to be able to still build IFUNC resolvers
> > written in C, but I don't know if this works on RISC-V.
> It probably doesn't yet.
>
> >
> > Ideally the GCC defaults would change, too, and well before IFUNCs are
> > in common use.
> They're not common, but I suspect that'll change in the next ~6 months.
>
> Jeff
>

--000000000000f3b06305f3b4474e--