public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5] riscv: Vectorized mem*/str* function
@ 2023-04-21  7:54 Hau Hsu
  2023-04-21  7:54 ` [PATCH v2 1/5] riscv: Enabling vectorized mem*/str* functions in build time Hau Hsu
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Hau Hsu @ 2023-04-21  7:54 UTC (permalink / raw)
  To: libc-alpha, hongrong.hsu, jerry.shih, nick.knight, kito.cheng
  Cc: greentime.hu, alice.chan, andrew, vincent.chen, hau.hsu

I am submitting version 2 of the patch for adding vectorized mem*/str*
functions for RISC-V. This patch builds upon the previous version (v1)
available at
https://patchwork.sourceware.org/project/glibc/list/?series=17710.

In this version, we have included the __memcmpeq function and set lmul=1
for memcmp, which improves its generality.


Jerry Shih (2):
  riscv: vectorized mem* functions
  riscv: vectorized str* functions

Nick Knight (1):
  riscv: vectorized strchr and strnlen functions

Vincent Chen (1):
  riscv: Enabling vectorized mem*/str* functions in build time

Yun Hsiang (1):
  riscv: add vectorized __memcmpeq

 scripts/build-many-glibcs.py   | 10 ++++
 sysdeps/riscv/preconfigure     | 19 +++++++
 sysdeps/riscv/preconfigure.ac  | 18 +++++++
 sysdeps/riscv/rv32/rvv/Implies |  2 +
 sysdeps/riscv/rv64/rvv/Implies |  2 +
 sysdeps/riscv/rvv/memchr.S     | 63 +++++++++++++++++++++++
 sysdeps/riscv/rvv/memcmp.S     | 71 ++++++++++++++++++++++++++
 sysdeps/riscv/rvv/memcmpeq.S   | 69 +++++++++++++++++++++++++
 sysdeps/riscv/rvv/memcpy.S     | 51 +++++++++++++++++++
 sysdeps/riscv/rvv/memmove.S    | 72 ++++++++++++++++++++++++++
 sysdeps/riscv/rvv/memset.S     | 51 +++++++++++++++++++
 sysdeps/riscv/rvv/strcat.S     | 72 ++++++++++++++++++++++++++
 sysdeps/riscv/rvv/strchr.S     | 53 +++++++++++++++++++
 sysdeps/riscv/rvv/strcmp.S     | 93 ++++++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/strcpy.S     | 56 ++++++++++++++++++++
 sysdeps/riscv/rvv/strlen.S     | 54 ++++++++++++++++++++
 sysdeps/riscv/rvv/strncat.S    | 83 ++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/strncmp.S    | 85 +++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/strncpy.S    | 86 +++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/strnlen.S    | 56 ++++++++++++++++++++
 20 files changed, 1066 insertions(+)
 create mode 100644 sysdeps/riscv/rv32/rvv/Implies
 create mode 100644 sysdeps/riscv/rv64/rvv/Implies
 create mode 100644 sysdeps/riscv/rvv/memchr.S
 create mode 100644 sysdeps/riscv/rvv/memcmp.S
 create mode 100644 sysdeps/riscv/rvv/memcmpeq.S
 create mode 100644 sysdeps/riscv/rvv/memcpy.S
 create mode 100644 sysdeps/riscv/rvv/memmove.S
 create mode 100644 sysdeps/riscv/rvv/memset.S
 create mode 100644 sysdeps/riscv/rvv/strcat.S
 create mode 100644 sysdeps/riscv/rvv/strchr.S
 create mode 100644 sysdeps/riscv/rvv/strcmp.S
 create mode 100644 sysdeps/riscv/rvv/strcpy.S
 create mode 100644 sysdeps/riscv/rvv/strlen.S
 create mode 100644 sysdeps/riscv/rvv/strncat.S
 create mode 100644 sysdeps/riscv/rvv/strncmp.S
 create mode 100644 sysdeps/riscv/rvv/strncpy.S
 create mode 100644 sysdeps/riscv/rvv/strnlen.S

-- 
2.37.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/5] riscv: Enabling vectorized mem*/str* functions in build time
  2023-04-21  7:54 [PATCH v2 0/5] riscv: Vectorized mem*/str* function Hau Hsu
@ 2023-04-21  7:54 ` Hau Hsu
  2023-04-21  7:54 ` [PATCH v2 2/5] riscv: vectorized mem* functions Hau Hsu
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Hau Hsu @ 2023-04-21  7:54 UTC (permalink / raw)
  To: libc-alpha, hongrong.hsu, jerry.shih, nick.knight, kito.cheng
  Cc: greentime.hu, alice.chan, andrew, vincent.chen, hau.hsu

From: Vincent Chen <vincent.chen@sifive.com>

Let the build selects the vectorized mem*/str* functions when it detects
the compiler supports RISC-V V extension and enables it in this build.

We agree that the these vectorized mem*/str* functions should be
selected by IFUNC. Therefore, this patch is intended as a
**temporary solution** to enable reviewers to evaluate the effectiveness
of these vectorized mem*/str* functions.
---
 scripts/build-many-glibcs.py   | 10 ++++++++++
 sysdeps/riscv/preconfigure     | 19 +++++++++++++++++++
 sysdeps/riscv/preconfigure.ac  | 18 ++++++++++++++++++
 sysdeps/riscv/rv32/rvv/Implies |  2 ++
 sysdeps/riscv/rv64/rvv/Implies |  2 ++
 5 files changed, 51 insertions(+)
 create mode 100644 sysdeps/riscv/rv32/rvv/Implies
 create mode 100644 sysdeps/riscv/rv64/rvv/Implies

diff --git a/scripts/build-many-glibcs.py b/scripts/build-many-glibcs.py
index 82f8d97281..2fbb91a028 100755
--- a/scripts/build-many-glibcs.py
+++ b/scripts/build-many-glibcs.py
@@ -381,6 +381,11 @@ class Context(object):
                         variant='rv32imafdc-ilp32d',
                         gcc_cfg=['--with-arch=rv32imafdc', '--with-abi=ilp32d',
                                  '--disable-multilib'])
+        self.add_config(arch='riscv32',
+                        os_name='linux-gnu',
+                        variant='rv32imafdcv-ilp32d',
+                        gcc_cfg=['--with-arch=rv32imafdcv', '--with-abi=ilp32d',
+                                 '--disable-multilib'])
         self.add_config(arch='riscv64',
                         os_name='linux-gnu',
                         variant='rv64imac-lp64',
@@ -396,6 +401,11 @@ class Context(object):
                         variant='rv64imafdc-lp64d',
                         gcc_cfg=['--with-arch=rv64imafdc', '--with-abi=lp64d',
                                  '--disable-multilib'])
+        self.add_config(arch='riscv64',
+                        os_name='linux-gnu',
+                        variant='rv64imafdcv-lp64d',
+                        gcc_cfg=['--with-arch=rv64imafdcv', '--with-abi=lp64d',
+                                 '--disable-multilib'])
         self.add_config(arch='s390x',
                         os_name='linux-gnu',
                         glibcs=[{},
diff --git a/sysdeps/riscv/preconfigure b/sysdeps/riscv/preconfigure
index 4dedf4b0bb..5ddc195b46 100644
--- a/sysdeps/riscv/preconfigure
+++ b/sysdeps/riscv/preconfigure
@@ -7,6 +7,7 @@ riscv*)
     flen=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_flen \(.*\)/\1/p'`
     float_abi=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_float_abi_\([^ ]*\) .*/\1/p'`
     atomic=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_atomic' | cut -d' ' -f2`
+    vector=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_vector' | cut -d' ' -f2`
 
     case "$xlen" in
     64 | 32)
@@ -32,6 +33,24 @@ riscv*)
 	;;
     esac
 
+    case "$vector" in
+    __riscv_vector)
+        case "$flen" in
+        64)
+        float_machine=rvv
+        ;;
+        *)
+        # V 1.0 spec requires both F and D extensions, but this may be an older version. Degrade to scalar only.
+        ;;
+        esac
+    ;;
+    *)
+    ;;
+    esac
+
+    { $as_echo "$as_me:${as_lineno-$LINENO}: vector $vector flen $flen float_machine $float_machine" >&5
+$as_echo "$as_me: vector $vector flen $flen float_machine $float_machine" >&6;}
+
     case "$float_abi" in
     soft)
 	abi_flen=0
diff --git a/sysdeps/riscv/preconfigure.ac b/sysdeps/riscv/preconfigure.ac
index a5c30e0dbf..b6b8bb46e4 100644
--- a/sysdeps/riscv/preconfigure.ac
+++ b/sysdeps/riscv/preconfigure.ac
@@ -7,6 +7,7 @@ riscv*)
     flen=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_flen \(.*\)/\1/p'`
     float_abi=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_float_abi_\([^ ]*\) .*/\1/p'`
     atomic=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_atomic' | cut -d' ' -f2`
+    vector=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_vector' | cut -d' ' -f2`
 
     case "$xlen" in
     64 | 32)
@@ -32,6 +33,23 @@ riscv*)
 	;;
     esac
 
+    case "$vector" in
+    __riscv_vector)
+        case "$flen" in
+        64)
+        float_machine=rvv
+        ;;
+        *)
+        # V 1.0 spec requires both F and D extensions, but this may be an older version. Degrade to scalar only.
+        ;;
+        esac
+    ;;
+    *)
+    ;;
+    esac
+
+    AC_MSG_NOTICE([vector $vector flen $flen float_machine $float_machine])
+
     case "$float_abi" in
     soft)
 	abi_flen=0
diff --git a/sysdeps/riscv/rv32/rvv/Implies b/sysdeps/riscv/rv32/rvv/Implies
new file mode 100644
index 0000000000..25ce1df222
--- /dev/null
+++ b/sysdeps/riscv/rv32/rvv/Implies
@@ -0,0 +1,2 @@
+riscv/rv32/rvd
+riscv/rvv
diff --git a/sysdeps/riscv/rv64/rvv/Implies b/sysdeps/riscv/rv64/rvv/Implies
new file mode 100644
index 0000000000..9993bb30e3
--- /dev/null
+++ b/sysdeps/riscv/rv64/rvv/Implies
@@ -0,0 +1,2 @@
+riscv/rv64/rvd
+riscv/rvv
-- 
2.37.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 2/5] riscv: vectorized mem* functions
  2023-04-21  7:54 [PATCH v2 0/5] riscv: Vectorized mem*/str* function Hau Hsu
  2023-04-21  7:54 ` [PATCH v2 1/5] riscv: Enabling vectorized mem*/str* functions in build time Hau Hsu
@ 2023-04-21  7:54 ` Hau Hsu
  2023-04-21 12:12   ` Adhemerval Zanella Netto
  2023-04-21  7:54 ` [PATCH v2 3/5] riscv: vectorized str* functions Hau Hsu
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Hau Hsu @ 2023-04-21  7:54 UTC (permalink / raw)
  To: libc-alpha, hongrong.hsu, jerry.shih, nick.knight, kito.cheng
  Cc: greentime.hu, alice.chan, andrew, vincent.chen, hau.hsu

From: Jerry Shih <jerry.shih@sifive.com>

This patch proposes implementations of memchr, memcmp, memcpy, memmove,
and memset that leverage the RISC-V V extension (RVV), version 1.0.
These routines assumes VLEN is at least 32 bits, as is required by all
currently defined vector extensions, and they support arbitrarily large
VLEN. All implementations work for both RV32 and RV64 platforms, and
make no assumptions about page size.
---
 sysdeps/riscv/rvv/memchr.S  | 63 +++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/memcmp.S  | 75 +++++++++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/memcpy.S  | 51 +++++++++++++++++++++++++
 sysdeps/riscv/rvv/memmove.S | 72 +++++++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/memset.S  | 51 +++++++++++++++++++++++++
 5 files changed, 312 insertions(+)
 create mode 100644 sysdeps/riscv/rvv/memchr.S
 create mode 100644 sysdeps/riscv/rvv/memcmp.S
 create mode 100644 sysdeps/riscv/rvv/memcpy.S
 create mode 100644 sysdeps/riscv/rvv/memmove.S
 create mode 100644 sysdeps/riscv/rvv/memset.S

diff --git a/sysdeps/riscv/rvv/memchr.S b/sysdeps/riscv/rvv/memchr.S
new file mode 100644
index 0000000000..6981a9f8b0
--- /dev/null
+++ b/sysdeps/riscv/rvv/memchr.S
@@ -0,0 +1,63 @@
+/* RVV versions memchr.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jerry Shih <jerry.shih@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define iResult a0
+
+#define pSrc a0
+#define iValue a1
+#define iNum a2
+
+#define iVL a3
+#define iTemp a4
+
+#define ELEM_LMUL_SETTING m8
+#define vData v0
+#define vMask v8
+
+ENTRY(memchr)
+
+L(loop):
+    vsetvli zero, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8ff.v vData, (pSrc)
+    /* Find the iValue inside the loaded data.  */
+    vmseq.vx vMask, vData, iValue
+    vfirst.m iTemp, vMask
+
+    /* Skip the loop if we find the matched value.  */
+    bgez iTemp, L(found)
+
+    csrr iVL, vl
+    sub iNum, iNum, iVL
+    add pSrc, pSrc, iVL
+
+    bnez iNum, L(loop)
+
+    li iResult, 0
+    ret
+
+L(found):
+    add iResult, pSrc, iTemp
+    ret
+
+END(memchr)
+libc_hidden_builtin_def (memchr)
diff --git a/sysdeps/riscv/rvv/memcmp.S b/sysdeps/riscv/rvv/memcmp.S
new file mode 100644
index 0000000000..b156ec524c
--- /dev/null
+++ b/sysdeps/riscv/rvv/memcmp.S
@@ -0,0 +1,75 @@
+/* RVV versions memcmp.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jerry Shih <jerry.shih@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define iResult a0
+
+#define pSrc1 a0
+#define pSrc2 a1
+#define iNum a2
+
+#define iVL a3
+#define iTemp a4
+#define iTemp1 a5
+#define iTemp2 a6
+
+#define ELEM_LMUL_SETTING m8
+#define vData1 v0
+#define vData2 v8
+#define vMask v16
+
+ENTRY(memcmp)
+
+L(loop):
+    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8.v vData1, (pSrc1)
+    vle8.v vData2, (pSrc2)
+
+    vmsne.vv vMask, vData1, vData2
+    sub iNum, iNum, iVL
+    vfirst.m iTemp, vMask
+
+    /* Skip the loop if we find the different value between pSrc1 and pSrc2.  */
+    bgez iTemp, L(found)
+
+    add pSrc1, pSrc1, iVL
+    add pSrc2, pSrc2, iVL
+
+    bnez iNum, L(loop)
+
+    li iResult, 0
+    ret
+
+L(found):
+    add pSrc1, pSrc1, iTemp
+    add pSrc2, pSrc2, iTemp
+    lbu iTemp1, 0(pSrc1)
+    lbu iTemp2, 0(pSrc2)
+    sub iResult, iTemp1, iTemp2
+    ret
+
+END(memcmp)
+libc_hidden_builtin_def (memcmp)
+weak_alias (memcmp,bcmp)
+strong_alias (memcmp, __memcmpeq)
+libc_hidden_def (__memcmpeq)
+
diff --git a/sysdeps/riscv/rvv/memcpy.S b/sysdeps/riscv/rvv/memcpy.S
new file mode 100644
index 0000000000..de790fbe51
--- /dev/null
+++ b/sysdeps/riscv/rvv/memcpy.S
@@ -0,0 +1,51 @@
+/* RVV versions memcpy.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jerry Shih <jerry.shih@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define pDst a0
+#define pSrc a1
+#define iNum a2
+
+#define iVL a3
+#define pDstPtr a4
+
+#define ELEM_LMUL_SETTING m8
+#define vData v0
+
+ENTRY(memcpy)
+
+    mv pDstPtr, pDst
+
+L(loop):
+    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8.v vData, (pSrc)
+    sub iNum, iNum, iVL
+    add pSrc, pSrc, iVL
+    vse8.v vData, (pDstPtr)
+    add pDstPtr, pDstPtr, iVL
+
+    bnez iNum, L(loop)
+
+    ret
+
+END(memcpy)
+libc_hidden_builtin_def (memcpy)
diff --git a/sysdeps/riscv/rvv/memmove.S b/sysdeps/riscv/rvv/memmove.S
new file mode 100644
index 0000000000..ed12744064
--- /dev/null
+++ b/sysdeps/riscv/rvv/memmove.S
@@ -0,0 +1,72 @@
+/* RVV versions memmove.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jerry Shih <jerry.shih@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define pDst a0
+#define pSrc a1
+#define iNum a2
+
+#define iVL a3
+#define pDstPtr a4
+#define pSrcBackwardPtr a5
+#define pDstBackwardPtr a6
+
+#define ELEM_LMUL_SETTING m8
+#define vData v0
+
+ENTRY(memmove)
+
+    mv pDstPtr, pDst
+
+    /* If pSrc is equal or after pDst, all data in pSrc will be loaded before
+       overwrited for the overlapping case. We could use faster `forward-copy`.  */
+    bgeu pSrc, pDst, L(forward_copy_loop)
+    add pSrcBackwardPtr, pSrc, iNum
+    add pDstBackwardPtr, pDst, iNum
+    /* If pDst inside source data range, we need to use `backward_copy_loop` to
+       handle the overlapping issue.  */
+    bltu pDst, pSrcBackwardPtr, L(backward_copy_loop)
+
+L(forward_copy_loop):
+    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8.v vData, (pSrc)
+    sub iNum, iNum, iVL
+    add pSrc, pSrc, iVL
+    vse8.v vData, (pDstPtr)
+    add pDstPtr, pDstPtr, iVL
+
+    bnez iNum, L(forward_copy_loop)
+    ret
+
+L(backward_copy_loop):
+    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+
+    sub pSrcBackwardPtr, pSrcBackwardPtr, iVL
+    vle8.v vData, (pSrcBackwardPtr)
+    sub iNum, iNum, iVL
+    sub pDstBackwardPtr, pDstBackwardPtr, iVL
+    vse8.v vData, (pDstBackwardPtr)
+    bnez iNum, L(backward_copy_loop)
+    ret
+
+END(memmove)
+libc_hidden_builtin_def (memmove)
diff --git a/sysdeps/riscv/rvv/memset.S b/sysdeps/riscv/rvv/memset.S
new file mode 100644
index 0000000000..3a6c3d0afd
--- /dev/null
+++ b/sysdeps/riscv/rvv/memset.S
@@ -0,0 +1,51 @@
+/* RVV versions memset.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jerry Shih <jerry.shih@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define pDst a0
+#define iValue a1
+#define iNum a2
+
+#define iVL a3
+#define iTemp a4
+#define pDstPtr a5
+
+#define ELEM_LMUL_SETTING m8
+#define vData v0
+
+ENTRY(memset)
+
+    mv pDstPtr, pDst
+
+    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+    vmv.v.x vData, iValue
+
+L(loop):
+    vse8.v vData, (pDstPtr)
+    sub iNum, iNum, iVL
+    add pDstPtr, pDstPtr, iVL
+    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+    bnez iNum, L(loop)
+
+    ret
+
+END(memset)
+libc_hidden_builtin_def (memset)
-- 
2.37.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 3/5] riscv: vectorized str* functions
  2023-04-21  7:54 [PATCH v2 0/5] riscv: Vectorized mem*/str* function Hau Hsu
  2023-04-21  7:54 ` [PATCH v2 1/5] riscv: Enabling vectorized mem*/str* functions in build time Hau Hsu
  2023-04-21  7:54 ` [PATCH v2 2/5] riscv: vectorized mem* functions Hau Hsu
@ 2023-04-21  7:54 ` Hau Hsu
  2023-04-21 12:14   ` Adhemerval Zanella Netto
  2023-04-21  7:54 ` [PATCH v2 4/5] riscv: vectorized strchr and strnlen functions Hau Hsu
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Hau Hsu @ 2023-04-21  7:54 UTC (permalink / raw)
  To: libc-alpha, hongrong.hsu, jerry.shih, nick.knight, kito.cheng
  Cc: greentime.hu, alice.chan, andrew, vincent.chen, hau.hsu

From: Jerry Shih <jerry.shih@sifive.com>

This patch proposes implementations of strcat, strcmp, strcpy, strlen,
strncat, strncmp and strncpy that leverage the RISC-V V extension (RVV),
version 1.0. These routines assumes VLEN is at least 32 bits, as is
required by all currently defined vector extensions, and they support
arbitrarily large VLEN. All implementations work for both RV32 and RV64
platforms, and make no assumptions about page size.
---
 sysdeps/riscv/rvv/strcat.S  | 72 ++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/strcmp.S  | 93 +++++++++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/strcpy.S  | 56 ++++++++++++++++++++++
 sysdeps/riscv/rvv/strlen.S  | 54 +++++++++++++++++++++
 sysdeps/riscv/rvv/strncat.S | 83 +++++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/strncmp.S | 85 +++++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/strncpy.S | 86 ++++++++++++++++++++++++++++++++++
 7 files changed, 529 insertions(+)
 create mode 100644 sysdeps/riscv/rvv/strcat.S
 create mode 100644 sysdeps/riscv/rvv/strcmp.S
 create mode 100644 sysdeps/riscv/rvv/strcpy.S
 create mode 100644 sysdeps/riscv/rvv/strlen.S
 create mode 100644 sysdeps/riscv/rvv/strncat.S
 create mode 100644 sysdeps/riscv/rvv/strncmp.S
 create mode 100644 sysdeps/riscv/rvv/strncpy.S

diff --git a/sysdeps/riscv/rvv/strcat.S b/sysdeps/riscv/rvv/strcat.S
new file mode 100644
index 0000000000..8a7779fd3c
--- /dev/null
+++ b/sysdeps/riscv/rvv/strcat.S
@@ -0,0 +1,72 @@
+/* RVV versions strcat.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jerry Shih <jerry.shih@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define pDst a0
+#define pSrc a1
+#define pDstPtr a2
+
+#define iVL a3
+#define iCurrentVL a4
+#define iActiveElemPos a5
+
+#define ELEM_LMUL_SETTING m1
+#define vMask1 v0
+#define vMask2 v1
+#define vStr1 v8
+#define vStr2 v16
+
+ENTRY(strcat)
+
+    mv pDstPtr, pDst
+
+    /* Perform `strlen(dst)`.  */
+L(strlen_loop):
+    vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8ff.v vStr1, (pDstPtr)
+    vmseq.vx vMask1, vStr1, zero
+    csrr iCurrentVL, vl
+    vfirst.m iActiveElemPos, vMask1
+    add pDstPtr, pDstPtr, iCurrentVL
+    bltz iActiveElemPos, L(strlen_loop)
+
+    sub pDstPtr, pDstPtr, iCurrentVL
+    add pDstPtr, pDstPtr, iActiveElemPos
+
+    /* Perform `strcpy(dst, src)`.  */
+L(strcpy_loop):
+    vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8ff.v vStr1, (pSrc)
+    vmseq.vx vMask2, vStr1, zero
+    csrr iCurrentVL, vl
+    vfirst.m iActiveElemPos, vMask2
+    vmsif.m vMask1, vMask2
+    add pSrc, pSrc, iCurrentVL
+    vse8.v vStr1, (pDstPtr), vMask1.t
+    add pDstPtr, pDstPtr, iCurrentVL
+    bltz iActiveElemPos, L(strcpy_loop)
+
+    ret
+
+END(strcat)
+libc_hidden_builtin_def (strcat)
diff --git a/sysdeps/riscv/rvv/strcmp.S b/sysdeps/riscv/rvv/strcmp.S
new file mode 100644
index 0000000000..c5f525bbe9
--- /dev/null
+++ b/sysdeps/riscv/rvv/strcmp.S
@@ -0,0 +1,93 @@
+// Copyright (c) 2023 SiFive, Inc. -- Proprietary and Confidential All Rights
+// Reserved.
+//
+// NOTICE: All information contained herein is, and remains the property of
+// SiFive, Inc. The intellectual and technical concepts contained herein are
+// proprietary to SiFive, Inc. and may be covered by U.S. and Foreign Patents,
+// patents in process, and are protected by trade secret or copyright law.
+//
+// This work may not be copied, modified, re-published, uploaded, executed, or
+// distributed in any way, in any medium, whether in whole or in part, without
+// prior written permission from SiFive, Inc.
+//
+// The copyright notice above does not evidence any actual or intended
+// publication or disclosure of this source code, which includes information
+// that is confidential and/or proprietary, and is a trade secret, of SiFive,
+// Inc.
+//===----------------------------------------------------------------------===//
+
+// Contributed by: Jerry Shih <jerry.shih@sifive.com>
+
+// Prototype:
+// int strcmp(const char *lhs, const char *rhs)
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define iResult a0
+
+#define pStr1 a0
+#define pStr2 a1
+
+#define iVL a2
+#define iTemp1 a3
+#define iTemp2 a4
+
+#define vStr1 v0
+#define vStr2 v8
+#define vMask1 v16
+#define vMask2 v17
+
+ENTRY(strcmp)
+    // lmul=1
+
+L(Loop):
+    vsetvli iVL, zero, e8, m1, ta, ma
+    vle8ff.v vStr1, (pStr1)
+    // check if vStr1[i] == 0
+    vmseq.vx vMask1, vStr1, zero
+
+    vle8ff.v vStr2, (pStr2)
+    // check if vStr1[i] != vStr2[i]
+    vmsne.vv vMask2, vStr1, vStr2
+
+    // find the index x for vStr1[x]==0
+    vfirst.m iTemp1, vMask1
+    // find the index x for vStr1[x]!=vStr2[x]
+    vfirst.m iTemp2, vMask2
+
+    bgez iTemp1, L(check1)
+    bgez iTemp2, L(check2)
+
+    // get the current vl updated by vle8ff.
+    csrr iVL, vl
+    add pStr1, pStr1, iVL
+    add pStr2, pStr2, iVL
+    j L(Loop)
+
+    // iTemp1>=0
+L(check1):
+    bltz iTemp2, 1f
+    blt iTemp2, iTemp1, L(check2)
+1:
+    // iTemp2<0
+    // iTemp2>=0 && iTemp1<iTemp2
+    add pStr1, pStr1, iTemp1
+    add pStr2, pStr2, iTemp1
+    lbu iTemp1, 0(pStr1)
+    lbu iTemp2, 0(pStr2)
+    sub iResult, iTemp1, iTemp2
+    ret
+
+    // iTemp1<0
+    // iTemp2>=0
+L(check2):
+    add pStr1, pStr1, iTemp2
+    add pStr2, pStr2, iTemp2
+    lbu iTemp1, 0(pStr1)
+    lbu iTemp2, 0(pStr2)
+    sub iResult, iTemp1, iTemp2
+    ret
+
+END(strcmp)
+libc_hidden_builtin_def (strcmp)
diff --git a/sysdeps/riscv/rvv/strcpy.S b/sysdeps/riscv/rvv/strcpy.S
new file mode 100644
index 0000000000..8fb754ee23
--- /dev/null
+++ b/sysdeps/riscv/rvv/strcpy.S
@@ -0,0 +1,56 @@
+/* RVV versions strcpy.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jerry Shih <jerry.shih@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define pDst a0
+#define pSrc a1
+#define pDstPtr a2
+
+#define iVL a3
+#define iCurrentVL a4
+#define iActiveElemPos a5
+
+#define ELEM_LMUL_SETTING m1
+#define vMask1 v0
+#define vMask2 v1
+#define vStr1 v8
+#define vStr2 v16
+
+ENTRY(strcpy)
+
+    mv pDstPtr, pDst
+
+L(strcpy_loop):
+    vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma
+    vle8ff.v vStr1, (pSrc)
+    vmseq.vx vMask2, vStr1, zero
+    csrr iCurrentVL, vl
+    vfirst.m iActiveElemPos, vMask2
+    vmsif.m vMask1, vMask2
+    add pSrc, pSrc, iCurrentVL
+    vse8.v vStr1, (pDstPtr), vMask1.t
+    add pDstPtr, pDstPtr, iCurrentVL
+    bltz iActiveElemPos, L(strcpy_loop)
+
+    ret
+
+END(strcpy)
+libc_hidden_builtin_def (strcpy)
diff --git a/sysdeps/riscv/rvv/strlen.S b/sysdeps/riscv/rvv/strlen.S
new file mode 100644
index 0000000000..eb456b094b
--- /dev/null
+++ b/sysdeps/riscv/rvv/strlen.S
@@ -0,0 +1,54 @@
+/* RVV versions strlen.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jerry Shih <jerry.shih@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define iResult a0
+#define pStr a0
+#define pCopyStr a1
+#define iVL a2
+#define iCurrentVL a2
+#define iEndOffset a3
+
+#define ELEM_LMUL_SETTING m2
+#define vStr v0
+#define vMaskEnd v2
+
+ENTRY(strlen)
+
+    mv pCopyStr, pStr
+L(loop):
+    vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma
+    vle8ff.v vStr, (pCopyStr)
+    csrr iCurrentVL, vl
+    vmseq.vi vMaskEnd, vStr, 0
+    vfirst.m iEndOffset, vMaskEnd
+    add pCopyStr, pCopyStr, iCurrentVL
+    bltz iEndOffset, L(loop)
+
+    add pStr, pStr, iCurrentVL
+    add pCopyStr, pCopyStr, iEndOffset
+    sub iResult, pCopyStr, iResult
+
+    ret
+
+END(strlen)
+
+libc_hidden_builtin_def (strlen)
diff --git a/sysdeps/riscv/rvv/strncat.S b/sysdeps/riscv/rvv/strncat.S
new file mode 100644
index 0000000000..7847c4f008
--- /dev/null
+++ b/sysdeps/riscv/rvv/strncat.S
@@ -0,0 +1,83 @@
+/* RVV versions strncat.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jerry Shih <jerry.shih@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define pDst a0
+#define pSrc a1
+#define iLength a2
+#define pDstPtr a3
+
+#define iVL a4
+#define iCurrentVL a5
+#define iActiveElemPos a6
+
+#define ELEM_LMUL_SETTING m1
+#define vMask1 v0
+#define vMask2 v1
+#define vStr1 v8
+#define vStr2 v16
+
+ENTRY(strncat)
+
+    mv pDstPtr, pDst
+
+    /* the strlen of dst.  */
+L(strlen_loop):
+    vsetvli iVL, zero, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8ff.v vStr1, (pDstPtr)
+    /* find the '\0'.  */
+    vmseq.vx vMask1, vStr1, zero
+    csrr iCurrentVL, vl
+    vfirst.m iActiveElemPos, vMask1
+    add pDstPtr, pDstPtr, iCurrentVL
+    bltz iActiveElemPos, L(strlen_loop)
+
+    sub pDstPtr, pDstPtr, iCurrentVL
+    add pDstPtr, pDstPtr, iActiveElemPos
+
+    /* copy pSrc to pDstPtr.  */
+L(strcpy_loop):
+    vsetvli zero, iLength, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8ff.v vStr1, (pSrc)
+    vmseq.vx vMask2, vStr1, zero
+    csrr iCurrentVL, vl
+    vfirst.m iActiveElemPos, vMask2
+    vmsif.m vMask1, vMask2
+    add pSrc, pSrc, iCurrentVL
+    sub iLength, iLength, iCurrentVL
+    vse8.v vStr1, (pDstPtr), vMask1.t
+    add pDstPtr, pDstPtr, iCurrentVL
+    beqz iLength, L(fill_zero)
+    bltz iActiveElemPos, L(strcpy_loop)
+
+    ret
+
+L(fill_zero):
+    bgez iActiveElemPos, L(fill_zero_end)
+    sb zero, (pDstPtr)
+
+L(fill_zero_end):
+    ret
+
+END(strncat)
+libc_hidden_builtin_def (strncat)
diff --git a/sysdeps/riscv/rvv/strncmp.S b/sysdeps/riscv/rvv/strncmp.S
new file mode 100644
index 0000000000..168dbb07ce
--- /dev/null
+++ b/sysdeps/riscv/rvv/strncmp.S
@@ -0,0 +1,85 @@
+/* RVV versions strncmp.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jerry Shih <jerry.shih@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http:/*www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define iResult a0
+
+#define pStr1 a0
+#define pStr2 a1
+#define iLength a2
+
+#define iVL a3
+#define iTemp1 a4
+#define iTemp2 a5
+
+#define ELEM_LMUL_SETTING m1
+#define vStr1 v0
+#define vStr2 v4
+#define vMask1 v8
+#define vMask2 v9
+
+ENTRY(strncmp)
+
+    beqz iLength, L(zero_length)
+
+L(loop):
+    vsetvli zero, iLength, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8ff.v vStr1, (pStr1)
+    /* vStr1[i] == 0.  */
+    vmseq.vx vMask1, vStr1, zero
+
+    vle8ff.v vStr2, (pStr2)
+    /* vStr1[i] != vStr2[i].  */
+    vmsne.vv vMask2, vStr1, vStr2
+
+    csrr iVL, vl
+
+    /* r = mask1 | mask2
+       We could use vfirst.m to get the first zero char or the
+       first different char between str1 and str2.  */
+    vmor.mm vMask1, vMask1, vMask2
+
+    sub iLength, iLength, iVL
+
+    vfirst.m iTemp1, vMask1
+
+    bgez iTemp1, L(end_loop)
+
+    add pStr1, pStr1, iVL
+    add pStr2, pStr2, iVL
+    bnez iLength, L(loop)
+L(end_loop):
+
+    add pStr1, pStr1, iTemp1
+    add pStr2, pStr2, iTemp1
+    lbu iTemp1, 0(pStr1)
+    lbu iTemp2, 0(pStr2)
+
+    sub iResult, iTemp1, iTemp2
+    ret
+
+L(zero_length):
+    li iResult, 0
+    ret
+
+END(strncmp)
+libc_hidden_builtin_def (strncmp)
diff --git a/sysdeps/riscv/rvv/strncpy.S b/sysdeps/riscv/rvv/strncpy.S
new file mode 100644
index 0000000000..e8d9450448
--- /dev/null
+++ b/sysdeps/riscv/rvv/strncpy.S
@@ -0,0 +1,86 @@
+/* RVV versions strncpy.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jerry Shih <jerry.shih@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define pDst a0
+#define pSrc a1
+#define iLength a2
+#define pDstPtr a3
+
+#define iVL a4
+#define iCurrentVL a5
+#define iActiveElemPos a6
+#define iTemp a7
+
+#define ELEM_LMUL_SETTING m1
+#define vMask1 v0
+#define vMask2 v1
+#define ZERO_FILL_ELEM_LMUL_SETTING m8
+#define vStr1 v8
+#define vStr2 v16
+
+ENTRY(strncpy)
+
+    mv pDstPtr, pDst
+
+    /* Copy pSrc to pDstPtr.  */
+L(strcpy_loop):
+    vsetvli zero, iLength, e8, ELEM_LMUL_SETTING, ta, ma
+    vle8ff.v vStr1, (pSrc)
+    vmseq.vx vMask2, vStr1, zero
+    csrr iCurrentVL, vl
+    vfirst.m iActiveElemPos, vMask2
+    vmsif.m vMask1, vMask2
+    add pSrc, pSrc, iCurrentVL
+    sub iLength, iLength, iCurrentVL
+    vse8.v vStr1, (pDstPtr), vMask1.t
+    add pDstPtr, pDstPtr, iCurrentVL
+    bgez iActiveElemPos, L(fill_zero)
+    bnez iLength, L(strcpy_loop)
+    ret
+
+    /* Fill the tail zero.  */
+L(fill_zero):
+    /* We already copy the `\0` to dst. But we use `vfirst.m` to
+       get the `index` of `\0` position. We need to adjust `-1`
+       to get the correct remaining iLength for zero filling.  */
+    sub iTemp, iCurrentVL, iActiveElemPos
+    addi iTemp, iTemp, -1
+    add iLength, iLength, iTemp
+    /* Have an earily return for `strlen(src) + 1 == count` case.  */
+    bnez iLength, 1f
+    ret
+1:
+    sub pDstPtr, pDstPtr, iTemp
+    vsetvli zero, iLength, e8, ZERO_FILL_ELEM_LMUL_SETTING, ta, ma
+    vmv.v.x vStr2, zero
+
+L(fill_zero_loop):
+    vsetvli iVL, iLength, e8, ZERO_FILL_ELEM_LMUL_SETTING, ta, ma
+    vse8.v vStr2, (pDstPtr)
+    sub iLength, iLength, iVL
+    add pDstPtr, pDstPtr, iVL
+    bnez iLength, L(fill_zero_loop)
+
+    ret
+
+END(strncpy)
+libc_hidden_builtin_def (strncpy)
-- 
2.37.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 4/5] riscv: vectorized strchr and strnlen functions
  2023-04-21  7:54 [PATCH v2 0/5] riscv: Vectorized mem*/str* function Hau Hsu
                   ` (2 preceding siblings ...)
  2023-04-21  7:54 ` [PATCH v2 3/5] riscv: vectorized str* functions Hau Hsu
@ 2023-04-21  7:54 ` Hau Hsu
  2023-04-21  7:54 ` [PATCH v2 5/5] riscv: add vectorized __memcmpeq Hau Hsu
  2023-04-21 12:09 ` [PATCH v2 0/5] riscv: Vectorized mem*/str* function Adhemerval Zanella Netto
  5 siblings, 0 replies; 12+ messages in thread
From: Hau Hsu @ 2023-04-21  7:54 UTC (permalink / raw)
  To: libc-alpha, hongrong.hsu, jerry.shih, nick.knight, kito.cheng
  Cc: greentime.hu, alice.chan, andrew, vincent.chen, hau.hsu

From: Nick Knight <nick.knight@sifive.com>

This patch proposes implementations of strcat, strcmp, strcpy, strlen,
strncat, strncmp and strncpy that leverage the RISC-V V extension (RVV),
version 1.0. These routines assumes VLEN is at least 32 bits, as is
required by all currently defined vector extensions, and they support
arbitrarily large VLEN. All implementations work for both RV32 and RV64
platforms, and make no assumptions about page size.
---
 sysdeps/riscv/rvv/strchr.S  | 53 +++++++++++++++++++++++++++++++++++
 sysdeps/riscv/rvv/strnlen.S | 56 +++++++++++++++++++++++++++++++++++++
 2 files changed, 109 insertions(+)
 create mode 100644 sysdeps/riscv/rvv/strchr.S
 create mode 100644 sysdeps/riscv/rvv/strnlen.S

diff --git a/sysdeps/riscv/rvv/strchr.S b/sysdeps/riscv/rvv/strchr.S
new file mode 100644
index 0000000000..4a660200c3
--- /dev/null
+++ b/sysdeps/riscv/rvv/strchr.S
@@ -0,0 +1,53 @@
+/* RISC-V multiarch strchr, V-extension version.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Nick Knight <nick.knight@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+
+
+
+ENTRY(strchr)
+0:
+    vsetvli t0, zero, e8, m8, ta, ma
+    vle8ff.v v0, (a0)
+    vmseq.vi v8, v0, 0
+    vmseq.vx v9, v0, a1
+    vfirst.m a2, v8 /* first occurrence of \0 */
+    vfirst.m a3, v9 /* first occurrence of ch */
+    addi a4, a3, 1
+    seqz a4, a4
+    sltu a5, a2, a3
+    or a4, a4, a5
+    beqz a4, 1f /* Found ch, not preceded by \0? */
+    li a6, -1
+    csrr a5, vl
+    add a0, a0, a5
+    beq a2, a6, 0b /* Didn't find \0? */
+    li a0, 0
+    ret
+1:
+    add a0, a0, a3
+    ret
+
+END(strchr)
+weak_alias (strchr, index)
+libc_hidden_builtin_def (strchr)
+
+
diff --git a/sysdeps/riscv/rvv/strnlen.S b/sysdeps/riscv/rvv/strnlen.S
new file mode 100644
index 0000000000..c1ce12baa5
--- /dev/null
+++ b/sysdeps/riscv/rvv/strnlen.S
@@ -0,0 +1,56 @@
+/* RVV versions strnlen.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by: Nick Knight <nick.knight@sifive.com>
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+#define pStr a0
+#define pCopyStr a2
+#define iRetValue a0
+#define iMaxlen a1
+#define iCurrentVL a3
+#define iEndOffset a4
+
+#define ELEM_LMUL_SETTING m1
+#define vStr v0
+#define vMaskEnd v8
+
+ENTRY(__strnlen)
+
+    mv pCopyStr, pStr
+    mv iRetValue, iMaxlen
+L(strnlen_loop):
+    beqz iMaxlen, L(end_strnlen_loop)
+    vsetvli zero, iMaxlen, e8, ELEM_LMUL_SETTING, ta, ma
+    vle8ff.v vStr, (pCopyStr)
+    vmseq.vi vMaskEnd, vStr, 0
+    vfirst.m iEndOffset, vMaskEnd /* first occurence of \0 */
+    csrr iCurrentVL, vl
+    add pCopyStr, pCopyStr, iCurrentVL
+    sub iMaxlen, iMaxlen, iCurrentVL
+    bltz iEndOffset, L(strnlen_loop)
+    add iMaxlen, iMaxlen, iCurrentVL
+    sub iRetValue, iRetValue, iMaxlen
+    add iRetValue, iRetValue, iEndOffset
+L(end_strnlen_loop):
+    ret
+END(__strnlen)
+weak_alias (__strnlen, strnlen)
+libc_hidden_builtin_def (strnlen)
+libc_hidden_builtin_def (__strnlen)
-- 
2.37.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 5/5] riscv: add vectorized __memcmpeq
  2023-04-21  7:54 [PATCH v2 0/5] riscv: Vectorized mem*/str* function Hau Hsu
                   ` (3 preceding siblings ...)
  2023-04-21  7:54 ` [PATCH v2 4/5] riscv: vectorized strchr and strnlen functions Hau Hsu
@ 2023-04-21  7:54 ` Hau Hsu
  2023-04-21 12:09 ` [PATCH v2 0/5] riscv: Vectorized mem*/str* function Adhemerval Zanella Netto
  5 siblings, 0 replies; 12+ messages in thread
From: Hau Hsu @ 2023-04-21  7:54 UTC (permalink / raw)
  To: libc-alpha, hongrong.hsu, jerry.shih, nick.knight, kito.cheng
  Cc: greentime.hu, alice.chan, andrew, vincent.chen, hau.hsu, Yun Hsiang

From: Yun Hsiang <yun.hsiang@sifive.com>

This patch proposes implementations of __memcmpeq that leverage the
RISC-V V extension (RVV), version 1.0. These routines assumes VLEN is at
least 32 bits, as is required by all currently defined vector
extensions, and they support arbitrarily large VLEN. All implementations
work for both RV32 and RV64 platforms, and make no assumptions about
page size.
---
 sysdeps/riscv/rvv/memcmp.S   |  4 ---
 sysdeps/riscv/rvv/memcmpeq.S | 69 ++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+), 4 deletions(-)
 create mode 100644 sysdeps/riscv/rvv/memcmpeq.S

diff --git a/sysdeps/riscv/rvv/memcmp.S b/sysdeps/riscv/rvv/memcmp.S
index b156ec524c..74d8361293 100644
--- a/sysdeps/riscv/rvv/memcmp.S
+++ b/sysdeps/riscv/rvv/memcmp.S
@@ -69,7 +69,3 @@ L(found):
 
 END(memcmp)
 libc_hidden_builtin_def (memcmp)
-weak_alias (memcmp,bcmp)
-strong_alias (memcmp, __memcmpeq)
-libc_hidden_def (__memcmpeq)
-
diff --git a/sysdeps/riscv/rvv/memcmpeq.S b/sysdeps/riscv/rvv/memcmpeq.S
new file mode 100644
index 0000000000..302bca6992
--- /dev/null
+++ b/sysdeps/riscv/rvv/memcmpeq.S
@@ -0,0 +1,69 @@
+/* RVV versions memcmp.  RISC-V version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jerry Shih <jerry.shih@sifive.com>,
+                  Yun Hsiang <yun.hsiang@sifive.com>.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <sys/asm.h>
+
+
+#define iResult a0
+
+#define pSrc1 a0
+#define pSrc2 a1
+#define iNum a2
+
+#define iVL a3
+#define iTemp a4
+
+#define ELEM_LMUL_SETTING m1
+#define vData1 v0
+#define vData2 v8
+#define vMask v16
+
+ENTRY(__memcmpeq)
+
+L(loop):
+    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+
+    vle8.v vData1, (pSrc1)
+    vle8.v vData2, (pSrc2)
+
+    vmsne.vv vMask, vData1, vData2
+    sub iNum, iNum, iVL
+    vfirst.m iTemp, vMask
+
+    // Skip the loop if we find the different value between pSrc1 and pSrc2.
+    bgez iTemp, L(found)
+
+    add pSrc1, pSrc1, iVL
+    add pSrc2, pSrc2, iVL
+
+    bnez iNum, L(loop)
+
+    li iResult, 0
+    ret
+
+L(found):
+    mv iResult, iVL
+    ret
+
+END(__memcmpeq)
+
+weak_alias (__memcmpeq, bcmp)
+libc_hidden_def (__memcmpeq)
-- 
2.37.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/5] riscv: Vectorized mem*/str* function
  2023-04-21  7:54 [PATCH v2 0/5] riscv: Vectorized mem*/str* function Hau Hsu
                   ` (4 preceding siblings ...)
  2023-04-21  7:54 ` [PATCH v2 5/5] riscv: add vectorized __memcmpeq Hau Hsu
@ 2023-04-21 12:09 ` Adhemerval Zanella Netto
  2023-04-26  3:11   ` Hau Hsu
  5 siblings, 1 reply; 12+ messages in thread
From: Adhemerval Zanella Netto @ 2023-04-21 12:09 UTC (permalink / raw)
  To: Hau Hsu, libc-alpha, hongrong.hsu, jerry.shih, nick.knight,
	kito.cheng, Jeff Law
  Cc: greentime.hu, alice.chan, andrew, vincent.chen



On 21/04/23 04:54, Hau Hsu via Libc-alpha wrote:
> I am submitting version 2 of the patch for adding vectorized mem*/str*
> functions for RISC-V. This patch builds upon the previous version (v1)
> available at
> https://patchwork.sourceware.org/project/glibc/list/?series=17710.
> 
> In this version, we have included the __memcmpeq function and set lmul=1
> for memcmp, which improves its generality.
> 
>

Is this really the idea for RISCV? Because from last iteration with Jeff Law [1]
I understood that RISCV would not move to start providing ISA variants where to 
enable some optimization you will need to either configure with --with-cpu or 
tune the compiler flags.

To explain it better, what you are trying is follow what powerpc does: it has
sysdeps subfolder, each representing and ISA variant, and you only enables it 
by either forcing on configure or automatically with configure.ac.

Now, for aarch64 you only have one ABI variant and each CPU or ISA optimization
(for instance SVE) is enabled *iff* through iFUNC mechanism.  You also have a
further optimization, that x86_64 and s390 implements, where if you are using
an specific ABI level (say x86_64-v2) you can using this specific ABI level
as the base and only provide ifunc variants fro the ABI level higher than you
have defined (it is really not a big deal, it optimizes the code size a bit,
and some intra libc calls). But it is still implemented through multiarch folder 
mechanism, you don't have any sysdep subfolder.

And that's what I have understood from Jeff's last email, that RISCV will 
eventually sort out his kernel functionality query mechanism (either by hwcap 
or by the new syscall), get in on linux-next or linus tree, and then resume the
work to provide both the unaligned and rvv or whatever other extension you want.

But it is really up to you maintainers, you can mimic the powerpc way to enable 
ifunc, which basically adds a lot of boilerplate to include the arch-specific 
variants. The drawback is now you have another build permutation that you need 
to keep testing (as you did by adding another build-many-glibcs.py entry).

[1] https://sourceware.org/pipermail/libc-alpha/2023-March/146824.html

PS: you seemed to have sent multiple copies of the same patch, I will reply only
the ones linked to this cover letter.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/5] riscv: vectorized mem* functions
  2023-04-21  7:54 ` [PATCH v2 2/5] riscv: vectorized mem* functions Hau Hsu
@ 2023-04-21 12:12   ` Adhemerval Zanella Netto
  0 siblings, 0 replies; 12+ messages in thread
From: Adhemerval Zanella Netto @ 2023-04-21 12:12 UTC (permalink / raw)
  To: Hau Hsu, libc-alpha, hongrong.hsu, jerry.shih, nick.knight, kito.cheng
  Cc: greentime.hu, alice.chan, andrew, vincent.chen



On 21/04/23 04:54, Hau Hsu via Libc-alpha wrote:
> From: Jerry Shih <jerry.shih@sifive.com>
> 
> This patch proposes implementations of memchr, memcmp, memcpy, memmove,
> and memset that leverage the RISC-V V extension (RVV), version 1.0.
> These routines assumes VLEN is at least 32 bits, as is required by all
> currently defined vector extensions, and they support arbitrarily large
> VLEN. All implementations work for both RV32 and RV64 platforms, and
> make no assumptions about page size.

This is not a full review, just some remark skimming through the patch.

> ---
>  sysdeps/riscv/rvv/memchr.S  | 63 +++++++++++++++++++++++++++++++
>  sysdeps/riscv/rvv/memcmp.S  | 75 +++++++++++++++++++++++++++++++++++++
>  sysdeps/riscv/rvv/memcpy.S  | 51 +++++++++++++++++++++++++
>  sysdeps/riscv/rvv/memmove.S | 72 +++++++++++++++++++++++++++++++++++
>  sysdeps/riscv/rvv/memset.S  | 51 +++++++++++++++++++++++++
>  5 files changed, 312 insertions(+)
>  create mode 100644 sysdeps/riscv/rvv/memchr.S
>  create mode 100644 sysdeps/riscv/rvv/memcmp.S
>  create mode 100644 sysdeps/riscv/rvv/memcpy.S
>  create mode 100644 sysdeps/riscv/rvv/memmove.S
>  create mode 100644 sysdeps/riscv/rvv/memset.S
> 
> diff --git a/sysdeps/riscv/rvv/memchr.S b/sysdeps/riscv/rvv/memchr.S
> new file mode 100644
> index 0000000000..6981a9f8b0
> --- /dev/null
> +++ b/sysdeps/riscv/rvv/memchr.S
> @@ -0,0 +1,63 @@
> +/* RVV versions memchr.  RISC-V version.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +   Contributed by Jerry Shih <jerry.shih@sifive.com>.

We don't use 'Contributed by' anymore.

> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +#define iResult a0
> +
> +#define pSrc a0
> +#define iValue a1
> +#define iNum a2
> +
> +#define iVL a3
> +#define iTemp a4
> +
> +#define ELEM_LMUL_SETTING m8
> +#define vData v0
> +#define vMask v8

We avoid to use camelcase, even for assembly implementations.

> +
> +ENTRY(memchr)
> +
> +L(loop):
> +    vsetvli zero, iNum, e8, ELEM_LMUL_SETTING, ta, ma
> +
> +    vle8ff.v vData, (pSrc)
> +    /* Find the iValue inside the loaded data.  */
> +    vmseq.vx vMask, vData, iValue
> +    vfirst.m iTemp, vMask
> +
> +    /* Skip the loop if we find the matched value.  */
> +    bgez iTemp, L(found)
> +
> +    csrr iVL, vl
> +    sub iNum, iNum, iVL
> +    add pSrc, pSrc, iVL
> +
> +    bnez iNum, L(loop)
> +
> +    li iResult, 0
> +    ret
> +
> +L(found):
> +    add iResult, pSrc, iTemp
> +    ret
> +
> +END(memchr)
> +libc_hidden_builtin_def (memchr)
> diff --git a/sysdeps/riscv/rvv/memcmp.S b/sysdeps/riscv/rvv/memcmp.S
> new file mode 100644
> index 0000000000..b156ec524c
> --- /dev/null
> +++ b/sysdeps/riscv/rvv/memcmp.S
> @@ -0,0 +1,75 @@
> +/* RVV versions memcmp.  RISC-V version.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +   Contributed by Jerry Shih <jerry.shih@sifive.com>.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +#define iResult a0
> +
> +#define pSrc1 a0
> +#define pSrc2 a1
> +#define iNum a2
> +
> +#define iVL a3
> +#define iTemp a4
> +#define iTemp1 a5
> +#define iTemp2 a6
> +
> +#define ELEM_LMUL_SETTING m8
> +#define vData1 v0
> +#define vData2 v8
> +#define vMask v16
> +
> +ENTRY(memcmp)
> +
> +L(loop):
> +    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
> +
> +    vle8.v vData1, (pSrc1)
> +    vle8.v vData2, (pSrc2)
> +
> +    vmsne.vv vMask, vData1, vData2
> +    sub iNum, iNum, iVL
> +    vfirst.m iTemp, vMask
> +
> +    /* Skip the loop if we find the different value between pSrc1 and pSrc2.  */
> +    bgez iTemp, L(found)
> +
> +    add pSrc1, pSrc1, iVL
> +    add pSrc2, pSrc2, iVL
> +
> +    bnez iNum, L(loop)
> +
> +    li iResult, 0
> +    ret
> +
> +L(found):
> +    add pSrc1, pSrc1, iTemp
> +    add pSrc2, pSrc2, iTemp
> +    lbu iTemp1, 0(pSrc1)
> +    lbu iTemp2, 0(pSrc2)
> +    sub iResult, iTemp1, iTemp2
> +    ret
> +
> +END(memcmp)
> +libc_hidden_builtin_def (memcmp)
> +weak_alias (memcmp,bcmp)
> +strong_alias (memcmp, __memcmpeq)
> +libc_hidden_def (__memcmpeq)
> +
> diff --git a/sysdeps/riscv/rvv/memcpy.S b/sysdeps/riscv/rvv/memcpy.S
> new file mode 100644
> index 0000000000..de790fbe51
> --- /dev/null
> +++ b/sysdeps/riscv/rvv/memcpy.S
> @@ -0,0 +1,51 @@
> +/* RVV versions memcpy.  RISC-V version.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +   Contributed by Jerry Shih <jerry.shih@sifive.com>.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +#define pDst a0
> +#define pSrc a1
> +#define iNum a2
> +
> +#define iVL a3
> +#define pDstPtr a4
> +
> +#define ELEM_LMUL_SETTING m8
> +#define vData v0
> +
> +ENTRY(memcpy)
> +
> +    mv pDstPtr, pDst
> +
> +L(loop):
> +    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
> +
> +    vle8.v vData, (pSrc)
> +    sub iNum, iNum, iVL
> +    add pSrc, pSrc, iVL
> +    vse8.v vData, (pDstPtr)
> +    add pDstPtr, pDstPtr, iVL
> +
> +    bnez iNum, L(loop)
> +
> +    ret
> +
> +END(memcpy)
> +libc_hidden_builtin_def (memcpy)
> diff --git a/sysdeps/riscv/rvv/memmove.S b/sysdeps/riscv/rvv/memmove.S
> new file mode 100644
> index 0000000000..ed12744064
> --- /dev/null
> +++ b/sysdeps/riscv/rvv/memmove.S
> @@ -0,0 +1,72 @@
> +/* RVV versions memmove.  RISC-V version.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +   Contributed by Jerry Shih <jerry.shih@sifive.com>.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +#define pDst a0
> +#define pSrc a1
> +#define iNum a2
> +
> +#define iVL a3
> +#define pDstPtr a4
> +#define pSrcBackwardPtr a5
> +#define pDstBackwardPtr a6
> +
> +#define ELEM_LMUL_SETTING m8
> +#define vData v0
> +
> +ENTRY(memmove)
> +
> +    mv pDstPtr, pDst
> +
> +    /* If pSrc is equal or after pDst, all data in pSrc will be loaded before
> +       overwrited for the overlapping case. We could use faster `forward-copy`.  */
> +    bgeu pSrc, pDst, L(forward_copy_loop)
> +    add pSrcBackwardPtr, pSrc, iNum
> +    add pDstBackwardPtr, pDst, iNum
> +    /* If pDst inside source data range, we need to use `backward_copy_loop` to
> +       handle the overlapping issue.  */
> +    bltu pDst, pSrcBackwardPtr, L(backward_copy_loop)
> +
> +L(forward_copy_loop):
> +    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
> +
> +    vle8.v vData, (pSrc)
> +    sub iNum, iNum, iVL
> +    add pSrc, pSrc, iVL
> +    vse8.v vData, (pDstPtr)
> +    add pDstPtr, pDstPtr, iVL
> +
> +    bnez iNum, L(forward_copy_loop)
> +    ret
> +
> +L(backward_copy_loop):
> +    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
> +
> +    sub pSrcBackwardPtr, pSrcBackwardPtr, iVL
> +    vle8.v vData, (pSrcBackwardPtr)
> +    sub iNum, iNum, iVL
> +    sub pDstBackwardPtr, pDstBackwardPtr, iVL
> +    vse8.v vData, (pDstBackwardPtr)
> +    bnez iNum, L(backward_copy_loop)
> +    ret
> +
> +END(memmove)
> +libc_hidden_builtin_def (memmove)
> diff --git a/sysdeps/riscv/rvv/memset.S b/sysdeps/riscv/rvv/memset.S
> new file mode 100644
> index 0000000000..3a6c3d0afd
> --- /dev/null
> +++ b/sysdeps/riscv/rvv/memset.S
> @@ -0,0 +1,51 @@
> +/* RVV versions memset.  RISC-V version.
> +   Copyright (C) 2023 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +   Contributed by Jerry Shih <jerry.shih@sifive.com>.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +#define pDst a0
> +#define iValue a1
> +#define iNum a2
> +
> +#define iVL a3
> +#define iTemp a4
> +#define pDstPtr a5
> +
> +#define ELEM_LMUL_SETTING m8
> +#define vData v0
> +
> +ENTRY(memset)
> +
> +    mv pDstPtr, pDst
> +
> +    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
> +    vmv.v.x vData, iValue
> +
> +L(loop):
> +    vse8.v vData, (pDstPtr)
> +    sub iNum, iNum, iVL
> +    add pDstPtr, pDstPtr, iVL
> +    vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
> +    bnez iNum, L(loop)
> +
> +    ret
> +
> +END(memset)
> +libc_hidden_builtin_def (memset)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 3/5] riscv: vectorized str* functions
  2023-04-21  7:54 ` [PATCH v2 3/5] riscv: vectorized str* functions Hau Hsu
@ 2023-04-21 12:14   ` Adhemerval Zanella Netto
  0 siblings, 0 replies; 12+ messages in thread
From: Adhemerval Zanella Netto @ 2023-04-21 12:14 UTC (permalink / raw)
  To: Hau Hsu, libc-alpha, hongrong.hsu, jerry.shih, nick.knight, kito.cheng
  Cc: greentime.hu, alice.chan, andrew, vincent.chen



On 21/04/23 04:54, Hau Hsu via Libc-alpha wrote:
> diff --git a/sysdeps/riscv/rvv/strcmp.S b/sysdeps/riscv/rvv/strcmp.S
> new file mode 100644
> index 0000000000..c5f525bbe9
> --- /dev/null
> +++ b/sysdeps/riscv/rvv/strcmp.S
> @@ -0,0 +1,93 @@
> +// Copyright (c) 2023 SiFive, Inc. -- Proprietary and Confidential All Rights
> +// Reserved.
> +//

This is not acceptable by glibc, it requires to follow the 'Copyright and license'
as decribed by [1].

Also, no C99 one line comment.

[1] https://sourceware.org/glibc/wiki/Contribution%20checklist#Copyright_and_license


> +// NOTICE: All information contained herein is, and remains the property of
> +// SiFive, Inc. The intellectual and technical concepts contained herein are
> +// proprietary to SiFive, Inc. and may be covered by U.S. and Foreign Patents,
> +// patents in process, and are protected by trade secret or copyright law.
> +//
> +// This work may not be copied, modified, re-published, uploaded, executed, or
> +// distributed in any way, in any medium, whether in whole or in part, without
> +// prior written permission from SiFive, Inc.
> +//
> +// The copyright notice above does not evidence any actual or intended
> +// publication or disclosure of this source code, which includes information
> +// that is confidential and/or proprietary, and is a trade secret, of SiFive,
> +// Inc.
> +//===----------------------------------------------------------------------===//
> +
> +// Contributed by: Jerry Shih <jerry.shih@sifive.com>
> +
> +// Prototype:
> +// int strcmp(const char *lhs, const char *rhs)
> +
> +#include <sysdep.h>
> +#include <sys/asm.h>
> +
> +#define iResult a0
> +
> +#define pStr1 a0
> +#define pStr2 a1
> +
> +#define iVL a2
> +#define iTemp1 a3
> +#define iTemp2 a4
> +
> +#define vStr1 v0
> +#define vStr2 v8
> +#define vMask1 v16
> +#define vMask2 v17
> +
> +ENTRY(strcmp)
> +    // lmul=1
> +
> +L(Loop):
> +    vsetvli iVL, zero, e8, m1, ta, ma
> +    vle8ff.v vStr1, (pStr1)
> +    // check if vStr1[i] == 0
> +    vmseq.vx vMask1, vStr1, zero
> +
> +    vle8ff.v vStr2, (pStr2)
> +    // check if vStr1[i] != vStr2[i]
> +    vmsne.vv vMask2, vStr1, vStr2
> +
> +    // find the index x for vStr1[x]==0
> +    vfirst.m iTemp1, vMask1
> +    // find the index x for vStr1[x]!=vStr2[x]
> +    vfirst.m iTemp2, vMask2
> +
> +    bgez iTemp1, L(check1)
> +    bgez iTemp2, L(check2)
> +
> +    // get the current vl updated by vle8ff.
> +    csrr iVL, vl
> +    add pStr1, pStr1, iVL
> +    add pStr2, pStr2, iVL
> +    j L(Loop)
> +
> +    // iTemp1>=0
> +L(check1):
> +    bltz iTemp2, 1f
> +    blt iTemp2, iTemp1, L(check2)
> +1:
> +    // iTemp2<0
> +    // iTemp2>=0 && iTemp1<iTemp2
> +    add pStr1, pStr1, iTemp1
> +    add pStr2, pStr2, iTemp1
> +    lbu iTemp1, 0(pStr1)
> +    lbu iTemp2, 0(pStr2)
> +    sub iResult, iTemp1, iTemp2
> +    ret
> +
> +    // iTemp1<0
> +    // iTemp2>=0
> +L(check2):
> +    add pStr1, pStr1, iTemp2
> +    add pStr2, pStr2, iTemp2
> +    lbu iTemp1, 0(pStr1)
> +    lbu iTemp2, 0(pStr2)
> +    sub iResult, iTemp1, iTemp2
> +    ret
> +
> +END(strcmp)
> +libc_hidden_builtin_def (strcmp)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/5] riscv: Vectorized mem*/str* function
  2023-04-21 12:09 ` [PATCH v2 0/5] riscv: Vectorized mem*/str* function Adhemerval Zanella Netto
@ 2023-04-26  3:11   ` Hau Hsu
  0 siblings, 0 replies; 12+ messages in thread
From: Hau Hsu @ 2023-04-26  3:11 UTC (permalink / raw)
  To: Adhemerval Zanella Netto
  Cc: libc-alpha, hongrong.hsu, jerry.shih, nick.knight, Kito Cheng,
	Jeff Law, Greentime Hu, Alice Chan, andrew, vincent.chen,
	Yi-Hsiu Hsu

[-- Attachment #1: Type: text/plain, Size: 3397 bytes --]

Hi Adhemerval,

Thanks for the comment.
The patchset is mainly for the providing a default RVV implementation.
We know that the mechanism to choose ISA variant is not determined yet.
The first patch is a workaround to build Glibc, but won't be the final version.
This decouples the how Glibc get RISC-V hardware information and the RVV function implementation.
As the final decision has been made, we will send another patchset to use that mechanism, 
with the RVV function implementation all together as the version to merge.

I'll send another patchset to fix other obvious mistakes base on your review.
Sorry for sending multiple copies of the same patches. 
I am not familiar with the system and had some SMTP config errors.

Thank you!

Best,

Hau Hsu
Software Engineer
hau.hsu@sifive.com

CC Yi-Hsiu Hsu



> Adhemerval Zanella Netto <adhemerval.zanella@linaro.org> 於 2023年4月21日 下午8:09 寫道:
> 
> 
> 
> On 21/04/23 04:54, Hau Hsu via Libc-alpha wrote:
>> I am submitting version 2 of the patch for adding vectorized mem*/str*
>> functions for RISC-V. This patch builds upon the previous version (v1)
>> available at
>> https://patchwork.sourceware.org/project/glibc/list/?series=17710.
>> 
>> In this version, we have included the __memcmpeq function and set lmul=1
>> for memcmp, which improves its generality.
>> 
>> 
> 
> Is this really the idea for RISCV? Because from last iteration with Jeff Law [1]
> I understood that RISCV would not move to start providing ISA variants where to 
> enable some optimization you will need to either configure with --with-cpu or 
> tune the compiler flags.
> 
> To explain it better, what you are trying is follow what powerpc does: it has
> sysdeps subfolder, each representing and ISA variant, and you only enables it 
> by either forcing on configure or automatically with configure.ac.
> 
> Now, for aarch64 you only have one ABI variant and each CPU or ISA optimization
> (for instance SVE) is enabled *iff* through iFUNC mechanism.  You also have a
> further optimization, that x86_64 and s390 implements, where if you are using
> an specific ABI level (say x86_64-v2) you can using this specific ABI level
> as the base and only provide ifunc variants fro the ABI level higher than you
> have defined (it is really not a big deal, it optimizes the code size a bit,
> and some intra libc calls). But it is still implemented through multiarch folder 
> mechanism, you don't have any sysdep subfolder.
> 
> And that's what I have understood from Jeff's last email, that RISCV will 
> eventually sort out his kernel functionality query mechanism (either by hwcap 
> or by the new syscall), get in on linux-next or linus tree, and then resume the
> work to provide both the unaligned and rvv or whatever other extension you want.
> 
> But it is really up to you maintainers, you can mimic the powerpc way to enable 
> ifunc, which basically adds a lot of boilerplate to include the arch-specific 
> variants. The drawback is now you have another build permutation that you need 
> to keep testing (as you did by adding another build-many-glibcs.py entry).
> 
> [1] https://sourceware.org/pipermail/libc-alpha/2023-March/146824.html
> 
> PS: you seemed to have sent multiple copies of the same patch, I will reply only
> the ones linked to this cover letter.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/5] riscv: Enabling vectorized mem*/str* functions in build time
@ 2023-04-21  7:29 Hau Hsu
  0 siblings, 0 replies; 12+ messages in thread
From: Hau Hsu @ 2023-04-21  7:29 UTC (permalink / raw)
  To: libc-alpha, hongrong.hsu, jerry.shih, nick.knight, kito.cheng
  Cc: greentime.hu, alice.chan, andrew, vincent.chen, hau.hsu

From: Vincent Chen <vincent.chen@sifive.com>

Let the build selects the vectorized mem*/str* functions when it detects
the compiler supports RISC-V V extension and enables it in this build.

We agree that the these vectorized mem*/str* functions should be
selected by IFUNC. Therefore, this patch is intended as a
**temporary solution** to enable reviewers to evaluate the effectiveness
of these vectorized mem*/str* functions.
---
 scripts/build-many-glibcs.py   | 10 ++++++++++
 sysdeps/riscv/preconfigure     | 19 +++++++++++++++++++
 sysdeps/riscv/preconfigure.ac  | 18 ++++++++++++++++++
 sysdeps/riscv/rv32/rvv/Implies |  2 ++
 sysdeps/riscv/rv64/rvv/Implies |  2 ++
 5 files changed, 51 insertions(+)
 create mode 100644 sysdeps/riscv/rv32/rvv/Implies
 create mode 100644 sysdeps/riscv/rv64/rvv/Implies

diff --git a/scripts/build-many-glibcs.py b/scripts/build-many-glibcs.py
index 82f8d97281..2fbb91a028 100755
--- a/scripts/build-many-glibcs.py
+++ b/scripts/build-many-glibcs.py
@@ -381,6 +381,11 @@ class Context(object):
                         variant='rv32imafdc-ilp32d',
                         gcc_cfg=['--with-arch=rv32imafdc', '--with-abi=ilp32d',
                                  '--disable-multilib'])
+        self.add_config(arch='riscv32',
+                        os_name='linux-gnu',
+                        variant='rv32imafdcv-ilp32d',
+                        gcc_cfg=['--with-arch=rv32imafdcv', '--with-abi=ilp32d',
+                                 '--disable-multilib'])
         self.add_config(arch='riscv64',
                         os_name='linux-gnu',
                         variant='rv64imac-lp64',
@@ -396,6 +401,11 @@ class Context(object):
                         variant='rv64imafdc-lp64d',
                         gcc_cfg=['--with-arch=rv64imafdc', '--with-abi=lp64d',
                                  '--disable-multilib'])
+        self.add_config(arch='riscv64',
+                        os_name='linux-gnu',
+                        variant='rv64imafdcv-lp64d',
+                        gcc_cfg=['--with-arch=rv64imafdcv', '--with-abi=lp64d',
+                                 '--disable-multilib'])
         self.add_config(arch='s390x',
                         os_name='linux-gnu',
                         glibcs=[{},
diff --git a/sysdeps/riscv/preconfigure b/sysdeps/riscv/preconfigure
index 4dedf4b0bb..5ddc195b46 100644
--- a/sysdeps/riscv/preconfigure
+++ b/sysdeps/riscv/preconfigure
@@ -7,6 +7,7 @@ riscv*)
     flen=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_flen \(.*\)/\1/p'`
     float_abi=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_float_abi_\([^ ]*\) .*/\1/p'`
     atomic=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_atomic' | cut -d' ' -f2`
+    vector=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_vector' | cut -d' ' -f2`
 
     case "$xlen" in
     64 | 32)
@@ -32,6 +33,24 @@ riscv*)
 	;;
     esac
 
+    case "$vector" in
+    __riscv_vector)
+        case "$flen" in
+        64)
+        float_machine=rvv
+        ;;
+        *)
+        # V 1.0 spec requires both F and D extensions, but this may be an older version. Degrade to scalar only.
+        ;;
+        esac
+    ;;
+    *)
+    ;;
+    esac
+
+    { $as_echo "$as_me:${as_lineno-$LINENO}: vector $vector flen $flen float_machine $float_machine" >&5
+$as_echo "$as_me: vector $vector flen $flen float_machine $float_machine" >&6;}
+
     case "$float_abi" in
     soft)
 	abi_flen=0
diff --git a/sysdeps/riscv/preconfigure.ac b/sysdeps/riscv/preconfigure.ac
index a5c30e0dbf..b6b8bb46e4 100644
--- a/sysdeps/riscv/preconfigure.ac
+++ b/sysdeps/riscv/preconfigure.ac
@@ -7,6 +7,7 @@ riscv*)
     flen=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_flen \(.*\)/\1/p'`
     float_abi=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_float_abi_\([^ ]*\) .*/\1/p'`
     atomic=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_atomic' | cut -d' ' -f2`
+    vector=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_vector' | cut -d' ' -f2`
 
     case "$xlen" in
     64 | 32)
@@ -32,6 +33,23 @@ riscv*)
 	;;
     esac
 
+    case "$vector" in
+    __riscv_vector)
+        case "$flen" in
+        64)
+        float_machine=rvv
+        ;;
+        *)
+        # V 1.0 spec requires both F and D extensions, but this may be an older version. Degrade to scalar only.
+        ;;
+        esac
+    ;;
+    *)
+    ;;
+    esac
+
+    AC_MSG_NOTICE([vector $vector flen $flen float_machine $float_machine])
+
     case "$float_abi" in
     soft)
 	abi_flen=0
diff --git a/sysdeps/riscv/rv32/rvv/Implies b/sysdeps/riscv/rv32/rvv/Implies
new file mode 100644
index 0000000000..25ce1df222
--- /dev/null
+++ b/sysdeps/riscv/rv32/rvv/Implies
@@ -0,0 +1,2 @@
+riscv/rv32/rvd
+riscv/rvv
diff --git a/sysdeps/riscv/rv64/rvv/Implies b/sysdeps/riscv/rv64/rvv/Implies
new file mode 100644
index 0000000000..9993bb30e3
--- /dev/null
+++ b/sysdeps/riscv/rv64/rvv/Implies
@@ -0,0 +1,2 @@
+riscv/rv64/rvd
+riscv/rvv
-- 
2.37.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/5] riscv: Enabling vectorized mem*/str* functions in build time
       [not found] <20230421072733.14047-1-hau.hsu@sifive.com>
@ 2023-04-21  7:27 ` Hau Hsu
  0 siblings, 0 replies; 12+ messages in thread
From: Hau Hsu @ 2023-04-21  7:27 UTC (permalink / raw)
  To: libc-alpha, hongrong.hsu, jerry.shih, nick.knight, kito.cheng
  Cc: greentime.hu, alice.chan, andrew, vincent.chen, hau.hsu

From: Vincent Chen <vincent.chen@sifive.com>

Let the build selects the vectorized mem*/str* functions when it detects
the compiler supports RISC-V V extension and enables it in this build.

We agree that the these vectorized mem*/str* functions should be
selected by IFUNC. Therefore, this patch is intended as a
**temporary solution** to enable reviewers to evaluate the effectiveness
of these vectorized mem*/str* functions.
---
 scripts/build-many-glibcs.py   | 10 ++++++++++
 sysdeps/riscv/preconfigure     | 19 +++++++++++++++++++
 sysdeps/riscv/preconfigure.ac  | 18 ++++++++++++++++++
 sysdeps/riscv/rv32/rvv/Implies |  2 ++
 sysdeps/riscv/rv64/rvv/Implies |  2 ++
 5 files changed, 51 insertions(+)
 create mode 100644 sysdeps/riscv/rv32/rvv/Implies
 create mode 100644 sysdeps/riscv/rv64/rvv/Implies

diff --git a/scripts/build-many-glibcs.py b/scripts/build-many-glibcs.py
index 82f8d97281..2fbb91a028 100755
--- a/scripts/build-many-glibcs.py
+++ b/scripts/build-many-glibcs.py
@@ -381,6 +381,11 @@ class Context(object):
                         variant='rv32imafdc-ilp32d',
                         gcc_cfg=['--with-arch=rv32imafdc', '--with-abi=ilp32d',
                                  '--disable-multilib'])
+        self.add_config(arch='riscv32',
+                        os_name='linux-gnu',
+                        variant='rv32imafdcv-ilp32d',
+                        gcc_cfg=['--with-arch=rv32imafdcv', '--with-abi=ilp32d',
+                                 '--disable-multilib'])
         self.add_config(arch='riscv64',
                         os_name='linux-gnu',
                         variant='rv64imac-lp64',
@@ -396,6 +401,11 @@ class Context(object):
                         variant='rv64imafdc-lp64d',
                         gcc_cfg=['--with-arch=rv64imafdc', '--with-abi=lp64d',
                                  '--disable-multilib'])
+        self.add_config(arch='riscv64',
+                        os_name='linux-gnu',
+                        variant='rv64imafdcv-lp64d',
+                        gcc_cfg=['--with-arch=rv64imafdcv', '--with-abi=lp64d',
+                                 '--disable-multilib'])
         self.add_config(arch='s390x',
                         os_name='linux-gnu',
                         glibcs=[{},
diff --git a/sysdeps/riscv/preconfigure b/sysdeps/riscv/preconfigure
index 4dedf4b0bb..5ddc195b46 100644
--- a/sysdeps/riscv/preconfigure
+++ b/sysdeps/riscv/preconfigure
@@ -7,6 +7,7 @@ riscv*)
     flen=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_flen \(.*\)/\1/p'`
     float_abi=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_float_abi_\([^ ]*\) .*/\1/p'`
     atomic=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_atomic' | cut -d' ' -f2`
+    vector=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_vector' | cut -d' ' -f2`
 
     case "$xlen" in
     64 | 32)
@@ -32,6 +33,24 @@ riscv*)
 	;;
     esac
 
+    case "$vector" in
+    __riscv_vector)
+        case "$flen" in
+        64)
+        float_machine=rvv
+        ;;
+        *)
+        # V 1.0 spec requires both F and D extensions, but this may be an older version. Degrade to scalar only.
+        ;;
+        esac
+    ;;
+    *)
+    ;;
+    esac
+
+    { $as_echo "$as_me:${as_lineno-$LINENO}: vector $vector flen $flen float_machine $float_machine" >&5
+$as_echo "$as_me: vector $vector flen $flen float_machine $float_machine" >&6;}
+
     case "$float_abi" in
     soft)
 	abi_flen=0
diff --git a/sysdeps/riscv/preconfigure.ac b/sysdeps/riscv/preconfigure.ac
index a5c30e0dbf..b6b8bb46e4 100644
--- a/sysdeps/riscv/preconfigure.ac
+++ b/sysdeps/riscv/preconfigure.ac
@@ -7,6 +7,7 @@ riscv*)
     flen=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_flen \(.*\)/\1/p'`
     float_abi=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | sed -n 's/^#define __riscv_float_abi_\([^ ]*\) .*/\1/p'`
     atomic=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_atomic' | cut -d' ' -f2`
+    vector=`$CC $CFLAGS $CPPFLAGS -E -dM -xc /dev/null | grep '#define __riscv_vector' | cut -d' ' -f2`
 
     case "$xlen" in
     64 | 32)
@@ -32,6 +33,23 @@ riscv*)
 	;;
     esac
 
+    case "$vector" in
+    __riscv_vector)
+        case "$flen" in
+        64)
+        float_machine=rvv
+        ;;
+        *)
+        # V 1.0 spec requires both F and D extensions, but this may be an older version. Degrade to scalar only.
+        ;;
+        esac
+    ;;
+    *)
+    ;;
+    esac
+
+    AC_MSG_NOTICE([vector $vector flen $flen float_machine $float_machine])
+
     case "$float_abi" in
     soft)
 	abi_flen=0
diff --git a/sysdeps/riscv/rv32/rvv/Implies b/sysdeps/riscv/rv32/rvv/Implies
new file mode 100644
index 0000000000..25ce1df222
--- /dev/null
+++ b/sysdeps/riscv/rv32/rvv/Implies
@@ -0,0 +1,2 @@
+riscv/rv32/rvd
+riscv/rvv
diff --git a/sysdeps/riscv/rv64/rvv/Implies b/sysdeps/riscv/rv64/rvv/Implies
new file mode 100644
index 0000000000..9993bb30e3
--- /dev/null
+++ b/sysdeps/riscv/rv64/rvv/Implies
@@ -0,0 +1,2 @@
+riscv/rv64/rvd
+riscv/rvv
-- 
2.37.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-04-26  3:11 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-21  7:54 [PATCH v2 0/5] riscv: Vectorized mem*/str* function Hau Hsu
2023-04-21  7:54 ` [PATCH v2 1/5] riscv: Enabling vectorized mem*/str* functions in build time Hau Hsu
2023-04-21  7:54 ` [PATCH v2 2/5] riscv: vectorized mem* functions Hau Hsu
2023-04-21 12:12   ` Adhemerval Zanella Netto
2023-04-21  7:54 ` [PATCH v2 3/5] riscv: vectorized str* functions Hau Hsu
2023-04-21 12:14   ` Adhemerval Zanella Netto
2023-04-21  7:54 ` [PATCH v2 4/5] riscv: vectorized strchr and strnlen functions Hau Hsu
2023-04-21  7:54 ` [PATCH v2 5/5] riscv: add vectorized __memcmpeq Hau Hsu
2023-04-21 12:09 ` [PATCH v2 0/5] riscv: Vectorized mem*/str* function Adhemerval Zanella Netto
2023-04-26  3:11   ` Hau Hsu
  -- strict thread matches above, loose matches on Subject: below --
2023-04-21  7:29 [PATCH v2 1/5] riscv: Enabling vectorized mem*/str* functions in build time Hau Hsu
     [not found] <20230421072733.14047-1-hau.hsu@sifive.com>
2023-04-21  7:27 ` Hau Hsu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).