* [PATCH 1/2] LoongArch: config: Added HAVE_LOONGARCH_VEC_ASM.
From: caiyinyu @ 2023-07-10 9:41 UTC
To: libc-alpha; +Cc: adhemerval.zanella, xry111, caiyinyu
This patch checks whether the assembler supports the LSX/LASX vector
instructions and, if it does, defines the HAVE_LOONGARCH_VEC_ASM macro to 1.
Support for these instructions was added in binutils-2.41.
See:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=75b2f521b101d974354f6ce9ed7c054d8b2e3b7a
commit 75b2f521b101d974354f6ce9ed7c054d8b2e3b7a
Author: mengqinggang <mengqinggang@loongson.cn>
Date: Thu Jun 22 10:35:28 2023 +0800
LoongArch: gas: Add lsx and lasx instructions support
gas/ChangeLog:
* config/tc-loongarch.c (md_parse_option): Add lsx and lasx option.
(loongarch_after_parse_args): Add lsx and lasx option.
opcodes/ChangeLog:
* loongarch-opc.c (struct loongarch_ase): Add lsx and lasx
instructions.
---
config.h.in | 5 +++++
sysdeps/loongarch/configure | 28 ++++++++++++++++++++++++++++
sysdeps/loongarch/configure.ac | 15 +++++++++++++++
3 files changed, 48 insertions(+)
diff --git a/config.h.in b/config.h.in
index 44a34072a4..0dedc124f7 100644
--- a/config.h.in
+++ b/config.h.in
@@ -141,6 +141,11 @@
/* LOONGARCH floating-point ABI for ld.so. */
#undef LOONGARCH_ABI_FRLEN
+/* Assembler supports LoongArch LASX/LSX vector instructions.
+   This macro becomes obsolete once glibc increases the minimum
+   required version of GNU 'binutils' to 2.41 or later.  */
+#define HAVE_LOONGARCH_VEC_ASM 0
+
/* Linux specific: minimum supported kernel version. */
#undef __LINUX_KERNEL_VERSION
diff --git a/sysdeps/loongarch/configure b/sysdeps/loongarch/configure
index 52bd08a91e..b090e43a24 100644
--- a/sysdeps/loongarch/configure
+++ b/sysdeps/loongarch/configure
@@ -101,3 +101,31 @@ fi
$as_echo "$libc_cv_loongarch_cmodel_medium" >&6; }
config_vars="$config_vars
have-cmodel-medium = $libc_cv_loongarch_cmodel_medium"
+
+# Check if the assembler supports vector instructions.
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for vector support in assembler" >&5
+$as_echo_n "checking for vector support in assembler... " >&6; }
+if ${libc_cv_loongarch_vec_asm+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ cat > conftest.s <<\EOF
+ vld $vr0, $sp, 0
+EOF
+if { ac_try='${CC-cc} -c $CFLAGS conftest.s -o conftest 1>&5'
+ { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+ (eval $ac_try) 2>&5
+ ac_status=$?
+ $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+ test $ac_status = 0; }; }; then
+ libc_cv_loongarch_vec_asm=yes
+else
+ libc_cv_loongarch_vec_asm=no
+fi
+rm -f conftest*
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_cv_loongarch_vec_asm" >&5
+$as_echo "$libc_cv_loongarch_vec_asm" >&6; }
+if test $libc_cv_loongarch_vec_asm = yes; then
+ $as_echo "#define HAVE_LOONGARCH_VEC_ASM 1" >>confdefs.h
+
+fi
diff --git a/sysdeps/loongarch/configure.ac b/sysdeps/loongarch/configure.ac
index cdd95fa512..39efccfd8f 100644
--- a/sysdeps/loongarch/configure.ac
+++ b/sysdeps/loongarch/configure.ac
@@ -62,3 +62,18 @@ AC_CACHE_CHECK(whether $CC supports option -mcmodel=medium,
libc_cv_loongarch_cmodel_medium=no
fi])
LIBC_CONFIG_VAR([have-cmodel-medium], [$libc_cv_loongarch_cmodel_medium])
+
+# Check if the assembler supports vector instructions.
+AC_CACHE_CHECK(for vector support in assembler, libc_cv_loongarch_vec_asm, [dnl
+cat > conftest.s <<\EOF
+ vld $vr0, $sp, 0
+EOF
+if AC_TRY_COMMAND(${CC-cc} -c $CFLAGS conftest.s -o conftest 1>&AS_MESSAGE_LOG_FD); then
+ libc_cv_loongarch_vec_asm=yes
+else
+ libc_cv_loongarch_vec_asm=no
+fi
+rm -f conftest*])
+if test $libc_cv_loongarch_vec_asm = yes; then
+ AC_DEFINE(HAVE_LOONGARCH_VEC_ASM)
+fi
--
2.31.1
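
A note on consuming the macro: config.h.in gives HAVE_LOONGARCH_VEC_ASM a
default value of 0 and configure overrides it with 1, so the macro is always
defined and has to be tested with #if rather than #ifdef. The following is a
minimal sketch of that difference, not code from the patch. The fallback
#define stands in for config.h, and the function names
vec_trampolines_enabled and macro_is_defined are hypothetical.

```c
#include <assert.h>

/* Assumption: stand-in for the value that config.h would provide.  */
#ifndef HAVE_LOONGARCH_VEC_ASM
# define HAVE_LOONGARCH_VEC_ASM 0
#endif

/* The macro is expected to be exactly 0 or 1.  */
static_assert (HAVE_LOONGARCH_VEC_ASM == 0 || HAVE_LOONGARCH_VEC_ASM == 1,
               "HAVE_LOONGARCH_VEC_ASM should be 0 or 1");

/* Hypothetical helper: returns 1 only when the vector code paths
   would be compiled in, i.e. when the macro evaluates to nonzero.  */
int
vec_trampolines_enabled (void)
{
#if HAVE_LOONGARCH_VEC_ASM
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical helper: shows that #ifdef is true even when the
   macro is defined to 0, so it cannot be used as the guard.  */
int
macro_is_defined (void)
{
#ifdef HAVE_LOONGARCH_VEC_ASM
  return 1;
#else
  return 0;
#endif
}
```

Built without -DHAVE_LOONGARCH_VEC_ASM=1, vec_trampolines_enabled () returns
0 while macro_is_defined () still returns 1.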
* [PATCH 2/2] LoongArch: Add vector implementation for _dl_runtime_resolve.
From: caiyinyu @ 2023-07-10 9:41 UTC
To: libc-alpha; +Cc: adhemerval.zanella, xry111, caiyinyu
---
sysdeps/loongarch/dl-machine.h | 13 +-
sysdeps/loongarch/dl-trampoline.S | 84 +++--------
sysdeps/loongarch/dl-trampoline.h | 131 ++++++++++++++++++
sysdeps/loongarch/ldsodefs.h | 1 +
sysdeps/loongarch/sys/asm.h | 2 +
sysdeps/loongarch/sys/regdef.h | 18 +++
.../unix/sysv/linux/loongarch/bits/hwcap.h | 37 +++++
.../unix/sysv/linux/loongarch/cpu-features.h | 29 ++++
8 files changed, 246 insertions(+), 69 deletions(-)
create mode 100644 sysdeps/loongarch/dl-trampoline.h
create mode 100644 sysdeps/unix/sysv/linux/loongarch/bits/hwcap.h
create mode 100644 sysdeps/unix/sysv/linux/loongarch/cpu-features.h
diff --git a/sysdeps/loongarch/dl-machine.h b/sysdeps/loongarch/dl-machine.h
index e217d37c4b..02ce17852c 100644
--- a/sysdeps/loongarch/dl-machine.h
+++ b/sysdeps/loongarch/dl-machine.h
@@ -270,6 +270,10 @@ elf_machine_runtime_setup (struct link_map *l, struct r_scope_elem *scope[],
/* If using PLTs, fill in the first two entries of .got.plt. */
if (l->l_info[DT_JMPREL])
{
+#if HAVE_LOONGARCH_VEC_ASM
+ extern void _dl_runtime_resolve_lasx (void) attribute_hidden;
+ extern void _dl_runtime_resolve_lsx (void) attribute_hidden;
+#endif
extern void _dl_runtime_resolve (void) attribute_hidden;
extern void _dl_runtime_profile (void) attribute_hidden;
@@ -296,7 +300,14 @@ elf_machine_runtime_setup (struct link_map *l, struct r_scope_elem *scope[],
/* This function will get called to fix up the GOT entry
indicated by the offset on the stack, and then jump to
the resolved address. */
- gotplt[0] = (ElfW (Addr)) & _dl_runtime_resolve;
+#if HAVE_LOONGARCH_VEC_ASM
+ if (SUPPORT_LASX)
+ gotplt[0] = (ElfW(Addr)) &_dl_runtime_resolve_lasx;
+ else if (SUPPORT_LSX)
+ gotplt[0] = (ElfW(Addr)) &_dl_runtime_resolve_lsx;
+ else
+#endif
+ gotplt[0] = (ElfW(Addr)) &_dl_runtime_resolve;
}
gotplt[1] = (ElfW (Addr)) l;
}
diff --git a/sysdeps/loongarch/dl-trampoline.S b/sysdeps/loongarch/dl-trampoline.S
index ed9ec0901c..2a561b7136 100644
--- a/sysdeps/loongarch/dl-trampoline.S
+++ b/sysdeps/loongarch/dl-trampoline.S
@@ -19,77 +19,25 @@
#include <sysdep.h>
#include <sys/asm.h>
-#include "dl-link.h"
-
-/* Assembler veneer called from the PLT header code for lazy loading.
- The PLT header passes its own args in t0-t2. */
-#ifdef __loongarch_soft_float
-#define FRAME_SIZE (-((-10 * SZREG) & ALMASK))
-#else
-#define FRAME_SIZE (-((-10 * SZREG - 8 * SZFREG) & ALMASK))
+#if HAVE_LOONGARCH_VEC_ASM
+#define USE_LASX
+#define _dl_runtime_resolve _dl_runtime_resolve_lasx
+#include "dl-trampoline.h"
+#undef FRAME_SIZE
+#undef USE_LASX
+#undef _dl_runtime_resolve
+
+#define USE_LSX
+#define _dl_runtime_resolve _dl_runtime_resolve_lsx
+#include "dl-trampoline.h"
+#undef FRAME_SIZE
+#undef USE_LSX
+#undef _dl_runtime_resolve
#endif
-ENTRY (_dl_runtime_resolve)
-
- /* Save arguments to stack. */
- ADDI sp, sp, -FRAME_SIZE
- REG_S ra, sp, 9*SZREG
- REG_S a0, sp, 1*SZREG
- REG_S a1, sp, 2*SZREG
- REG_S a2, sp, 3*SZREG
- REG_S a3, sp, 4*SZREG
- REG_S a4, sp, 5*SZREG
- REG_S a5, sp, 6*SZREG
- REG_S a6, sp, 7*SZREG
- REG_S a7, sp, 8*SZREG
-
-#ifndef __loongarch_soft_float
- FREG_S fa0, sp, 10*SZREG + 0*SZFREG
- FREG_S fa1, sp, 10*SZREG + 1*SZFREG
- FREG_S fa2, sp, 10*SZREG + 2*SZFREG
- FREG_S fa3, sp, 10*SZREG + 3*SZFREG
- FREG_S fa4, sp, 10*SZREG + 4*SZFREG
- FREG_S fa5, sp, 10*SZREG + 5*SZFREG
- FREG_S fa6, sp, 10*SZREG + 6*SZFREG
- FREG_S fa7, sp, 10*SZREG + 7*SZFREG
-#endif
-
- /* Update .got.plt and obtain runtime address of callee */
- SLLI a1, t1, 1
- or a0, t0, zero
- ADD a1, a1, t1
- la a2, _dl_fixup
- jirl ra, a2, 0
- or t1, v0, zero
-
- /* Restore arguments from stack. */
- REG_L ra, sp, 9*SZREG
- REG_L a0, sp, 1*SZREG
- REG_L a1, sp, 2*SZREG
- REG_L a2, sp, 3*SZREG
- REG_L a3, sp, 4*SZREG
- REG_L a4, sp, 5*SZREG
- REG_L a5, sp, 6*SZREG
- REG_L a6, sp, 7*SZREG
- REG_L a7, sp, 8*SZREG
-
-#ifndef __loongarch_soft_float
- FREG_L fa0, sp, 10*SZREG + 0*SZFREG
- FREG_L fa1, sp, 10*SZREG + 1*SZFREG
- FREG_L fa2, sp, 10*SZREG + 2*SZFREG
- FREG_L fa3, sp, 10*SZREG + 3*SZFREG
- FREG_L fa4, sp, 10*SZREG + 4*SZFREG
- FREG_L fa5, sp, 10*SZREG + 5*SZFREG
- FREG_L fa6, sp, 10*SZREG + 6*SZFREG
- FREG_L fa7, sp, 10*SZREG + 7*SZFREG
-#endif
-
- ADDI sp, sp, FRAME_SIZE
-
- /* Invoke the callee. */
- jirl zero, t1, 0
-END (_dl_runtime_resolve)
+#include "dl-trampoline.h"
+#include "dl-link.h"
ENTRY (_dl_runtime_profile)
/* LoongArch we get called with:
diff --git a/sysdeps/loongarch/dl-trampoline.h b/sysdeps/loongarch/dl-trampoline.h
new file mode 100644
index 0000000000..d2833488df
--- /dev/null
+++ b/sysdeps/loongarch/dl-trampoline.h
@@ -0,0 +1,131 @@
+/* PLT trampolines.
+ Copyright (C) 2022-2023 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library. If not, see
+ <https://www.gnu.org/licenses/>. */
+
+/* Assembler veneer called from the PLT header code for lazy loading.
+ The PLT header passes its own args in t0-t2. */
+#ifndef __loongarch_soft_float
+# ifdef USE_LASX
+# define FRAME_SIZE (-((-9 * SZREG - 8 * SZFREG - 8 * SZXREG) & ALMASK))
+# elif defined USE_LSX
+# define FRAME_SIZE (-((-9 * SZREG - 8 * SZFREG - 8 * SZVREG) & ALMASK))
+# else
+# define FRAME_SIZE (-((-9 * SZREG - 8 * SZFREG) & ALMASK))
+# endif
+#else
+# define FRAME_SIZE (-((-9 * SZREG) & ALMASK))
+#endif
+
+ENTRY (_dl_runtime_resolve)
+
+ /* Save arguments to stack. */
+ ADDI sp, sp, -FRAME_SIZE
+
+ REG_S ra, sp, 0*SZREG
+ REG_S a0, sp, 1*SZREG
+ REG_S a1, sp, 2*SZREG
+ REG_S a2, sp, 3*SZREG
+ REG_S a3, sp, 4*SZREG
+ REG_S a4, sp, 5*SZREG
+ REG_S a5, sp, 6*SZREG
+ REG_S a6, sp, 7*SZREG
+ REG_S a7, sp, 8*SZREG
+
+#ifndef __loongarch_soft_float
+ FREG_S fa0, sp, 9*SZREG + 0*SZFREG
+ FREG_S fa1, sp, 9*SZREG + 1*SZFREG
+ FREG_S fa2, sp, 9*SZREG + 2*SZFREG
+ FREG_S fa3, sp, 9*SZREG + 3*SZFREG
+ FREG_S fa4, sp, 9*SZREG + 4*SZFREG
+ FREG_S fa5, sp, 9*SZREG + 5*SZFREG
+ FREG_S fa6, sp, 9*SZREG + 6*SZFREG
+ FREG_S fa7, sp, 9*SZREG + 7*SZFREG
+#ifdef USE_LASX
+ xvst xr0, sp, 9*SZREG + 8*SZFREG + 0*SZXREG
+ xvst xr1, sp, 9*SZREG + 8*SZFREG + 1*SZXREG
+ xvst xr2, sp, 9*SZREG + 8*SZFREG + 2*SZXREG
+ xvst xr3, sp, 9*SZREG + 8*SZFREG + 3*SZXREG
+ xvst xr4, sp, 9*SZREG + 8*SZFREG + 4*SZXREG
+ xvst xr5, sp, 9*SZREG + 8*SZFREG + 5*SZXREG
+ xvst xr6, sp, 9*SZREG + 8*SZFREG + 6*SZXREG
+ xvst xr7, sp, 9*SZREG + 8*SZFREG + 7*SZXREG
+#elif defined USE_LSX
+ vst vr0, sp, 9*SZREG + 8*SZFREG + 0*SZVREG
+ vst vr1, sp, 9*SZREG + 8*SZFREG + 1*SZVREG
+ vst vr2, sp, 9*SZREG + 8*SZFREG + 2*SZVREG
+ vst vr3, sp, 9*SZREG + 8*SZFREG + 3*SZVREG
+ vst vr4, sp, 9*SZREG + 8*SZFREG + 4*SZVREG
+ vst vr5, sp, 9*SZREG + 8*SZFREG + 5*SZVREG
+ vst vr6, sp, 9*SZREG + 8*SZFREG + 6*SZVREG
+ vst vr7, sp, 9*SZREG + 8*SZFREG + 7*SZVREG
+#endif
+#endif
+
+ /* Update .got.plt and obtain runtime address of callee */
+ SLLI a1, t1, 1
+ or a0, t0, zero
+ ADD a1, a1, t1
+ la a2, _dl_fixup
+ jirl ra, a2, 0
+ or t1, v0, zero
+
+ /* Restore arguments from stack. */
+ REG_L ra, sp, 0*SZREG
+ REG_L a0, sp, 1*SZREG
+ REG_L a1, sp, 2*SZREG
+ REG_L a2, sp, 3*SZREG
+ REG_L a3, sp, 4*SZREG
+ REG_L a4, sp, 5*SZREG
+ REG_L a5, sp, 6*SZREG
+ REG_L a6, sp, 7*SZREG
+ REG_L a7, sp, 8*SZREG
+
+#ifndef __loongarch_soft_float
+ FREG_L fa0, sp, 9*SZREG + 0*SZFREG
+ FREG_L fa1, sp, 9*SZREG + 1*SZFREG
+ FREG_L fa2, sp, 9*SZREG + 2*SZFREG
+ FREG_L fa3, sp, 9*SZREG + 3*SZFREG
+ FREG_L fa4, sp, 9*SZREG + 4*SZFREG
+ FREG_L fa5, sp, 9*SZREG + 5*SZFREG
+ FREG_L fa6, sp, 9*SZREG + 6*SZFREG
+ FREG_L fa7, sp, 9*SZREG + 7*SZFREG
+#ifdef USE_LASX
+ xvld xr0, sp, 9*SZREG + 8*SZFREG + 0*SZXREG
+ xvld xr1, sp, 9*SZREG + 8*SZFREG + 1*SZXREG
+ xvld xr2, sp, 9*SZREG + 8*SZFREG + 2*SZXREG
+ xvld xr3, sp, 9*SZREG + 8*SZFREG + 3*SZXREG
+ xvld xr4, sp, 9*SZREG + 8*SZFREG + 4*SZXREG
+ xvld xr5, sp, 9*SZREG + 8*SZFREG + 5*SZXREG
+ xvld xr6, sp, 9*SZREG + 8*SZFREG + 6*SZXREG
+ xvld xr7, sp, 9*SZREG + 8*SZFREG + 7*SZXREG
+#elif defined USE_LSX
+ vld vr0, sp, 9*SZREG + 8*SZFREG + 0*SZVREG
+ vld vr1, sp, 9*SZREG + 8*SZFREG + 1*SZVREG
+ vld vr2, sp, 9*SZREG + 8*SZFREG + 2*SZVREG
+ vld vr3, sp, 9*SZREG + 8*SZFREG + 3*SZVREG
+ vld vr4, sp, 9*SZREG + 8*SZFREG + 4*SZVREG
+ vld vr5, sp, 9*SZREG + 8*SZFREG + 5*SZVREG
+ vld vr6, sp, 9*SZREG + 8*SZFREG + 6*SZVREG
+ vld vr7, sp, 9*SZREG + 8*SZFREG + 7*SZVREG
+#endif
+#endif
+
+ ADDI sp, sp, FRAME_SIZE
+
+ /* Invoke the callee. */
+ jirl zero, t1, 0
+END (_dl_runtime_resolve)
diff --git a/sysdeps/loongarch/ldsodefs.h b/sysdeps/loongarch/ldsodefs.h
index a8ef803aec..3b7c4ab83d 100644
--- a/sysdeps/loongarch/ldsodefs.h
+++ b/sysdeps/loongarch/ldsodefs.h
@@ -20,6 +20,7 @@
#define _LOONGARCH_LDSODEFS_H 1
#include <elf.h>
+#include <cpu-features.h>
struct La_loongarch_regs;
struct La_loongarch_retval;
diff --git a/sysdeps/loongarch/sys/asm.h b/sysdeps/loongarch/sys/asm.h
index 0bb430bb05..d1a279b8fb 100644
--- a/sysdeps/loongarch/sys/asm.h
+++ b/sysdeps/loongarch/sys/asm.h
@@ -25,6 +25,8 @@
/* Macros to handle different pointer/register sizes for 32/64-bit code. */
#define SZREG 8
#define SZFREG 8
+#define SZVREG 16
+#define SZXREG 32
#define REG_L ld.d
#define REG_S st.d
#define SRLI srli.d
diff --git a/sysdeps/loongarch/sys/regdef.h b/sysdeps/loongarch/sys/regdef.h
index 91810f5e8e..5100f36d24 100644
--- a/sysdeps/loongarch/sys/regdef.h
+++ b/sysdeps/loongarch/sys/regdef.h
@@ -90,4 +90,22 @@
#define fs6 $f30
#define fs7 $f31
+#define vr0 $vr0
+#define vr1 $vr1
+#define vr2 $vr2
+#define vr3 $vr3
+#define vr4 $vr4
+#define vr5 $vr5
+#define vr6 $vr6
+#define vr7 $vr7
+
+#define xr0 $xr0
+#define xr1 $xr1
+#define xr2 $xr2
+#define xr3 $xr3
+#define xr4 $xr4
+#define xr5 $xr5
+#define xr6 $xr6
+#define xr7 $xr7
+
#endif /* _SYS_REGDEF_H */
diff --git a/sysdeps/unix/sysv/linux/loongarch/bits/hwcap.h b/sysdeps/unix/sysv/linux/loongarch/bits/hwcap.h
new file mode 100644
index 0000000000..5104b69cbc
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/loongarch/bits/hwcap.h
@@ -0,0 +1,37 @@
+/* Defines for bits in AT_HWCAP. LoongArch64 Linux version.
+ Copyright (C) 2022 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#if !defined (_SYS_AUXV_H)
+# error "Never include <bits/hwcap.h> directly; use <sys/auxv.h> instead."
+#endif
+
+/* The following must match the kernel's <asm/hwcap.h>. */
+/* HWCAP flags */
+#define HWCAP_LOONGARCH_CPUCFG (1 << 0)
+#define HWCAP_LOONGARCH_LAM (1 << 1)
+#define HWCAP_LOONGARCH_UAL (1 << 2)
+#define HWCAP_LOONGARCH_FPU (1 << 3)
+#define HWCAP_LOONGARCH_LSX (1 << 4)
+#define HWCAP_LOONGARCH_LASX (1 << 5)
+#define HWCAP_LOONGARCH_CRC32 (1 << 6)
+#define HWCAP_LOONGARCH_COMPLEX (1 << 7)
+#define HWCAP_LOONGARCH_CRYPTO (1 << 8)
+#define HWCAP_LOONGARCH_LVZ (1 << 9)
+#define HWCAP_LOONGARCH_LBT_X86 (1 << 10)
+#define HWCAP_LOONGARCH_LBT_ARM (1 << 11)
+#define HWCAP_LOONGARCH_LBT_MIPS (1 << 12)
diff --git a/sysdeps/unix/sysv/linux/loongarch/cpu-features.h b/sysdeps/unix/sysv/linux/loongarch/cpu-features.h
new file mode 100644
index 0000000000..e371e13b15
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/loongarch/cpu-features.h
@@ -0,0 +1,29 @@
+/* Initialize CPU feature data. LoongArch64 version.
+ This file is part of the GNU C Library.
+ Copyright (C) 2022 Free Software Foundation, Inc.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#ifndef _CPU_FEATURES_LOONGARCH64_H
+#define _CPU_FEATURES_LOONGARCH64_H
+
+#include <sys/auxv.h>
+
+#define SUPPORT_UAL (GLRO (dl_hwcap) & HWCAP_LOONGARCH_UAL)
+#define SUPPORT_LSX (GLRO (dl_hwcap) & HWCAP_LOONGARCH_LSX)
+#define SUPPORT_LASX (GLRO (dl_hwcap) & HWCAP_LOONGARCH_LASX)
+
+#endif /* _CPU_FEATURES_LOONGARCH64_H */
+
--
2.31.1
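
For readers following the dl-machine.h hunk, the resolver choice can be
sketched as a plain function over the AT_HWCAP bits. This is an illustration
under assumptions, not the glibc code: the two HWCAP values are copied from
the bits/hwcap.h hunk above, while the enum, the function name pick_resolver
and the have_vec_asm parameter (standing in for HAVE_LOONGARCH_VEC_ASM) are
hypothetical.

```c
#include <assert.h>

/* Values copied from the bits/hwcap.h hunk in this patch.  */
#define HWCAP_LOONGARCH_LSX  (1 << 4)
#define HWCAP_LOONGARCH_LASX (1 << 5)

static_assert ((HWCAP_LOONGARCH_LSX & HWCAP_LOONGARCH_LASX) == 0,
               "LSX and LASX must be distinct bits");

/* Hypothetical names for the three trampoline variants.  */
enum resolver
{
  RESOLVE_BASE,   /* _dl_runtime_resolve */
  RESOLVE_LSX,    /* _dl_runtime_resolve_lsx */
  RESOLVE_LASX    /* _dl_runtime_resolve_lasx */
};

/* Mirrors the order of the checks in elf_machine_runtime_setup:
   LASX is preferred over LSX, and both are only considered when
   the assembler could build the vector variants.  */
enum resolver
pick_resolver (unsigned long hwcap, int have_vec_asm)
{
  if (have_vec_asm)
    {
      if (hwcap & HWCAP_LOONGARCH_LASX)
        return RESOLVE_LASX;
      if (hwcap & HWCAP_LOONGARCH_LSX)
        return RESOLVE_LSX;
    }
  return RESOLVE_BASE;
}
```

In a user-space experiment the hwcap argument could come from
getauxval (AT_HWCAP) in <sys/auxv.h>; inside ld.so the patch reads
GLRO (dl_hwcap) instead.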
* Re: [PATCH 2/2] LoongArch: Add vector implementation for _dl_runtime_resolve.
From: Xi Ruoyao @ 2023-07-10 10:11 UTC
To: caiyinyu, libc-alpha; +Cc: adhemerval.zanella
On Mon, 2023-07-10 at 17:41 +0800, caiyinyu wrote:
> +#ifndef __loongarch_soft_float
> + FREG_S fa0, sp, 9*SZREG + 0*SZFREG
> + FREG_S fa1, sp, 9*SZREG + 1*SZFREG
> + FREG_S fa2, sp, 9*SZREG + 2*SZFREG
> + FREG_S fa3, sp, 9*SZREG + 3*SZFREG
> + FREG_S fa4, sp, 9*SZREG + 4*SZFREG
> + FREG_S fa5, sp, 9*SZREG + 5*SZFREG
> + FREG_S fa6, sp, 9*SZREG + 6*SZFREG
> + FREG_S fa7, sp, 9*SZREG + 7*SZFREG
fa[x] aliases the lower 64 bits of vr[x] if LSX is available, so we
don't need to save/load fa[x] separately when vr[x] or xr[x] is
saved.
i.e.
#ifdef USE_LASX
xvst ... ...
#elif defined USE_LSX
vst ... ...
#elif !defined __loongarch_soft_float
FREG_S ... ...
#endif
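
To put a number on the suggestion, the frame size can be worked out with the
same rounding idiom as FRAME_SIZE in dl-trampoline.h. This is a rough sketch
under stated assumptions: the SZ* values come from the sys/asm.h hunk, ALMASK
is assumed to be ~15 (16-byte stack alignment, not shown in this thread), and
the "suggested" layout assumes the eight fa slots are simply dropped from the
frame when vector registers are saved. The function names are hypothetical.

```c
#include <assert.h>

/* Sizes from the sys/asm.h hunk of the patch.  */
#define SZREG  8    /* general-purpose register */
#define SZFREG 8    /* floating-point register */
#define SZVREG 16   /* LSX vector register */
#define SZXREG 32   /* LASX vector register */

/* Assumption: 16-byte stack alignment mask.  */
#define ALMASK (~15L)

static_assert (SZXREG == 2 * SZVREG, "LASX registers are twice as wide");

/* Same idiom as FRAME_SIZE: round the payload up to the alignment.  */
long
frame_size (long payload_bytes)
{
  return -((-payload_bytes) & ALMASK);
}

/* Layout as posted: ra and a0-a7, fa0-fa7, then eight vector
   registers of vec_bytes each.  */
long
posted_frame (long vec_bytes)
{
  return frame_size (9 * SZREG + 8 * SZFREG + 8 * vec_bytes);
}

/* Layout as suggested (assumed): no separate fa0-fa7 slots when
   the vector registers are saved.  */
long
suggested_frame (long vec_bytes)
{
  return frame_size (9 * SZREG + 8 * vec_bytes);
}
```

With these assumptions the LASX frame goes from 400 to 336 bytes and the LSX
frame from 272 to 208 bytes, i.e. 64 bytes and eight store/load pairs saved
per variant.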
> +#ifdef USE_LASX
> + xvst xr0, sp, 9*SZREG + 8*SZFREG + 0*SZXREG
> + xvst xr1, sp, 9*SZREG + 8*SZFREG + 1*SZXREG
> + xvst xr2, sp, 9*SZREG + 8*SZFREG + 2*SZXREG
> + xvst xr3, sp, 9*SZREG + 8*SZFREG + 3*SZXREG
> + xvst xr4, sp, 9*SZREG + 8*SZFREG + 4*SZXREG
> + xvst xr5, sp, 9*SZREG + 8*SZFREG + 5*SZXREG
> + xvst xr6, sp, 9*SZREG + 8*SZFREG + 6*SZXREG
> + xvst xr7, sp, 9*SZREG + 8*SZFREG + 7*SZXREG
> +#elif defined USE_LSX
> + vst vr0, sp, 9*SZREG + 8*SZFREG + 0*SZVREG
> + vst vr1, sp, 9*SZREG + 8*SZFREG + 1*SZVREG
> + vst vr2, sp, 9*SZREG + 8*SZFREG + 2*SZVREG
> + vst vr3, sp, 9*SZREG + 8*SZFREG + 3*SZVREG
> + vst vr4, sp, 9*SZREG + 8*SZFREG + 4*SZVREG
> + vst vr5, sp, 9*SZREG + 8*SZFREG + 5*SZVREG
> + vst vr6, sp, 9*SZREG + 8*SZFREG + 6*SZVREG
> + vst vr7, sp, 9*SZREG + 8*SZFREG + 7*SZVREG
> +#endif
> +#endif
--
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University
* Re: [PATCH 2/2] LoongArch: Add vector implementation for _dl_runtime_resolve.
From: caiyinyu @ 2023-07-11 1:54 UTC
To: Xi Ruoyao, libc-alpha; +Cc: adhemerval.zanella
On 2023/7/10 at 6:11 PM, Xi Ruoyao wrote:
> On Mon, 2023-07-10 at 17:41 +0800, caiyinyu wrote:
>> +#ifndef __loongarch_soft_float
>> + FREG_S fa0, sp, 9*SZREG + 0*SZFREG
>> + FREG_S fa1, sp, 9*SZREG + 1*SZFREG
>> + FREG_S fa2, sp, 9*SZREG + 2*SZFREG
>> + FREG_S fa3, sp, 9*SZREG + 3*SZFREG
>> + FREG_S fa4, sp, 9*SZREG + 4*SZFREG
>> + FREG_S fa5, sp, 9*SZREG + 5*SZFREG
>> + FREG_S fa6, sp, 9*SZREG + 6*SZFREG
>> + FREG_S fa7, sp, 9*SZREG + 7*SZFREG
> fa[x] aliases the lower 64 bits of vr[x] if LSX is available, so we
> don't need to save/load fa[x] separately when vr[x] or xr[x] is
> saved.
>
> i.e.
>
> #ifdef USE_LASX
> xvst ... ...
> #elif defined USE_LSX
> vst ... ...
> #elif !defined __loongarch_soft_float
> FREG_S ... ...
> #endif
OK, fixed, thanks!
>
>> +#ifdef USE_LASX
>> + xvst xr0, sp, 9*SZREG + 8*SZFREG + 0*SZXREG
>> + xvst xr1, sp, 9*SZREG + 8*SZFREG + 1*SZXREG
>> + xvst xr2, sp, 9*SZREG + 8*SZFREG + 2*SZXREG
>> + xvst xr3, sp, 9*SZREG + 8*SZFREG + 3*SZXREG
>> + xvst xr4, sp, 9*SZREG + 8*SZFREG + 4*SZXREG
>> + xvst xr5, sp, 9*SZREG + 8*SZFREG + 5*SZXREG
>> + xvst xr6, sp, 9*SZREG + 8*SZFREG + 6*SZXREG
>> + xvst xr7, sp, 9*SZREG + 8*SZFREG + 7*SZXREG
>> +#elif defined USE_LSX
>> + vst vr0, sp, 9*SZREG + 8*SZFREG + 0*SZVREG
>> + vst vr1, sp, 9*SZREG + 8*SZFREG + 1*SZVREG
>> + vst vr2, sp, 9*SZREG + 8*SZFREG + 2*SZVREG
>> + vst vr3, sp, 9*SZREG + 8*SZFREG + 3*SZVREG
>> + vst vr4, sp, 9*SZREG + 8*SZFREG + 4*SZVREG
>> + vst vr5, sp, 9*SZREG + 8*SZFREG + 5*SZVREG
>> + vst vr6, sp, 9*SZREG + 8*SZFREG + 6*SZVREG
>> + vst vr7, sp, 9*SZREG + 8*SZFREG + 7*SZVREG
>> +#endif
>> +#endif
* Re: [PATCH 2/2] LoongArch: Add vector implementation for _dl_runtime_resolve.
From: caiyinyu @ 2023-07-11 4:46 UTC
To: Xi Ruoyao, libc-alpha; +Cc: adhemerval.zanella
On 2023/7/11 at 9:54 AM, caiyinyu wrote:
>
> On 2023/7/10 at 6:11 PM, Xi Ruoyao wrote:
>> On Mon, 2023-07-10 at 17:41 +0800, caiyinyu wrote:
>>> +#ifndef __loongarch_soft_float
>>> + FREG_S fa0, sp, 9*SZREG + 0*SZFREG
>>> + FREG_S fa1, sp, 9*SZREG + 1*SZFREG
>>> + FREG_S fa2, sp, 9*SZREG + 2*SZFREG
>>> + FREG_S fa3, sp, 9*SZREG + 3*SZFREG
>>> + FREG_S fa4, sp, 9*SZREG + 4*SZFREG
>>> + FREG_S fa5, sp, 9*SZREG + 5*SZFREG
>>> + FREG_S fa6, sp, 9*SZREG + 6*SZFREG
>>> + FREG_S fa7, sp, 9*SZREG + 7*SZFREG
>> fa[x] aliases the lower 64 bits of vr[x] if LSX is available, so we
>> don't need to save/load fa[x] separately when vr[x] or xr[x] is
>> saved.
>>
>> i.e.
>>
>> #ifdef USE_LASX
>> xvst ... ...
>> #elif defined USE_LSX
>> vst ... ...
>> #elif !defined __loongarch_soft_float
>> FREG_S ... ...
>> #endif
>
> OK, fixed, thanks!
>
I made a mistake with the soft-float ABI. See
https://sourceware.org/pipermail/libc-alpha/2023-July/149932.html
>>
>>> +#ifdef USE_LASX
>>> + xvst xr0, sp, 9*SZREG + 8*SZFREG + 0*SZXREG
>>> + xvst xr1, sp, 9*SZREG + 8*SZFREG + 1*SZXREG
>>> + xvst xr2, sp, 9*SZREG + 8*SZFREG + 2*SZXREG
>>> + xvst xr3, sp, 9*SZREG + 8*SZFREG + 3*SZXREG
>>> + xvst xr4, sp, 9*SZREG + 8*SZFREG + 4*SZXREG
>>> + xvst xr5, sp, 9*SZREG + 8*SZFREG + 5*SZXREG
>>> + xvst xr6, sp, 9*SZREG + 8*SZFREG + 6*SZXREG
>>> + xvst xr7, sp, 9*SZREG + 8*SZFREG + 7*SZXREG
>>> +#elif defined USE_LSX
>>> + vst vr0, sp, 9*SZREG + 8*SZFREG + 0*SZVREG
>>> + vst vr1, sp, 9*SZREG + 8*SZFREG + 1*SZVREG
>>> + vst vr2, sp, 9*SZREG + 8*SZFREG + 2*SZVREG
>>> + vst vr3, sp, 9*SZREG + 8*SZFREG + 3*SZVREG
>>> + vst vr4, sp, 9*SZREG + 8*SZFREG + 4*SZVREG
>>> + vst vr5, sp, 9*SZREG + 8*SZFREG + 5*SZVREG
>>> + vst vr6, sp, 9*SZREG + 8*SZFREG + 6*SZVREG
>>> + vst vr7, sp, 9*SZREG + 8*SZFREG + 7*SZVREG
>>> +#endif
>>> +#endif