* [PATCH 01/12] ia64: Remove bcopy
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 02/12] powerpc: Remove bcopy optimizations Adhemerval Zanella
` (12 subsequent siblings)
13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC
To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
It just calls memmove, as the generic implementation does.
---
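For readers unfamiliar with the interface, a minimal C sketch of the
relationship the commit message relies on: bcopy takes (src, dest, n),
so forwarding to memmove only needs the first two arguments swapped.
The names sketch_bcopy and sketch_bcopy_check are hypothetical and used
only for illustration; this is not the glibc generic implementation
itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name for this sketch only; not a glibc symbol.
   bcopy's argument order is (src, dest, n), the reverse of memmove's
   (dest, src, n), so the wrapper just swaps the first two arguments.  */
static void
sketch_bcopy (const void *src, void *dest, size_t n)
{
  memmove (dest, src, n);
}

/* Checks that an overlapping copy behaves like memmove.  */
static void
sketch_bcopy_check (void)
{
  char buf[] = "abcdef";
  /* Overlapping copy: move "abcd" two bytes to the right.  */
  sketch_bcopy (buf, buf + 2, 4);
  assert (memcmp (buf, "ababcd", 6) == 0);
}
```

The removed ia64 assembly does the same swap in registers before
branching to memmove, which is why dropping it should not change
behavior, assuming the generic bcopy is built in its place.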
sysdeps/ia64/bcopy.S | 10 ----------
1 file changed, 10 deletions(-)
delete mode 100644 sysdeps/ia64/bcopy.S
diff --git a/sysdeps/ia64/bcopy.S b/sysdeps/ia64/bcopy.S
deleted file mode 100644
index bdabf5acdc..0000000000
--- a/sysdeps/ia64/bcopy.S
+++ /dev/null
@@ -1,10 +0,0 @@
-#include <sysdep.h>
-
-ENTRY(bcopy)
- .regstk 3, 0, 0, 0
- mov r8 = in0
- mov in0 = in1
- ;;
- mov in1 = r8
- br.cond.sptk.many HIDDEN_BUILTIN_JUMPTARGET(memmove)
-END(bcopy)
--
2.32.0
* [PATCH 02/12] powerpc: Remove bcopy optimizations
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 01/12] ia64: Remove bcopy Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 03/12] i386: " Adhemerval Zanella
` (11 subsequent siblings)
13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC
To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
The symbol is not present in the current POSIX specification and the
compiler already generates a memmove call.
---
.../powerpc/powerpc64/le/power10/memmove.S | 13 -------
sysdeps/powerpc/powerpc64/multiarch/Makefile | 2 +-
.../powerpc/powerpc64/multiarch/bcopy-ppc64.c | 27 -------------
sysdeps/powerpc/powerpc64/multiarch/bcopy.c | 38 -------------------
.../powerpc64/multiarch/ifunc-impl-list.c | 13 -------
.../powerpc64/multiarch/memmove-power10.S | 3 --
.../powerpc64/multiarch/memmove-power7.S | 3 --
sysdeps/powerpc/powerpc64/power7/bcopy.c | 1 -
sysdeps/powerpc/powerpc64/power7/memmove.S | 14 -------
9 files changed, 1 insertion(+), 113 deletions(-)
delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c
delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bcopy.c
delete mode 100644 sysdeps/powerpc/powerpc64/power7/bcopy.c
diff --git a/sysdeps/powerpc/powerpc64/le/power10/memmove.S b/sysdeps/powerpc/powerpc64/le/power10/memmove.S
index eda86b194e..3024718fdf 100644
--- a/sysdeps/powerpc/powerpc64/le/power10/memmove.S
+++ b/sysdeps/powerpc/powerpc64/le/power10/memmove.S
@@ -305,16 +305,3 @@ L(tail1_bwd):
END_GEN_TB (MEMMOVE,TB_TOCLESS)
libc_hidden_builtin_def (memmove)
-
-/* void bcopy(const void *src [r3], void *dest [r4], size_t n [r5])
- Implemented in this file to avoid linker create a stub function call
- in the branch to '_memmove'. */
-ENTRY_TOCLESS (__bcopy)
- mr r6,r3
- mr r3,r4
- mr r4,r6
- b L(_memmove)
-END (__bcopy)
-#ifndef __bcopy
-weak_alias (__bcopy, bcopy)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index 626845a43c..6f2436b660 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -24,7 +24,7 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \
stpncpy-power8 stpncpy-power7 stpncpy-ppc64 \
strcmp-power8 strcmp-power7 strcmp-ppc64 \
strcat-power8 strcat-power7 strcat-ppc64 \
- memmove-power7 memmove-ppc64 wordcopy-ppc64 bcopy-ppc64 \
+ memmove-power7 memmove-ppc64 wordcopy-ppc64 \
strncpy-power8 strstr-power7 strstr-ppc64 \
strspn-power8 strspn-ppc64 strcspn-power8 strcspn-ppc64 \
strlen-power8 strcasestr-power8 strcasestr-ppc64 \
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c b/sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c
deleted file mode 100644
index fe68713ad7..0000000000
--- a/sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c
+++ /dev/null
@@ -1,27 +0,0 @@
-/* PowerPC64 default bcopy.
- Copyright (C) 2014-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <string.h>
-
-extern __typeof (bcopy) __bcopy_ppc attribute_hidden;
-extern __typeof (memmove) __memmove_ppc attribute_hidden;
-
-void __bcopy_ppc (const void *src, void *dest, size_t n)
-{
- __memmove_ppc (dest, src, n);
-}
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bcopy.c b/sysdeps/powerpc/powerpc64/multiarch/bcopy.c
deleted file mode 100644
index 84c6adfd6e..0000000000
--- a/sysdeps/powerpc/powerpc64/multiarch/bcopy.c
+++ /dev/null
@@ -1,38 +0,0 @@
-/* PowerPC64 multiarch bcopy.
- Copyright (C) 2014-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <string.h>
-#include "init-arch.h"
-
-extern __typeof (bcopy) __bcopy_ppc attribute_hidden;
-/* __bcopy_power7 symbol is implemented at memmove-power7.S */
-extern __typeof (bcopy) __bcopy_power7 attribute_hidden;
-#ifdef __LITTLE_ENDIAN__
-extern __typeof (bcopy) __bcopy_power10 attribute_hidden;
-#endif
-
-libc_ifunc (bcopy,
-#ifdef __LITTLE_ENDIAN__
- (hwcap2 & PPC_FEATURE2_ARCH_3_1
- && hwcap2 & PPC_FEATURE2_HAS_ISEL
- && hwcap & PPC_FEATURE_HAS_VSX)
- ? __bcopy_power10 :
-#endif
- (hwcap & PPC_FEATURE_HAS_VSX)
- ? __bcopy_power7
- : __bcopy_ppc);
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index a0f9fce25d..280b8616b2 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -244,19 +244,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
__bzero_power4)
IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ppc))
- /* Support sysdeps/powerpc/powerpc64/multiarch/bcopy.c. */
- IFUNC_IMPL (i, name, bcopy,
-#ifdef __LITTLE_ENDIAN__
- IFUNC_IMPL_ADD (array, i, bcopy,
- hwcap2 & PPC_FEATURE2_ARCH_3_1
- && hwcap2 & PPC_FEATURE2_HAS_ISEL
- && hwcap & PPC_FEATURE_HAS_VSX,
- __bcopy_power10)
-#endif
- IFUNC_IMPL_ADD (array, i, bcopy, hwcap & PPC_FEATURE_HAS_VSX,
- __bcopy_power7)
- IFUNC_IMPL_ADD (array, i, bcopy, 1, __bcopy_ppc))
-
/* Support sysdeps/powerpc/powerpc64/multiarch/mempcpy.c. */
IFUNC_IMPL (i, name, mempcpy,
IFUNC_IMPL_ADD (array, i, mempcpy,
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S b/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S
index e5df0851c0..a66d2892c4 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S
@@ -21,7 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bcopy
-#define __bcopy __bcopy_power10
-
#include <sysdeps/powerpc/powerpc64/le/power10/memmove.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S b/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S
index a7b05ebfa9..0a6c7cb96e 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S
@@ -21,7 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bcopy
-#define __bcopy __bcopy_power7
-
#include <sysdeps/powerpc/powerpc64/power7/memmove.S>
diff --git a/sysdeps/powerpc/powerpc64/power7/bcopy.c b/sysdeps/powerpc/powerpc64/power7/bcopy.c
deleted file mode 100644
index 4a6a400e7a..0000000000
--- a/sysdeps/powerpc/powerpc64/power7/bcopy.c
+++ /dev/null
@@ -1 +0,0 @@
-/* Implemented at memmove.S */
diff --git a/sysdeps/powerpc/powerpc64/power7/memmove.S b/sysdeps/powerpc/powerpc64/power7/memmove.S
index 1d10a3d593..5a1055c097 100644
--- a/sysdeps/powerpc/powerpc64/power7/memmove.S
+++ b/sysdeps/powerpc/powerpc64/power7/memmove.S
@@ -821,17 +821,3 @@ L(end_unaligned_loop_bwd):
blr
END_GEN_TB (MEMMOVE, TB_TOCLESS)
libc_hidden_builtin_def (memmove)
-
-
-/* void bcopy(const void *src [r3], void *dest [r4], size_t n [r5])
- Implemented in this file to avoid linker create a stub function call
- in the branch to '_memmove'. */
-ENTRY_TOCLESS (__bcopy)
- mr r6,r3
- mr r3,r4
- mr r4,r6
- b L(_memmove)
-END (__bcopy)
-#ifndef __bcopy
-weak_alias (__bcopy, bcopy)
-#endif
--
2.32.0
* [PATCH 03/12] i386: Remove bcopy optimizations
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 01/12] ia64: Remove bcopy Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 02/12] powerpc: Remove bcopy optimizations Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 04/12] x86_64: " Adhemerval Zanella
` (10 subsequent siblings)
13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC
To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
The symbol is not present in the current POSIX specification and the
compiler already generates a memmove call.
---
sysdeps/i386/bcopy.S | 4 ---
sysdeps/i386/i686/bcopy.S | 3 --
sysdeps/i386/i686/multiarch/Makefile | 6 ++--
sysdeps/i386/i686/multiarch/bcopy-ia32.S | 20 -------------
.../i686/multiarch/bcopy-sse2-unaligned.S | 4 ---
sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S | 4 ---
sysdeps/i386/i686/multiarch/bcopy-ssse3.S | 4 ---
sysdeps/i386/i686/multiarch/bcopy.c | 30 -------------------
sysdeps/i386/i686/multiarch/ifunc-impl-list.c | 10 -------
9 files changed, 3 insertions(+), 82 deletions(-)
delete mode 100644 sysdeps/i386/bcopy.S
delete mode 100644 sysdeps/i386/i686/bcopy.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ia32.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ssse3.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy.c
diff --git a/sysdeps/i386/bcopy.S b/sysdeps/i386/bcopy.S
deleted file mode 100644
index 12b8ddb886..0000000000
--- a/sysdeps/i386/bcopy.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_MEMMOVE
-#define USE_AS_BCOPY
-#define MEMCPY bcopy
-#include "memcpy.S"
diff --git a/sysdeps/i386/i686/bcopy.S b/sysdeps/i386/i686/bcopy.S
deleted file mode 100644
index 15ef9419a4..0000000000
--- a/sysdeps/i386/i686/bcopy.S
+++ /dev/null
@@ -1,3 +0,0 @@
-#define USE_AS_BCOPY
-#define memmove bcopy
-#include <sysdeps/i386/i686/memmove.S>
diff --git a/sysdeps/i386/i686/multiarch/Makefile b/sysdeps/i386/i686/multiarch/Makefile
index c4897922d7..02fa02658e 100644
--- a/sysdeps/i386/i686/multiarch/Makefile
+++ b/sysdeps/i386/i686/multiarch/Makefile
@@ -2,7 +2,7 @@ ifeq ($(subdir),string)
gen-as-const-headers += locale-defines.sym
sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
memmove-ssse3 memcpy-ssse3-rep mempcpy-ssse3-rep \
- memmove-ssse3-rep bcopy-ssse3 bcopy-ssse3-rep \
+ memmove-ssse3-rep \
memset-sse2-rep bzero-sse2-rep strcmp-ssse3 \
strcmp-sse4 strncmp-c strncmp-ssse3 strncmp-sse4 \
memcmp-ssse3 memcmp-sse4 varshift \
@@ -18,10 +18,10 @@ sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
strcasecmp_l-c strcasecmp-c strcasecmp_l-ssse3 \
strncase_l-c strncase-c strncase_l-ssse3 \
strcasecmp_l-sse4 strncase_l-sse4 \
- bcopy-sse2-unaligned memcpy-sse2-unaligned \
+ memcpy-sse2-unaligned \
mempcpy-sse2-unaligned memmove-sse2-unaligned \
strcspn-c strpbrk-c strspn-c \
- bcopy-ia32 bzero-ia32 rawmemchr-ia32 \
+ bzero-ia32 rawmemchr-ia32 \
memchr-ia32 memcmp-ia32 memcpy-ia32 memmove-ia32 \
mempcpy-ia32 memset-ia32 strcat-ia32 strchr-ia32 \
strrchr-ia32 strcpy-ia32 strcmp-ia32 strcspn-ia32 \
diff --git a/sysdeps/i386/i686/multiarch/bcopy-ia32.S b/sysdeps/i386/i686/multiarch/bcopy-ia32.S
deleted file mode 100644
index e0fadc0f3f..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy-ia32.S
+++ /dev/null
@@ -1,20 +0,0 @@
-/* bcopy optimized for i686.
- Copyright (C) 2017-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#define bcopy __bcopy_ia32
-#include <sysdeps/i386/i686/bcopy.S>
diff --git a/sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S b/sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S
deleted file mode 100644
index efef2a10dd..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_MEMMOVE
-#define USE_AS_BCOPY
-#define MEMCPY __bcopy_sse2_unaligned
-#include "memcpy-sse2-unaligned.S"
diff --git a/sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S b/sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S
deleted file mode 100644
index cbc8b420e8..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_MEMMOVE
-#define USE_AS_BCOPY
-#define MEMCPY __bcopy_ssse3_rep
-#include "memcpy-ssse3-rep.S"
diff --git a/sysdeps/i386/i686/multiarch/bcopy-ssse3.S b/sysdeps/i386/i686/multiarch/bcopy-ssse3.S
deleted file mode 100644
index 36aac44b9c..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy-ssse3.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_MEMMOVE
-#define USE_AS_BCOPY
-#define MEMCPY __bcopy_ssse3
-#include "memcpy-ssse3.S"
diff --git a/sysdeps/i386/i686/multiarch/bcopy.c b/sysdeps/i386/i686/multiarch/bcopy.c
deleted file mode 100644
index bc2c2ac55d..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy.c
+++ /dev/null
@@ -1,30 +0,0 @@
-/* Multiple versions of bcopy.
- All versions must be listed in ifunc-impl-list.c.
- Copyright (C) 2017-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Define multiple versions only for the definition in libc. */
-#if IS_IN (libc)
-# define bcopy __redirect_bcopy
-# include <string.h>
-# undef bcopy
-
-# define SYMBOL_NAME bcopy
-# include "ifunc-memmove.h"
-
-libc_ifunc_redirected (__redirect_bcopy, bcopy, IFUNC_SELECTOR ());
-#endif
diff --git a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
index 6883b3d226..5c7a42dc97 100644
--- a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
+++ b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
@@ -36,16 +36,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
size_t i = 0;
- /* Support sysdeps/i386/i686/multiarch/bcopy.S. */
- IFUNC_IMPL (i, name, bcopy,
- IFUNC_IMPL_ADD (array, i, bcopy, CPU_FEATURE_USABLE (SSSE3),
- __bcopy_ssse3_rep)
- IFUNC_IMPL_ADD (array, i, bcopy, CPU_FEATURE_USABLE (SSSE3),
- __bcopy_ssse3)
- IFUNC_IMPL_ADD (array, i, bcopy, CPU_FEATURE_USABLE (SSE2),
- __bcopy_sse2_unaligned)
- IFUNC_IMPL_ADD (array, i, bcopy, 1, __bcopy_ia32))
-
/* Support sysdeps/i386/i686/multiarch/bzero.S. */
IFUNC_IMPL (i, name, bzero,
IFUNC_IMPL_ADD (array, i, bzero, CPU_FEATURE_USABLE (SSE2),
--
2.32.0
* [PATCH 04/12] x86_64: Remove bcopy optimizations
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
` (2 preceding siblings ...)
2022-02-10 19:58 ` [PATCH 03/12] i386: " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 05/12] alpha: Remove bzero optimization Adhemerval Zanella
` (9 subsequent siblings)
13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC
To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
The symbol is not present in the current POSIX specification and the
compiler already generates a memmove call.
---
sysdeps/x86_64/multiarch/bcopy.S | 7 -------
1 file changed, 7 deletions(-)
delete mode 100644 sysdeps/x86_64/multiarch/bcopy.S
diff --git a/sysdeps/x86_64/multiarch/bcopy.S b/sysdeps/x86_64/multiarch/bcopy.S
deleted file mode 100644
index 639f02bde3..0000000000
--- a/sysdeps/x86_64/multiarch/bcopy.S
+++ /dev/null
@@ -1,7 +0,0 @@
-#include <sysdep.h>
-
- .text
-ENTRY(bcopy)
- xchg %rdi, %rsi
- jmp __libc_memmove /* Branch to IFUNC memmove. */
-END(bcopy)
--
2.32.0
* [PATCH 05/12] alpha: Remove bzero optimization
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
` (3 preceding siblings ...)
2022-02-10 19:58 ` [PATCH 04/12] x86_64: " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 06/12] ia64: " Adhemerval Zanella
` (8 subsequent siblings)
13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC
To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
The symbol is not present in the current POSIX specification and the
compiler already generates a memset call.
---
sysdeps/alpha/bzero.S | 109 ------------------------------------------
1 file changed, 109 deletions(-)
delete mode 100644 sysdeps/alpha/bzero.S
diff --git a/sysdeps/alpha/bzero.S b/sysdeps/alpha/bzero.S
deleted file mode 100644
index 4821778622..0000000000
--- a/sysdeps/alpha/bzero.S
+++ /dev/null
@@ -1,109 +0,0 @@
-/* Copyright (C) 1996-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library. If not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Fill a block of memory with zeros. Optimized for the Alpha architecture:
-
- - memory accessed as aligned quadwords only
- - destination memory not read unless needed for good cache behaviour
- - basic blocks arranged to optimize branch prediction for full-quadword
- aligned memory blocks.
- - partial head and tail quadwords constructed with byte-mask instructions
-
- This is generally scheduled for the EV5 (got to look out for my own
- interests :-), but with EV4 needs in mind. There *should* be no more
- stalls for the EV4 than there are for the EV5.
-*/
-
-
-#include <sysdep.h>
-
- .set noat
- .set noreorder
-
- .text
- .type __bzero, @function
- .globl __bzero
- .usepv __bzero, USEPV_PROF
-
- cfi_startproc
-
- /* On entry to this basic block:
- t3 == loop counter
- t4 == bytes in partial final word
- a0 == possibly misaligned destination pointer */
-
- .align 3
-bzero_loop:
- beq t3, $tail #
- blbc t3, 0f # skip single store if count even
-
- stq_u zero, 0(a0) # e0 : store one word
- subq t3, 1, t3 # .. e1 :
- addq a0, 8, a0 # e0 :
- beq t3, $tail # .. e1 :
-
-0: stq_u zero, 0(a0) # e0 : store two words
- subq t3, 2, t3 # .. e1 :
- stq_u zero, 8(a0) # e0 :
- addq a0, 16, a0 # .. e1 :
- bne t3, 0b # e1 :
-
-$tail: bne t4, 1f # is there a tail to do?
- ret # no
-
-1: ldq_u t0, 0(a0) # yes, load original data
- mskqh t0, t4, t0 #
- stq_u t0, 0(a0) #
- ret #
-
-__bzero:
-#ifdef PROF
- ldgp gp, 0(pv)
- lda AT, _mcount
- jsr AT, (AT), _mcount
-#endif
-
- mov a0, v0 # e0 : move return value in place
- beq a1, $done # .. e1 : early exit for zero-length store
- and a0, 7, t1 # e0 :
- addq a1, t1, a1 # e1 : add dest misalignment to count
- srl a1, 3, t3 # e0 : loop = count >> 3
- and a1, 7, t4 # .. e1 : find number of bytes in tail
- unop # :
- beq t1, bzero_loop # e1 : aligned head, jump right in
-
- ldq_u t0, 0(a0) # e0 : load original data to mask into
- cmpult a1, 8, t2 # .. e1 : is this a sub-word set?
- bne t2, $oneq # e1 :
-
- mskql t0, a0, t0 # e0 : we span words. finish this partial
- subq t3, 1, t3 # e0 :
- addq a0, 8, a0 # .. e1 :
- stq_u t0, -8(a0) # e0 :
- br bzero_loop # .. e1 :
-
- .align 3
-$oneq:
- mskql t0, a0, t2 # e0 :
- mskqh t0, a1, t3 # e0 :
- or t2, t3, t0 # e1 :
- stq_u t0, 0(a0) # e0 :
-
-$done: ret
-
- cfi_endproc
-weak_alias (__bzero, bzero)
--
2.32.0
* [PATCH 06/12] ia64: Remove bzero optimization
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
` (4 preceding siblings ...)
2022-02-10 19:58 ` [PATCH 05/12] alpha: Remove bzero optimization Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 07/12] " Adhemerval Zanella
` (7 subsequent siblings)
13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC
To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
The symbol is not present in the current POSIX specification and the
compiler already generates a memset call.
---
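As a rough illustration of why a dedicated bzero routine adds little, a
minimal C sketch of bzero expressed through memset: clearing n bytes is
the same as filling them with the value 0. The names sketch_bzero and
sketch_bzero_check are hypothetical and exist only for this example; it
is not a copy of any glibc source file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name for this sketch only; not a glibc symbol.
   Zeroing n bytes is memset with a fill value of 0.  Unlike memset,
   bzero returns nothing.  */
static void
sketch_bzero (void *s, size_t n)
{
  memset (s, 0, n);
}

/* Checks that only the requested range is cleared.  */
static void
sketch_bzero_check (void)
{
  unsigned char buf[8];
  memset (buf, 0xff, sizeof buf);
  /* Clear the six bytes in the middle; both ends must stay 0xff.  */
  sketch_bzero (buf + 1, 6);
  assert (buf[0] == 0xff && buf[7] == 0xff);
  for (size_t i = 1; i < 7; i++)
    assert (buf[i] == 0);
}
```

With the architecture-specific file removed, callers should end up in
the memset path, assuming the generic bzero is used in its place.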
sysdeps/ia64/bzero.S | 312 -------------------------------------------
1 file changed, 312 deletions(-)
delete mode 100644 sysdeps/ia64/bzero.S
diff --git a/sysdeps/ia64/bzero.S b/sysdeps/ia64/bzero.S
deleted file mode 100644
index cd01abb436..0000000000
--- a/sysdeps/ia64/bzero.S
+++ /dev/null
@@ -1,312 +0,0 @@
-/* Optimized version of the standard bzero() function.
- This file is part of the GNU C Library.
- Copyright (C) 2000-2022 Free Software Foundation, Inc.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Return: dest
-
- Inputs:
- in0: dest
- in1: count
-
- The algorithm is fairly straightforward: set byte by byte until we
- we get to a 16B-aligned address, then loop on 128 B chunks using an
- early store as prefetching, then loop on 32B chucks, then clear remaining
- words, finally clear remaining bytes.
- Since a stf.spill f0 can store 16B in one go, we use this instruction
- to get peak speed. */
-
-#include <sysdep.h>
-#undef ret
-
-#define dest in0
-#define cnt in1
-
-#define tmp r31
-#define save_lc r30
-#define ptr0 r29
-#define ptr1 r28
-#define ptr2 r27
-#define ptr3 r26
-#define ptr9 r24
-#define loopcnt r23
-#define linecnt r22
-#define bytecnt r21
-
-// This routine uses only scratch predicate registers (p6 - p15)
-#define p_scr p6 // default register for same-cycle branches
-#define p_unalgn p9
-#define p_y p11
-#define p_n p12
-#define p_yy p13
-#define p_nn p14
-
-#define movi0 mov
-
-#define MIN1 15
-#define MIN1P1HALF 8
-#define LINE_SIZE 128
-#define LSIZE_SH 7 // shift amount
-#define PREF_AHEAD 8
-
-#define USE_FLP
-#if defined(USE_INT)
-#define store st8
-#define myval r0
-#elif defined(USE_FLP)
-#define store stf8
-#define myval f0
-#endif
-
-.align 64
-ENTRY(bzero)
-{ .mmi
- .prologue
- alloc tmp = ar.pfs, 2, 0, 0, 0
- lfetch.nt1 [dest]
- .save ar.lc, save_lc
- movi0 save_lc = ar.lc
-} { .mmi
- .body
- mov ret0 = dest // return value
- nop.m 0
- cmp.eq p_scr, p0 = cnt, r0
-;; }
-{ .mmi
- and ptr2 = -(MIN1+1), dest // aligned address
- and tmp = MIN1, dest // prepare to check for alignment
- tbit.nz p_y, p_n = dest, 0 // Do we have an odd address? (M_B_U)
-} { .mib
- mov ptr1 = dest
- nop.i 0
-(p_scr) br.ret.dpnt.many rp // return immediately if count = 0
-;; }
-{ .mib
- cmp.ne p_unalgn, p0 = tmp, r0
-} { .mib // NB: # of bytes to move is 1
- sub bytecnt = (MIN1+1), tmp // higher than loopcnt
- cmp.gt p_scr, p0 = 16, cnt // is it a minimalistic task?
-(p_scr) br.cond.dptk.many .move_bytes_unaligned // go move just a few (M_B_U)
-;; }
-{ .mmi
-(p_unalgn) add ptr1 = (MIN1+1), ptr2 // after alignment
-(p_unalgn) add ptr2 = MIN1P1HALF, ptr2 // after alignment
-(p_unalgn) tbit.nz.unc p_y, p_n = bytecnt, 3 // should we do a st8 ?
-;; }
-{ .mib
-(p_y) add cnt = -8, cnt
-(p_unalgn) tbit.nz.unc p_yy, p_nn = bytecnt, 2 // should we do a st4 ?
-} { .mib
-(p_y) st8 [ptr2] = r0,-4
-(p_n) add ptr2 = 4, ptr2
-;; }
-{ .mib
-(p_yy) add cnt = -4, cnt
-(p_unalgn) tbit.nz.unc p_y, p_n = bytecnt, 1 // should we do a st2 ?
-} { .mib
-(p_yy) st4 [ptr2] = r0,-2
-(p_nn) add ptr2 = 2, ptr2
-;; }
-{ .mmi
- mov tmp = LINE_SIZE+1 // for compare
-(p_y) add cnt = -2, cnt
-(p_unalgn) tbit.nz.unc p_yy, p_nn = bytecnt, 0 // should we do a st1 ?
-} { .mmi
- nop.m 0
-(p_y) st2 [ptr2] = r0,-1
-(p_n) add ptr2 = 1, ptr2
-;; }
-
-{ .mmi
-(p_yy) st1 [ptr2] = r0
- cmp.gt p_scr, p0 = tmp, cnt // is it a minimalistic task?
-} { .mbb
-(p_yy) add cnt = -1, cnt
-(p_scr) br.cond.dpnt.many .fraction_of_line // go move just a few
-;; }
-{ .mib
- nop.m 0
- shr.u linecnt = cnt, LSIZE_SH
- nop.b 0
-;; }
-
- .align 32
-.l1b: // ------------------// L1B: store ahead into cache lines; fill later
-{ .mmi
- and tmp = -(LINE_SIZE), cnt // compute end of range
- mov ptr9 = ptr1 // used for prefetching
- and cnt = (LINE_SIZE-1), cnt // remainder
-} { .mmi
- mov loopcnt = PREF_AHEAD-1 // default prefetch loop
- cmp.gt p_scr, p0 = PREF_AHEAD, linecnt // check against actual value
-;; }
-{ .mmi
-(p_scr) add loopcnt = -1, linecnt
- add ptr2 = 16, ptr1 // start of stores (beyond prefetch stores)
- add ptr1 = tmp, ptr1 // first address beyond total range
-;; }
-{ .mmi
- add tmp = -1, linecnt // next loop count
- movi0 ar.lc = loopcnt
-;; }
-.pref_l1b:
-{ .mib
- stf.spill [ptr9] = f0, 128 // Do stores one cache line apart
- nop.i 0
- br.cloop.dptk.few .pref_l1b
-;; }
-{ .mmi
- add ptr0 = 16, ptr2 // Two stores in parallel
- movi0 ar.lc = tmp
-;; }
-.l1bx:
- { .mmi
- stf.spill [ptr2] = f0, 32
- stf.spill [ptr0] = f0, 32
- ;; }
- { .mmi
- stf.spill [ptr2] = f0, 32
- stf.spill [ptr0] = f0, 32
- ;; }
- { .mmi
- stf.spill [ptr2] = f0, 32
- stf.spill [ptr0] = f0, 64
- cmp.lt p_scr, p0 = ptr9, ptr1 // do we need more prefetching?
- ;; }
-{ .mmb
- stf.spill [ptr2] = f0, 32
-(p_scr) stf.spill [ptr9] = f0, 128
- br.cloop.dptk.few .l1bx
-;; }
-{ .mib
- cmp.gt p_scr, p0 = 8, cnt // just a few bytes left ?
-(p_scr) br.cond.dpnt.many .move_bytes_from_alignment
-;; }
-
-.fraction_of_line:
-{ .mib
- add ptr2 = 16, ptr1
- shr.u loopcnt = cnt, 5 // loopcnt = cnt / 32
-;; }
-{ .mib
- cmp.eq p_scr, p0 = loopcnt, r0
- add loopcnt = -1, loopcnt
-(p_scr) br.cond.dpnt.many .store_words
-;; }
-{ .mib
- and cnt = 0x1f, cnt // compute the remaining cnt
- movi0 ar.lc = loopcnt
-;; }
- .align 32
-.l2: // -----------------------------// L2A: store 32B in 2 cycles
-{ .mmb
- store [ptr1] = myval, 8
- store [ptr2] = myval, 8
-;; } { .mmb
- store [ptr1] = myval, 24
- store [ptr2] = myval, 24
- br.cloop.dptk.many .l2
-;; }
-.store_words:
-{ .mib
- cmp.gt p_scr, p0 = 8, cnt // just a few bytes left ?
-(p_scr) br.cond.dpnt.many .move_bytes_from_alignment // Branch
-;; }
-
-{ .mmi
- store [ptr1] = myval, 8 // store
- cmp.le p_y, p_n = 16, cnt //
- add cnt = -8, cnt // subtract
-;; }
-{ .mmi
-(p_y) store [ptr1] = myval, 8 // store
-(p_y) cmp.le.unc p_yy, p_nn = 16, cnt
-(p_y) add cnt = -8, cnt // subtract
-;; }
-{ .mmi // store
-(p_yy) store [ptr1] = myval, 8
-(p_yy) add cnt = -8, cnt // subtract
-;; }
-
-.move_bytes_from_alignment:
-{ .mib
- cmp.eq p_scr, p0 = cnt, r0
- tbit.nz.unc p_y, p0 = cnt, 2 // should we terminate with a st4 ?
-(p_scr) br.cond.dpnt.few .restore_and_exit
-;; }
-{ .mib
-(p_y) st4 [ptr1] = r0,4
- tbit.nz.unc p_yy, p0 = cnt, 1 // should we terminate with a st2 ?
-;; }
-{ .mib
-(p_yy) st2 [ptr1] = r0,2
- tbit.nz.unc p_y, p0 = cnt, 0 // should we terminate with a st1 ?
-;; }
-
-{ .mib
-(p_y) st1 [ptr1] = r0
-;; }
-.restore_and_exit:
-{ .mib
- nop.m 0
- movi0 ar.lc = save_lc
- br.ret.sptk.many rp
-;; }
-
-.move_bytes_unaligned:
-{ .mmi
- .pred.rel "mutex",p_y, p_n
- .pred.rel "mutex",p_yy, p_nn
-(p_n) cmp.le p_yy, p_nn = 4, cnt
-(p_y) cmp.le p_yy, p_nn = 5, cnt
-(p_n) add ptr2 = 2, ptr1
-} { .mmi
-(p_y) add ptr2 = 3, ptr1
-(p_y) st1 [ptr1] = r0, 1 // fill 1 (odd-aligned) byte
-(p_y) add cnt = -1, cnt // [15, 14 (or less) left]
-;; }
-{ .mmi
-(p_yy) cmp.le.unc p_y, p0 = 8, cnt
- add ptr3 = ptr1, cnt // prepare last store
- movi0 ar.lc = save_lc
-} { .mmi
-(p_yy) st2 [ptr1] = r0, 4 // fill 2 (aligned) bytes
-(p_yy) st2 [ptr2] = r0, 4 // fill 2 (aligned) bytes
-(p_yy) add cnt = -4, cnt // [11, 10 (o less) left]
-;; }
-{ .mmi
-(p_y) cmp.le.unc p_yy, p0 = 8, cnt
- add ptr3 = -1, ptr3 // last store
- tbit.nz p_scr, p0 = cnt, 1 // will there be a st2 at the end ?
-} { .mmi
-(p_y) st2 [ptr1] = r0, 4 // fill 2 (aligned) bytes
-(p_y) st2 [ptr2] = r0, 4 // fill 2 (aligned) bytes
-(p_y) add cnt = -4, cnt // [7, 6 (or less) left]
-;; }
-{ .mmi
-(p_yy) st2 [ptr1] = r0, 4 // fill 2 (aligned) bytes
-(p_yy) st2 [ptr2] = r0, 4 // fill 2 (aligned) bytes
- // [3, 2 (or less) left]
- tbit.nz p_y, p0 = cnt, 0 // will there be a st1 at the end ?
-} { .mmi
-(p_yy) add cnt = -4, cnt
-;; }
-{ .mmb
-(p_scr) st2 [ptr1] = r0 // fill 2 (aligned) bytes
-(p_y) st1 [ptr3] = r0 // fill last byte (using ptr3)
- br.ret.sptk.many rp
-;; }
-END(bzero)
--
2.32.0
* [PATCH 07/12] Remove bzero optimization
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
` (5 preceding siblings ...)
2022-02-10 19:58 ` [PATCH 06/12] ia64: " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 08/12] powerpc: Remove powerpc32 bzero optimizations Adhemerval Zanella
` (6 subsequent siblings)
13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call.
---
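Note for readers of the series (not part of the commit): the removed
entry points only ever provided memset with a zero fill value. A
minimal C sketch of that equivalence follows. The names sketch_bzero
and sketch_bzero_demo are made up for illustration and do not exist in
glibc; this is not the code the patch touches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the legacy bzero interface: zero N bytes
   starting at S by forwarding to memset with a fill value of 0.  */
void
sketch_bzero (void *s, size_t n)
{
  memset (s, 0, n);
}

/* Fill two buffers identically, clear the same window once through
   the wrapper and once through memset directly, then compare them.
   Returns 1 when both buffers end up byte-for-byte identical.  */
int
sketch_bzero_demo (void)
{
  unsigned char a[16], b[16];

  memset (a, 0xff, sizeof a);
  memset (b, 0xff, sizeof b);

  sketch_bzero (a + 4, 8);   /* legacy-style call */
  memset (b + 4, 0, 8);      /* what the call reduces to */

  assert (memcmp (a, b, sizeof a) == 0);
  return memcmp (a, b, sizeof a) == 0;
}
```

Since the two calls are interchangeable, a separately tuned bzero entry
point should gain little once callers end up in memset anyway.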
sysdeps/sparc/sparc32/bzero.c | 1 -
sysdeps/sparc/sparc32/memset.S | 37 ++++++++-----------
sysdeps/sparc/sparc32/sparcv9/bzero.c | 1 -
.../sparc/sparc32/sparcv9/multiarch/bzero.c | 1 -
.../sparc32/sparcv9/multiarch/memset-ultra1.S | 1 -
sysdeps/sparc/sparc64/bzero.c | 1 -
sysdeps/sparc/sparc64/memset.S | 30 ++++++---------
sysdeps/sparc/sparc64/multiarch/bzero.c | 33 -----------------
.../sparc/sparc64/multiarch/ifunc-impl-list.c | 9 -----
.../sparc/sparc64/multiarch/ifunc-memset.h | 2 +-
.../sparc/sparc64/multiarch/memset-niagara1.S | 5 +--
.../sparc/sparc64/multiarch/memset-niagara4.S | 6 +--
.../sparc/sparc64/multiarch/memset-niagara7.S | 7 ----
.../sparc/sparc64/multiarch/memset-ultra1.S | 1 -
14 files changed, 30 insertions(+), 105 deletions(-)
delete mode 100644 sysdeps/sparc/sparc32/bzero.c
delete mode 100644 sysdeps/sparc/sparc32/sparcv9/bzero.c
delete mode 100644 sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c
delete mode 100644 sysdeps/sparc/sparc64/bzero.c
delete mode 100644 sysdeps/sparc/sparc64/multiarch/bzero.c
diff --git a/sysdeps/sparc/sparc32/bzero.c b/sysdeps/sparc/sparc32/bzero.c
deleted file mode 100644
index 37f0f6f993..0000000000
--- a/sysdeps/sparc/sparc32/bzero.c
+++ /dev/null
@@ -1 +0,0 @@
-/* bzero is in memset.S */
diff --git a/sysdeps/sparc/sparc32/memset.S b/sysdeps/sparc/sparc32/memset.S
index d222fa7506..b1b67cb2d1 100644
--- a/sysdeps/sparc/sparc32/memset.S
+++ b/sysdeps/sparc/sparc32/memset.S
@@ -42,25 +42,6 @@
.text
.align 4
-ENTRY(__bzero)
- b 1f
- mov %g0, %g3
-
-3: cmp %o2, 3
- be 2f
- stb %g3, [%o0]
-
- cmp %o2, 2
- be 2f
- stb %g3, [%o0 + 0x01]
-
- stb %g3, [%o0 + 0x02]
-2: sub %o2, 4, %o2
- add %o1, %o2, %o1
- b 4f
- sub %o0, %o2, %o0
-END(__bzero)
-
ENTRY(memset)
and %o1, 0xff, %g3
sll %g3, 8, %g2
@@ -73,7 +54,7 @@ ENTRY(memset)
mov %o0, %g1
andcc %o0, 3, %o2
- bne 3b
+ bne 3f
4: andcc %o0, 4, %g0
be 2f
@@ -146,7 +127,19 @@ ENTRY(memset)
stb %g3, [%o0 + 6]
0: retl
nop
+
+3: cmp %o2, 3
+ be 2f
+ stb %g3, [%o0]
+
+ cmp %o2, 2
+ be 2f
+ stb %g3, [%o0 + 0x01]
+
+ stb %g3, [%o0 + 0x02]
+2: sub %o2, 4, %o2
+ add %o1, %o2, %o1
+ b 4b
+ sub %o0, %o2, %o0
END(memset)
libc_hidden_builtin_def (memset)
-
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/sparc/sparc32/sparcv9/bzero.c b/sysdeps/sparc/sparc32/sparcv9/bzero.c
deleted file mode 100644
index 37f0f6f993..0000000000
--- a/sysdeps/sparc/sparc32/sparcv9/bzero.c
+++ /dev/null
@@ -1 +0,0 @@
-/* bzero is in memset.S */
diff --git a/sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c b/sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c
deleted file mode 100644
index cf6803ef44..0000000000
--- a/sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c
+++ /dev/null
@@ -1 +0,0 @@
-#include <sysdeps/sparc/sparc64/multiarch/bzero.c>
diff --git a/sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S b/sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S
index 6038611134..2dda6f1ed6 100644
--- a/sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S
+++ b/sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S
@@ -25,6 +25,5 @@
# define weak_alias(x, y)
# define memset __memset_ultra1
-# define __bzero __bzero_ultra1
# include <sysdeps/sparc/sparc32/sparcv9/memset.S>
#endif
diff --git a/sysdeps/sparc/sparc64/bzero.c b/sysdeps/sparc/sparc64/bzero.c
deleted file mode 100644
index 37f0f6f993..0000000000
--- a/sysdeps/sparc/sparc64/bzero.c
+++ /dev/null
@@ -1 +0,0 @@
-/* bzero is in memset.S */
diff --git a/sysdeps/sparc/sparc64/memset.S b/sysdeps/sparc/sparc64/memset.S
index a7f8361fa3..33ecbc93fe 100644
--- a/sysdeps/sparc/sparc64/memset.S
+++ b/sysdeps/sparc/sparc64/memset.S
@@ -31,6 +31,16 @@
stx source, [base - offset - 0x08]; \
stx source, [base - offset - 0x00];
+#define ZERO_BLOCKS(base, offset, source) \
+ stx source, [base - offset - 0x38]; \
+ stx source, [base - offset - 0x30]; \
+ stx source, [base - offset - 0x28]; \
+ stx source, [base - offset - 0x20]; \
+ stx source, [base - offset - 0x18]; \
+ stx source, [base - offset - 0x10]; \
+ stx source, [base - offset - 0x08]; \
+ stx source, [base - offset - 0x00];
+
/* Well, memset is a lot easier to get right than bcopy... */
.text
.align 32
@@ -174,22 +184,7 @@ ENTRY(memset)
nop
ba,pt %xcc, 18b
ldd [%o0], %f0
-END(memset)
-libc_hidden_builtin_def (memset)
-#define ZERO_BLOCKS(base, offset, source) \
- stx source, [base - offset - 0x38]; \
- stx source, [base - offset - 0x30]; \
- stx source, [base - offset - 0x28]; \
- stx source, [base - offset - 0x20]; \
- stx source, [base - offset - 0x18]; \
- stx source, [base - offset - 0x10]; \
- stx source, [base - offset - 0x08]; \
- stx source, [base - offset - 0x00];
-
- .text
- .align 32
-ENTRY(__bzero)
#ifndef USE_BPR
srl %o1, 0, %o1
#endif
@@ -307,6 +302,5 @@ ENTRY(__bzero)
stb %g0, [%o0 - 1]
0: retl
mov %o5, %o0
-END(__bzero)
-
-weak_alias (__bzero, bzero)
+END(memset)
+libc_hidden_builtin_def (memset)
diff --git a/sysdeps/sparc/sparc64/multiarch/bzero.c b/sysdeps/sparc/sparc64/multiarch/bzero.c
deleted file mode 100644
index 409d66a864..0000000000
--- a/sysdeps/sparc/sparc64/multiarch/bzero.c
+++ /dev/null
@@ -1,33 +0,0 @@
-/* Multiple versions of bzero. SPARC64/Linux version.
- All versions must be listed in ifunc-impl-list.c.
- Copyright (C) 2017-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#if IS_IN (libc)
-# define bzero __redirect_bzero
-# include <string.h>
-# undef bzero
-
-# include <sparc-ifunc.h>
-
-# define SYMBOL_NAME bzero
-# include "ifunc-memset.h"
-
-sparc_libc_ifunc_redirected (__redirect_bzero, __bzero, IFUNC_SELECTOR)
-weak_alias (__bzero, bzero)
-
-#endif
diff --git a/sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c b/sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
index 05926e605b..9be12f9130 100644
--- a/sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
@@ -61,15 +61,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
__mempcpy_ultra3)
IFUNC_IMPL_ADD (array, i, mempcpy, 1, __mempcpy_ultra1));
- IFUNC_IMPL (i, name, bzero,
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & HWCAP_SPARC_ADP,
- __bzero_niagara7)
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & HWCAP_SPARC_CRYPTO,
- __bzero_niagara4)
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & HWCAP_SPARC_BLKINIT,
- __bzero_niagara1)
- IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ultra1));
-
IFUNC_IMPL (i, name, memset,
IFUNC_IMPL_ADD (array, i, memset, hwcap & HWCAP_SPARC_ADP,
__memset_niagara7)
diff --git a/sysdeps/sparc/sparc64/multiarch/ifunc-memset.h b/sysdeps/sparc/sparc64/multiarch/ifunc-memset.h
index 56893b6883..0a2f16b3f1 100644
--- a/sysdeps/sparc/sparc64/multiarch/ifunc-memset.h
+++ b/sysdeps/sparc/sparc64/multiarch/ifunc-memset.h
@@ -1,4 +1,4 @@
-/* Common definition for memset/bzero implementation.
+/* Common definition for memset implementation.
All versions must be listed in ifunc-impl-list.c.
Copyright (C) 2017-2022 Free Software Foundation, Inc.
This file is part of the GNU C Library.
diff --git a/sysdeps/sparc/sparc64/multiarch/memset-niagara1.S b/sysdeps/sparc/sparc64/multiarch/memset-niagara1.S
index 13432effc1..7865691eca 100644
--- a/sysdeps/sparc/sparc64/multiarch/memset-niagara1.S
+++ b/sysdeps/sparc/sparc64/multiarch/memset-niagara1.S
@@ -45,9 +45,6 @@ ENTRY(__memset_niagara1)
sllx %o2, 32, %g1
ba,pt %XCC, 1f
or %g1, %o2, %o2
-END(__memset_niagara1)
-
-ENTRY(__bzero_niagara1)
clr %o2
1:
# ifndef USE_BRP
@@ -171,6 +168,6 @@ ENTRY(__bzero_niagara1)
90:
retl
mov %o3, %o0
-END(__bzero_niagara1)
+END(__memset_niagara1)
#endif
diff --git a/sysdeps/sparc/sparc64/multiarch/memset-niagara4.S b/sysdeps/sparc/sparc64/multiarch/memset-niagara4.S
index 1ccf24e516..d6fbd83009 100644
--- a/sysdeps/sparc/sparc64/multiarch/memset-niagara4.S
+++ b/sysdeps/sparc/sparc64/multiarch/memset-niagara4.S
@@ -39,10 +39,6 @@ ENTRY(__memset_niagara4)
sllx %o2, 32, %g1
ba,pt %icc, 1f
or %g1, %o2, %o4
-END(__memset_niagara4)
-
- .align 32
-ENTRY(__bzero_niagara4)
clr %o4
1: cmp %o1, 16
ble %icc, .Ltiny
@@ -118,6 +114,6 @@ ENTRY(__bzero_niagara4)
bne,pt %icc, 1b
add %o0, 0x30, %o0
ba,a,pt %icc, .Lpostloop
-END(__bzero_niagara4)
+END(__memset_niagara4)
#endif
diff --git a/sysdeps/sparc/sparc64/multiarch/memset-niagara7.S b/sysdeps/sparc/sparc64/multiarch/memset-niagara7.S
index 491b203ff9..6fcbf56675 100644
--- a/sysdeps/sparc/sparc64/multiarch/memset-niagara7.S
+++ b/sysdeps/sparc/sparc64/multiarch/memset-niagara7.S
@@ -99,13 +99,6 @@
.text
.align 32
-ENTRY(__bzero_niagara7)
- /* bzero (dst, size) */
- mov %o1, %o2
- mov 0, %o1
- /* fall through into memset code */
-END(__bzero_niagara7)
-
ENTRY(__memset_niagara7)
/* memset (src, c, size) */
mov %o0, %o5 /* copy sp1 before using it */
diff --git a/sysdeps/sparc/sparc64/multiarch/memset-ultra1.S b/sysdeps/sparc/sparc64/multiarch/memset-ultra1.S
index e0d3424307..3c3add791e 100644
--- a/sysdeps/sparc/sparc64/multiarch/memset-ultra1.S
+++ b/sysdeps/sparc/sparc64/multiarch/memset-ultra1.S
@@ -25,6 +25,5 @@
# define weak_alias(x, y)
# define memset __memset_ultra1
-# define __bzero __bzero_ultra1
# include <sysdeps/sparc/sparc64/memset.S>
#endif
--
2.32.0
* [PATCH 08/12] powerpc: Remove powerpc32 bzero optimizations
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
` (6 preceding siblings ...)
2022-02-10 19:58 ` [PATCH 07/12] " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 09/12] powerpc: Remove powerpc64 " Adhemerval Zanella
` (5 subsequent siblings)
13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call.
---
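Note for readers of the series (not part of the commit): the bzero.c
selector deleted below picks a variant by testing hwcap bits from most
to least capable. A rough C sketch of that priority order follows. The
SKETCH_* bit values and the sketch_* names are placeholders invented
for this note; the real PPC_FEATURE_* constants and the libc_ifunc
machinery are not reproduced here.

```c
#include <assert.h>

/* Placeholder feature bits, NOT the real PPC_FEATURE_* values.  */
#define SKETCH_HAS_VSX    (1UL << 0)
#define SKETCH_ARCH_2_05  (1UL << 1)

/* Stand-ins for the three variants named in the removed selector.  */
enum sketch_impl { SKETCH_PPC, SKETCH_POWER6, SKETCH_POWER7 };

/* Same ordering as the removed selector: VSX first, then ISA 2.05,
   then the baseline fallback.  */
enum sketch_impl
sketch_select (unsigned long hwcap)
{
  if (hwcap & SKETCH_HAS_VSX)
    return SKETCH_POWER7;
  if (hwcap & SKETCH_ARCH_2_05)
    return SKETCH_POWER6;
  return SKETCH_PPC;
}

/* Exercise each branch; returns 1 when all three resolve as expected.  */
int
sketch_select_check (void)
{
  int ok = sketch_select (0) == SKETCH_PPC
           && sketch_select (SKETCH_ARCH_2_05) == SKETCH_POWER6
           && sketch_select (SKETCH_HAS_VSX | SKETCH_ARCH_2_05)
              == SKETCH_POWER7;
  assert (ok);
  return ok;
}
```

The memset entries that remain in ifunc-impl-list.c suggest memset
keeps an equivalent selection of its own, so dropping the bzero
selector should not lose per-CPU dispatch for callers that reach
memset.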
sysdeps/powerpc/powerpc32/bzero.S | 27 --------------
.../powerpc32/power4/multiarch/Makefile | 4 +-
.../powerpc32/power4/multiarch/bzero-power6.S | 25 -------------
.../powerpc32/power4/multiarch/bzero-power7.S | 25 -------------
.../powerpc32/power4/multiarch/bzero-ppc32.S | 34 -----------------
.../powerpc32/power4/multiarch/bzero.c | 37 -------------------
.../power4/multiarch/ifunc-impl-list.c | 8 ----
7 files changed, 2 insertions(+), 158 deletions(-)
delete mode 100644 sysdeps/powerpc/powerpc32/bzero.S
delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S
delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S
delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S
delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c
diff --git a/sysdeps/powerpc/powerpc32/bzero.S b/sysdeps/powerpc/powerpc32/bzero.S
deleted file mode 100644
index 9cc03c92df..0000000000
--- a/sysdeps/powerpc/powerpc32/bzero.S
+++ /dev/null
@@ -1,27 +0,0 @@
-/* Optimized bzero `implementation' for PowerPC.
- Copyright (C) 1997-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <sysdep.h>
-
-ENTRY (__bzero)
-
- mr r5,r4
- li r4,0
- b memset@local
-END (__bzero)
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/Makefile b/sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
index 5c68f07d19..b2f9deefb8 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
@@ -1,8 +1,8 @@
ifeq ($(subdir),string)
sysdep_routines += memcpy-power7 memcpy-a2 memcpy-power6 memcpy-cell \
memcpy-ppc32 memcmp-power7 memcmp-ppc32 memset-power7 \
- memset-power6 memset-ppc32 bzero-power7 bzero-power6 \
- bzero-ppc32 mempcpy-power7 mempcpy-ppc32 memchr-power7 \
+ memset-power6 memset-ppc32 \
+ mempcpy-power7 mempcpy-ppc32 memchr-power7 \
memchr-ppc32 memrchr-power7 memrchr-ppc32 rawmemchr-power7 \
rawmemchr-ppc32 strlen-power7 strlen-ppc32 strnlen-power7 \
strnlen-ppc32 strncmp-power7 strncmp-ppc32 \
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S b/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S
deleted file mode 100644
index b352433283..0000000000
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S
+++ /dev/null
@@ -1,25 +0,0 @@
-/* Optimized bzero implementation for PowerPC32/POWER6.
- Copyright (C) 2010-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <sysdep.h>
-
-ENTRY (__bzero_power6)
- mr r5,r4
- li r4,0
- b __memset_power6@local
-END (__bzero_power6)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S b/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S
deleted file mode 100644
index 80c8ffe55a..0000000000
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S
+++ /dev/null
@@ -1,25 +0,0 @@
-/* Optimized bzero implementation for PowerPC32/POWER7.
- Copyright (C) 2010-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <sysdep.h>
-
-ENTRY (__bzero_power7)
- mr r5,r4
- li r4,0
- b __memset_power7@local
-END (__bzero_power7)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S b/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S
deleted file mode 100644
index 86711e8e22..0000000000
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S
+++ /dev/null
@@ -1,34 +0,0 @@
-/* Optimized bzero implementation for PowerPC32/PPC32.
- Copyright (C) 2010-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <sysdep.h>
-
-/* memset ifunc selector is not built for static and memset@local
- for shared builds makes the linker point the call to the ifunc
- selector. */
-#ifdef SHARED
-# define MEMSET __memset_ppc
-#else
-# define MEMSET memset
-#endif
-
-ENTRY (__bzero_ppc)
- mr r5,r4
- li r4,0
- b MEMSET@local
-END (__bzero_ppc)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c b/sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c
deleted file mode 100644
index 5d9270289f..0000000000
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c
+++ /dev/null
@@ -1,37 +0,0 @@
-/* Multiple versions of bzero.
- Copyright (C) 2013-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Define multiple versions only for definition in libc. */
-#if IS_IN (libc)
-# include <string.h>
-# include <strings.h>
-# include "init-arch.h"
-
-extern __typeof (bzero) __bzero_ppc attribute_hidden;
-extern __typeof (bzero) __bzero_power6 attribute_hidden;
-extern __typeof (bzero) __bzero_power7 attribute_hidden;
-
-libc_ifunc (__bzero,
- (hwcap & PPC_FEATURE_HAS_VSX)
- ? __bzero_power7 :
- (hwcap & PPC_FEATURE_ARCH_2_05)
- ? __bzero_power6
- : __bzero_ppc);
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
index 9832f366bb..01890367a4 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
@@ -73,14 +73,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
__memset_power6)
IFUNC_IMPL_ADD (array, i, memset, 1, __memset_ppc))
- /* Support sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c. */
- IFUNC_IMPL (i, name, bzero,
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_HAS_VSX,
- __bzero_power7)
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_ARCH_2_05,
- __bzero_power6)
- IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ppc))
-
/* Support sysdeps/powerpc/powerpc32/power4/multiarch/strlen.c. */
IFUNC_IMPL (i, name, strlen,
IFUNC_IMPL_ADD (array, i, strlen, hwcap & PPC_FEATURE_HAS_VSX,
--
2.32.0
* [PATCH 09/12] powerpc: Remove powerpc64 bzero optimizations
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
` (7 preceding siblings ...)
2022-02-10 19:58 ` [PATCH 08/12] powerpc: Remove powerpc32 bzero optimizations Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 10/12] s390: Remove " Adhemerval Zanella
` (4 subsequent siblings)
13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call.
---
sysdeps/powerpc/powerpc64/bzero.S | 20 -------
sysdeps/powerpc/powerpc64/le/power10/memset.S | 12 -----
sysdeps/powerpc/powerpc64/memset.S | 13 -----
sysdeps/powerpc/powerpc64/multiarch/bzero.c | 54 -------------------
.../powerpc64/multiarch/ifunc-impl-list.c | 21 --------
.../powerpc64/multiarch/memset-power10.S | 3 --
.../powerpc64/multiarch/memset-power4.S | 3 --
.../powerpc64/multiarch/memset-power6.S | 3 --
.../powerpc64/multiarch/memset-power7.S | 2 -
.../powerpc64/multiarch/memset-power8.S | 3 --
.../powerpc64/multiarch/memset-ppc64.S | 16 +-----
sysdeps/powerpc/powerpc64/power4/memset.S | 12 -----
sysdeps/powerpc/powerpc64/power6/memset.S | 12 -----
sysdeps/powerpc/powerpc64/power7/memset.S | 12 -----
sysdeps/powerpc/powerpc64/power8/memset.S | 12 -----
15 files changed, 1 insertion(+), 197 deletions(-)
delete mode 100644 sysdeps/powerpc/powerpc64/bzero.S
delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bzero.c
diff --git a/sysdeps/powerpc/powerpc64/bzero.S b/sysdeps/powerpc/powerpc64/bzero.S
deleted file mode 100644
index a7ca73cc39..0000000000
--- a/sysdeps/powerpc/powerpc64/bzero.S
+++ /dev/null
@@ -1,20 +0,0 @@
-/* Optimized bzero `implementation' for PowerPC64.
- Copyright (C) 1997-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* This code was moved into memset.S to solve a double stub call problem.
- @local would have worked but it is not supported in PowerPC64 asm. */
diff --git a/sysdeps/powerpc/powerpc64/le/power10/memset.S b/sysdeps/powerpc/powerpc64/le/power10/memset.S
index bee6d8b31b..0f43b002bf 100644
--- a/sysdeps/powerpc/powerpc64/le/power10/memset.S
+++ b/sysdeps/powerpc/powerpc64/le/power10/memset.S
@@ -242,15 +242,3 @@ L(bcdz_tail):
END_GEN_TB (MEMSET,TB_TOCLESS)
libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. */
-ENTRY_TOCLESS (__bzero)
- CALL_MCOUNT 2
- mr r5,r4
- li r4,0
- b L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/memset.S b/sysdeps/powerpc/powerpc64/memset.S
index 34ee8ffca4..b813cd3c6b 100644
--- a/sysdeps/powerpc/powerpc64/memset.S
+++ b/sysdeps/powerpc/powerpc64/memset.S
@@ -253,16 +253,3 @@ L(medium_28t):
blr
END_GEN_TB (MEMSET,TB_TOCLESS)
libc_hidden_builtin_def (memset)
-
-#ifndef NO_BZERO_IMPL
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. */
-ENTRY (__bzero)
- CALL_MCOUNT 3
- mr r5,r4
- li r4,0
- b L(_memset)
-END_GEN_TB (__bzero,TB_TOCLESS)
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bzero.c b/sysdeps/powerpc/powerpc64/multiarch/bzero.c
deleted file mode 100644
index f83d6da55b..0000000000
--- a/sysdeps/powerpc/powerpc64/multiarch/bzero.c
+++ /dev/null
@@ -1,54 +0,0 @@
-/* Multiple versions of bzero. PowerPC64 version.
- Copyright (C) 2013-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Define multiple versions only for definition in libc. */
-#if IS_IN (libc)
-# include <string.h>
-# include <strings.h>
-# include "init-arch.h"
-
-extern __typeof (bzero) __bzero_ppc attribute_hidden;
-extern __typeof (bzero) __bzero_power4 attribute_hidden;
-extern __typeof (bzero) __bzero_power6 attribute_hidden;
-extern __typeof (bzero) __bzero_power7 attribute_hidden;
-extern __typeof (bzero) __bzero_power8 attribute_hidden;
-# ifdef __LITTLE_ENDIAN__
-extern __typeof (bzero) __bzero_power10 attribute_hidden;
-# endif
-
-libc_ifunc (__bzero,
-# ifdef __LITTLE_ENDIAN__
- (hwcap2 & PPC_FEATURE2_ARCH_3_1
- && hwcap2 & PPC_FEATURE2_HAS_ISEL
- && hwcap & PPC_FEATURE_HAS_VSX)
- ? __bzero_power10 :
-# endif
- (hwcap2 & PPC_FEATURE2_ARCH_2_07
- && hwcap & PPC_FEATURE_HAS_ALTIVEC)
- ? __bzero_power8 :
- (hwcap & PPC_FEATURE_HAS_VSX)
- ? __bzero_power7 :
- (hwcap & PPC_FEATURE_ARCH_2_05
- && hwcap & PPC_FEATURE_HAS_ALTIVEC)
- ? __bzero_power6 :
- (hwcap & PPC_FEATURE_POWER4)
- ? __bzero_power4
- : __bzero_ppc);
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index 280b8616b2..ac533a9886 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -223,27 +223,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
__memcmp_power4)
IFUNC_IMPL_ADD (array, i, memcmp, 1, __memcmp_ppc))
- /* Support sysdeps/powerpc/powerpc64/multiarch/bzero.c. */
- IFUNC_IMPL (i, name, bzero,
-#ifdef __LITTLE_ENDIAN__
- IFUNC_IMPL_ADD (array, i, bzero,
- hwcap2 & PPC_FEATURE2_ARCH_3_1
- && hwcap2 & PPC_FEATURE2_HAS_ISEL
- && hwcap & PPC_FEATURE_HAS_VSX,
- __bzero_power10)
-#endif
- IFUNC_IMPL_ADD (array, i, bzero, hwcap2 & PPC_FEATURE2_ARCH_2_07
- && hwcap & PPC_FEATURE_HAS_ALTIVEC,
- __bzero_power8)
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_HAS_VSX,
- __bzero_power7)
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_ARCH_2_05
- && hwcap & PPC_FEATURE_HAS_ALTIVEC,
- __bzero_power6)
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_POWER4,
- __bzero_power4)
- IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ppc))
-
/* Support sysdeps/powerpc/powerpc64/multiarch/mempcpy.c. */
IFUNC_IMPL (i, name, mempcpy,
IFUNC_IMPL_ADD (array, i, mempcpy,
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S
index ead0b67926..ba5bee1c7a 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S
@@ -21,7 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bzero
-#define __bzero __bzero_power10
-
#include <sysdeps/powerpc/powerpc64/le/power10/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power4.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power4.S
index 6f5631d03d..4ee567c6f9 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power4.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power4.S
@@ -21,7 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bzero
-#define __bzero __bzero_power4
-
#include <sysdeps/powerpc/powerpc64/power4/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power6.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power6.S
index b81f4f0d64..9f5e7d1b37 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power6.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power6.S
@@ -21,7 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bzero
-#define __bzero __bzero_power6
-
#include <sysdeps/powerpc/powerpc64/power6/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power7.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power7.S
index a8ca12db83..6fd92d5afc 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power7.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power7.S
@@ -21,6 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bzero
-#define __bzero __bzero_power7
#include <sysdeps/powerpc/powerpc64/power7/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S
index b06587aa2d..43cc5c7339 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S
@@ -21,7 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bzero
-#define __bzero __bzero_power8
-
#include <sysdeps/powerpc/powerpc64/power8/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S b/sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S
index 876954d36b..30b25ef15f 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S
@@ -1,4 +1,4 @@
-/* Default memset/bzero implementation for PowerPC64.
+/* Default memset implementation for PowerPC64.
Copyright (C) 2013-2022 Free Software Foundation, Inc.
This file is part of the GNU C Library.
@@ -18,17 +18,6 @@
#include <sysdep.h>
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. NOTE: this code should be positioned
- before ENTRY/END_GEN_TB redefinition. */
-ENTRY (__bzero_ppc)
- CALL_MCOUNT 3
- mr r5,r4
- li r4,0
- b L(_memset)
-END_GEN_TB (__bzero_ppc,TB_TOCLESS)
-
-
#if defined SHARED && IS_IN (libc)
# define MEMSET __memset_ppc
@@ -36,7 +25,4 @@ END_GEN_TB (__bzero_ppc,TB_TOCLESS)
# define libc_hidden_builtin_def(name)
#endif
-/* Do not implement __bzero at powerpc64/memset.S. */
-#define NO_BZERO_IMPL
-
#include <sysdeps/powerpc/powerpc64/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/power4/memset.S b/sysdeps/powerpc/powerpc64/power4/memset.S
index dfc136261b..0f14a5198a 100644
--- a/sysdeps/powerpc/powerpc64/power4/memset.S
+++ b/sysdeps/powerpc/powerpc64/power4/memset.S
@@ -237,15 +237,3 @@ L(medium_28t):
blr
END_GEN_TB (MEMSET,TB_TOCLESS)
libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. */
-ENTRY_TOCLESS (__bzero)
- CALL_MCOUNT 3
- mr r5,r4
- li r4,0
- b L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/power6/memset.S b/sysdeps/powerpc/powerpc64/power6/memset.S
index 7ad82c38e6..140a756348 100644
--- a/sysdeps/powerpc/powerpc64/power6/memset.S
+++ b/sysdeps/powerpc/powerpc64/power6/memset.S
@@ -381,15 +381,3 @@ L(medium_28t):
blr
END_GEN_TB (MEMSET,TB_TOCLESS)
libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. */
-ENTRY_TOCLESS (__bzero)
- CALL_MCOUNT 3
- mr r5,r4
- li r4,0
- b L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/power7/memset.S b/sysdeps/powerpc/powerpc64/power7/memset.S
index 31aa0f91cf..358199a805 100644
--- a/sysdeps/powerpc/powerpc64/power7/memset.S
+++ b/sysdeps/powerpc/powerpc64/power7/memset.S
@@ -384,15 +384,3 @@ L(small):
END_GEN_TB (MEMSET,TB_TOCLESS)
libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. */
-ENTRY_TOCLESS (__bzero)
- CALL_MCOUNT 3
- mr r5,r4
- li r4,0
- b L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/power8/memset.S b/sysdeps/powerpc/powerpc64/power8/memset.S
index 9ecb6f3067..70cace14ef 100644
--- a/sysdeps/powerpc/powerpc64/power8/memset.S
+++ b/sysdeps/powerpc/powerpc64/power8/memset.S
@@ -504,15 +504,3 @@ L(LE7_tail5):
END_GEN_TB (MEMSET,TB_TOCLESS)
libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. */
-ENTRY_TOCLESS (__bzero)
- CALL_MCOUNT 3
- mr r5,r4
- li r4,0
- b L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
--
2.32.0
* [PATCH 10/12] s390: Remove bzero optimizations
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
` (8 preceding siblings ...)
2022-02-10 19:58 ` [PATCH 09/12] powerpc: Remove powerpc64 " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 11/12] i686: " Adhemerval Zanella
` (3 subsequent siblings)
13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call.
---
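A minimal C sketch of what the removed entry points did, for readers who do not read s390 assembly: each BZERO_* stub moved the length into the third argument slot, zeroed the fill byte, and jumped into the matching memset body. The names `bzero_via_memset` and `all_zero` are hypothetical helpers for illustration, not glibc symbols.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper, not a glibc symbol.  It mirrors the removed
   stubs: the length becomes memset's third argument and the fill
   byte is forced to 0.  */
static void
bzero_via_memset (void *s, size_t n)
{
  memset (s, 0, n);
}

/* Return 1 if every byte of BUF[0..N) is zero, 0 otherwise.  */
static int
all_zero (const unsigned char *buf, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (buf[i] != 0)
      return 0;
  return 1;
}
```

Calling `bzero_via_memset (buf, 8)` on a 16-byte buffer should clear the first 8 bytes and leave the rest untouched, which is the same observable behaviour as `bzero (buf, 8)`.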
sysdeps/s390/Makefile | 2 +-
sysdeps/s390/bzero.c | 47 ------------------------
sysdeps/s390/ifunc-memset.h | 9 -----
sysdeps/s390/memset-z900.S | 32 +---------------
sysdeps/s390/multiarch/ifunc-impl-list.c | 15 --------
5 files changed, 2 insertions(+), 103 deletions(-)
delete mode 100644 sysdeps/s390/bzero.c
diff --git a/sysdeps/s390/Makefile b/sysdeps/s390/Makefile
index ade8663218..5b6a96579c 100644
--- a/sysdeps/s390/Makefile
+++ b/sysdeps/s390/Makefile
@@ -66,7 +66,7 @@ endif
endif
ifeq ($(subdir),string)
-sysdep_routines += bzero memset memset-z900 \
+sysdep_routines += memset memset-z900 \
memcmp memcmp-z900 \
mempcpy memcpy memcpy-z900 \
memmove memmove-c \
diff --git a/sysdeps/s390/bzero.c b/sysdeps/s390/bzero.c
deleted file mode 100644
index 1f0a03e2ed..0000000000
--- a/sysdeps/s390/bzero.c
+++ /dev/null
@@ -1,47 +0,0 @@
-/* Multiple versions of bzero.
- Copyright (C) 2018-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <ifunc-memset.h>
-#if HAVE_MEMSET_IFUNC
-# include <string.h>
-# include <ifunc-resolve.h>
-
-# if HAVE_MEMSET_Z900_G5
-extern __typeof (__bzero) BZERO_Z900_G5 attribute_hidden;
-# endif
-
-# if HAVE_MEMSET_Z10
-extern __typeof (__bzero) BZERO_Z10 attribute_hidden;
-# endif
-
-# if HAVE_MEMSET_Z196
-extern __typeof (__bzero) BZERO_Z196 attribute_hidden;
-# endif
-
-s390_libc_ifunc_expr (__bzero, __bzero,
- ({
- s390_libc_ifunc_expr_stfle_init ();
- (HAVE_MEMSET_Z196 && S390_IS_Z196 (stfle_bits))
- ? BZERO_Z196
- : (HAVE_MEMSET_Z10 && S390_IS_Z10 (stfle_bits))
- ? BZERO_Z10
- : BZERO_DEFAULT;
- })
- )
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/s390/ifunc-memset.h b/sysdeps/s390/ifunc-memset.h
index db15df9bc1..7098332e92 100644
--- a/sysdeps/s390/ifunc-memset.h
+++ b/sysdeps/s390/ifunc-memset.h
@@ -25,19 +25,16 @@
#if defined HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT
# define MEMSET_DEFAULT MEMSET_Z196
-# define BZERO_DEFAULT BZERO_Z196
# define HAVE_MEMSET_Z900_G5 0
# define HAVE_MEMSET_Z10 0
# define HAVE_MEMSET_Z196 1
#elif defined HAVE_S390_MIN_Z10_ZARCH_ASM_SUPPORT
# define MEMSET_DEFAULT MEMSET_Z10
-# define BZERO_DEFAULT BZERO_Z10
# define HAVE_MEMSET_Z900_G5 0
# define HAVE_MEMSET_Z10 1
# define HAVE_MEMSET_Z196 HAVE_MEMSET_IFUNC
#else
# define MEMSET_DEFAULT MEMSET_Z900_G5
-# define BZERO_DEFAULT BZERO_Z900_G5
# define HAVE_MEMSET_Z900_G5 1
# define HAVE_MEMSET_Z10 HAVE_MEMSET_IFUNC
# define HAVE_MEMSET_Z196 HAVE_MEMSET_IFUNC
@@ -51,24 +48,18 @@
#if HAVE_MEMSET_Z900_G5
# define MEMSET_Z900_G5 __memset_default
-# define BZERO_Z900_G5 __bzero_default
#else
# define MEMSET_Z900_G5 NULL
-# define BZERO_Z900_G5 NULL
#endif
#if HAVE_MEMSET_Z10
# define MEMSET_Z10 __memset_z10
-# define BZERO_Z10 __bzero_z10
#else
# define MEMSET_Z10 NULL
-# define BZERO_Z10 NULL
#endif
#if HAVE_MEMSET_Z196
# define MEMSET_Z196 __memset_z196
-# define BZERO_Z196 __bzero_z196
#else
# define MEMSET_Z196 NULL
-# define BZERO_Z196 NULL
#endif
diff --git a/sysdeps/s390/memset-z900.S b/sysdeps/s390/memset-z900.S
index d454743f75..7adb466bb1 100644
--- a/sysdeps/s390/memset-z900.S
+++ b/sysdeps/s390/memset-z900.S
@@ -24,11 +24,7 @@
/* INPUT PARAMETERS - MEMSET
%r2 = address of memory area
%r3 = byte to fill memory with
- %r4 = number of bytes to fill.
-
- INPUT PARAMETERS - BZERO
- %r2 = address of memory area
- %r3 = number of bytes to fill. */
+ %r4 = number of bytes to fill. */
.text
@@ -47,12 +43,6 @@
# define BRCTG brct
# endif /* ! defined __s390x__ */
-ENTRY(BZERO_Z900_G5)
- LGR %r4,%r3
- xr %r3,%r3
- j .L_Z900_G5_start
-END(BZERO_Z900_G5)
-
ENTRY(MEMSET_Z900_G5)
.L_Z900_G5_start:
#if defined __s390x__
@@ -100,14 +90,6 @@ END(MEMSET_Z900_G5)
#endif /* HAVE_MEMSET_Z900_G5 */
#if HAVE_MEMSET_Z10
-ENTRY(BZERO_Z10)
- .machine "z10"
- .machinemode "zarch_nohighgprs"
- lgr %r4,%r3
- xr %r3,%r3
- j .L_Z10_start
-END(BZERO_Z10)
-
ENTRY(MEMSET_Z10)
.L_Z10_start:
.machine "z10"
@@ -141,14 +123,6 @@ END(MEMSET_Z10)
#endif /* HAVE_MEMSET_Z10 */
#if HAVE_MEMSET_Z196
-ENTRY(BZERO_Z196)
- .machine "z196"
- .machinemode "zarch_nohighgprs"
- lgr %r4,%r3
- xr %r3,%r3
- j .L_Z196_start
-END(BZERO_Z196)
-
ENTRY(MEMSET_Z196)
.L_Z196_start:
.machine "z196"
@@ -204,10 +178,6 @@ END(__memset_mvcle)
/* If we don't use ifunc, define an alias for memset here.
Otherwise see sysdeps/s390/memset.c. */
strong_alias (MEMSET_DEFAULT, memset)
-/* Same for bzero. If ifunc is used, see
- sysdeps/s390/bzero.c. */
-strong_alias (BZERO_DEFAULT, __bzero)
-weak_alias (__bzero, bzero)
#endif
#if defined SHARED && IS_IN (libc)
diff --git a/sysdeps/s390/multiarch/ifunc-impl-list.c b/sysdeps/s390/multiarch/ifunc-impl-list.c
index 29598c2a6e..c1902b2c26 100644
--- a/sysdeps/s390/multiarch/ifunc-impl-list.c
+++ b/sysdeps/s390/multiarch/ifunc-impl-list.c
@@ -102,21 +102,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
# endif
# if HAVE_MEMSET_Z900_G5
IFUNC_IMPL_ADD (array, i, memset, 1, MEMSET_Z900_G5)
-# endif
- )
-
- /* Note: bzero is implemented in memset. */
- IFUNC_IMPL (i, name, bzero,
-# if HAVE_MEMSET_Z196
- IFUNC_IMPL_ADD (array, i, bzero,
- S390_IS_Z196 (stfle_bits), BZERO_Z196)
-# endif
-# if HAVE_MEMSET_Z10
- IFUNC_IMPL_ADD (array, i, bzero,
- S390_IS_Z10 (stfle_bits), BZERO_Z10)
-# endif
-# if HAVE_MEMSET_Z900_G5
- IFUNC_IMPL_ADD (array, i, bzero, 1, BZERO_Z900_G5)
# endif
)
#endif /* HAVE_MEMSET_IFUNC */
--
2.32.0
* [PATCH 11/12] i686: Remove bzero optimizations
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
` (9 preceding siblings ...)
2022-02-10 19:58 ` [PATCH 10/12] s390: Remove " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 12/12] x86_64: " Adhemerval Zanella
` (2 subsequent siblings)
13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call.
---
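A rough C sketch of the fill-pattern setup that the USE_AS_BZERO branches skipped. The memset paths replicate the fill byte across a 32-bit register, either with a multiply (`imul $0x01010101`) or with shifts and ORs; for a zero byte the result is zero, which is why the bzero variants could use a plain `xor` instead. The function names below are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: replicate C into all four bytes by multiplying,
   as the i686 memset does with imul $0x01010101.  */
static uint32_t
broadcast_byte_mul (unsigned char c)
{
  return (uint32_t) c * UINT32_C (0x01010101);
}

/* Hypothetical helper: the same result built from shifts and ORs,
   similar in spirit to the movb/shl/or sequence in the SSE2 variants.  */
static uint32_t
broadcast_byte_shift (unsigned char c)
{
  uint32_t x = c;
  x |= x << 8;   /* low 16 bits now hold the byte twice */
  x |= x << 16;  /* all 32 bits now hold the byte four times */
  return x;
}
```

Both helpers should agree for every byte value, and both return 0 for a zero byte, so a memset call with a zero fill byte reaches the same store loop the bzero entry used.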
sysdeps/i386/bzero.S | 5 ---
sysdeps/i386/i586/bzero.S | 4 --
sysdeps/i386/i586/memset.S | 16 ++------
sysdeps/i386/i686/bzero.S | 4 --
sysdeps/i386/i686/memset.S | 23 +++---------
sysdeps/i386/i686/multiarch/Makefile | 6 +--
sysdeps/i386/i686/multiarch/bzero-ia32.S | 37 -------------------
sysdeps/i386/i686/multiarch/bzero-sse2-rep.S | 3 --
sysdeps/i386/i686/multiarch/bzero-sse2.S | 3 --
sysdeps/i386/i686/multiarch/bzero.c | 32 ----------------
sysdeps/i386/i686/multiarch/ifunc-impl-list.c | 8 ----
sysdeps/i386/i686/multiarch/memset-sse2-rep.S | 24 +++---------
sysdeps/i386/i686/multiarch/memset-sse2.S | 24 +++---------
sysdeps/i386/memset.S | 14 +------
14 files changed, 22 insertions(+), 181 deletions(-)
delete mode 100644 sysdeps/i386/bzero.S
delete mode 100644 sysdeps/i386/i586/bzero.S
delete mode 100644 sysdeps/i386/i686/bzero.S
delete mode 100644 sysdeps/i386/i686/multiarch/bzero-ia32.S
delete mode 100644 sysdeps/i386/i686/multiarch/bzero-sse2-rep.S
delete mode 100644 sysdeps/i386/i686/multiarch/bzero-sse2.S
delete mode 100644 sysdeps/i386/i686/multiarch/bzero.c
diff --git a/sysdeps/i386/bzero.S b/sysdeps/i386/bzero.S
deleted file mode 100644
index c8dd47b4da..0000000000
--- a/sysdeps/i386/bzero.S
+++ /dev/null
@@ -1,5 +0,0 @@
-#define USE_AS_BZERO
-#define memset __bzero
-#include "memset.S"
-
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/i386/i586/bzero.S b/sysdeps/i386/i586/bzero.S
deleted file mode 100644
index 2a106719a4..0000000000
--- a/sysdeps/i386/i586/bzero.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_BZERO
-#define memset __bzero
-#include <sysdeps/i386/i586/memset.S>
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/i386/i586/memset.S b/sysdeps/i386/i586/memset.S
index ae09c3b40a..672af41398 100644
--- a/sysdeps/i386/i586/memset.S
+++ b/sysdeps/i386/i586/memset.S
@@ -23,15 +23,11 @@
#define PARMS 4+4 /* space for 1 saved reg */
#define RTN PARMS
#define DEST RTN
-#ifdef USE_AS_BZERO
-# define LEN DEST+4
-#else
-# define CHR DEST+4
-# define LEN CHR+4
-#endif
+#define CHR DEST+4
+#define LEN CHR+4
.text
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
ENTRY (__memset_chk)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -46,15 +42,11 @@ ENTRY (memset)
movl DEST(%esp), %edi
cfi_rel_offset (edi, 0)
movl LEN(%esp), %edx
-#ifdef USE_AS_BZERO
- xorl %eax, %eax /* we fill with 0 */
-#else
movb CHR(%esp), %al
movb %al, %ah
movl %eax, %ecx
shll $16, %eax
movw %cx, %ax
-#endif
cld
/* If less than 36 bytes to write, skip tricky code (it wouldn't work). */
@@ -100,10 +92,8 @@ L(2): shrl $2, %ecx /* convert byte count to longword count */
rep
stosb
-#ifndef USE_AS_BZERO
/* Load result (only if used as memset). */
movl DEST(%esp), %eax /* start address of destination is result */
-#endif
popl %edi
cfi_adjust_cfa_offset (-4)
cfi_restore (edi)
diff --git a/sysdeps/i386/i686/bzero.S b/sysdeps/i386/i686/bzero.S
deleted file mode 100644
index c7898f18e0..0000000000
--- a/sysdeps/i386/i686/bzero.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_BZERO
-#define memset __bzero
-#include <sysdeps/i386/i686/memset.S>
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/i386/i686/memset.S b/sysdeps/i386/i686/memset.S
index fd5b26aeae..3cb86c016d 100644
--- a/sysdeps/i386/i686/memset.S
+++ b/sysdeps/i386/i686/memset.S
@@ -21,18 +21,13 @@
#include "asm-syntax.h"
#define PARMS 4+4 /* space for 1 saved reg */
-#ifdef USE_AS_BZERO
-# define DEST PARMS
-# define LEN DEST+4
-#else
-# define RTN PARMS
-# define DEST RTN
-# define CHR DEST+4
-# define LEN CHR+4
-#endif
+#define RTN PARMS
+#define DEST RTN
+#define CHR DEST+4
+#define LEN CHR+4
.text
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
ENTRY_CHK (__memset_chk)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -46,11 +41,7 @@ ENTRY (memset)
cfi_adjust_cfa_offset (4)
movl DEST(%esp), %edx
movl LEN(%esp), %ecx
-#ifdef USE_AS_BZERO
- xorl %eax, %eax /* fill with 0 */
-#else
movzbl CHR(%esp), %eax
-#endif
jecxz 1f
movl %edx, %edi
cfi_rel_offset (edi, 0)
@@ -70,9 +61,7 @@ ENTRY (memset)
2: movl %ecx, %edx
shrl $2, %ecx
andl $3, %edx
-#ifndef USE_AS_BZERO
imul $0x01010101, %eax
-#endif
rep
stosl
movl %edx, %ecx
@@ -80,9 +69,7 @@ ENTRY (memset)
stosb
1:
-#ifndef USE_AS_BZERO
movl DEST(%esp), %eax /* start address of destination is result */
-#endif
popl %edi
cfi_adjust_cfa_offset (-4)
cfi_restore (edi)
diff --git a/sysdeps/i386/i686/multiarch/Makefile b/sysdeps/i386/i686/multiarch/Makefile
index 02fa02658e..9fe5ea8639 100644
--- a/sysdeps/i386/i686/multiarch/Makefile
+++ b/sysdeps/i386/i686/multiarch/Makefile
@@ -1,9 +1,9 @@
ifeq ($(subdir),string)
gen-as-const-headers += locale-defines.sym
-sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
+sysdep_routines += memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
memmove-ssse3 memcpy-ssse3-rep mempcpy-ssse3-rep \
memmove-ssse3-rep \
- memset-sse2-rep bzero-sse2-rep strcmp-ssse3 \
+ memset-sse2-rep strcmp-ssse3 \
strcmp-sse4 strncmp-c strncmp-ssse3 strncmp-sse4 \
memcmp-ssse3 memcmp-sse4 varshift \
strlen-sse2 strlen-sse2-bsf strncpy-c strcpy-ssse3 \
@@ -21,7 +21,7 @@ sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
memcpy-sse2-unaligned \
mempcpy-sse2-unaligned memmove-sse2-unaligned \
strcspn-c strpbrk-c strspn-c \
- bzero-ia32 rawmemchr-ia32 \
+ rawmemchr-ia32 \
memchr-ia32 memcmp-ia32 memcpy-ia32 memmove-ia32 \
mempcpy-ia32 memset-ia32 strcat-ia32 strchr-ia32 \
strrchr-ia32 strcpy-ia32 strcmp-ia32 strcspn-ia32 \
diff --git a/sysdeps/i386/i686/multiarch/bzero-ia32.S b/sysdeps/i386/i686/multiarch/bzero-ia32.S
deleted file mode 100644
index 96afe9bad1..0000000000
--- a/sysdeps/i386/i686/multiarch/bzero-ia32.S
+++ /dev/null
@@ -1,37 +0,0 @@
-/* bzero optimized for i686.
- Copyright (C) 2017-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <sysdep.h>
-
-#if IS_IN (libc)
-# define __bzero __bzero_ia32
-
-# ifdef SHARED
-# undef libc_hidden_builtin_def
-/* IFUNC doesn't work with the hidden functions in shared library since
- they will be called without setting up EBX needed for PLT which is
- used by IFUNC. */
-# define libc_hidden_builtin_def(name) \
- .globl __GI___bzero; __GI___bzero = __bzero
-# endif
-
-# undef weak_alias
-# define weak_alias(original, alias)
-
-# include <sysdeps/i386/i686/bzero.S>
-#endif
diff --git a/sysdeps/i386/i686/multiarch/bzero-sse2-rep.S b/sysdeps/i386/i686/multiarch/bzero-sse2-rep.S
deleted file mode 100644
index 507b288bb3..0000000000
--- a/sysdeps/i386/i686/multiarch/bzero-sse2-rep.S
+++ /dev/null
@@ -1,3 +0,0 @@
-#define USE_AS_BZERO
-#define __memset_sse2_rep __bzero_sse2_rep
-#include "memset-sse2-rep.S"
diff --git a/sysdeps/i386/i686/multiarch/bzero-sse2.S b/sysdeps/i386/i686/multiarch/bzero-sse2.S
deleted file mode 100644
index 8d04512e4e..0000000000
--- a/sysdeps/i386/i686/multiarch/bzero-sse2.S
+++ /dev/null
@@ -1,3 +0,0 @@
-#define USE_AS_BZERO
-#define __memset_sse2 __bzero_sse2
-#include "memset-sse2.S"
diff --git a/sysdeps/i386/i686/multiarch/bzero.c b/sysdeps/i386/i686/multiarch/bzero.c
deleted file mode 100644
index 7fd0ddd576..0000000000
--- a/sysdeps/i386/i686/multiarch/bzero.c
+++ /dev/null
@@ -1,32 +0,0 @@
-/* Multiple versions of bzero.
- All versions must be listed in ifunc-impl-list.c.
- Copyright (C) 2017-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Define multiple versions only for the definition in libc. */
-#if IS_IN (libc)
-# define bzero __redirect_bzero
-# include <string.h>
-# undef bzero
-
-# define SYMBOL_NAME bzero
-# include "ifunc-memset.h"
-
-libc_ifunc_redirected (__redirect_bzero, __bzero, IFUNC_SELECTOR ());
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
index 5c7a42dc97..c014f52bf9 100644
--- a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
+++ b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
@@ -36,14 +36,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
size_t i = 0;
- /* Support sysdeps/i386/i686/multiarch/bzero.S. */
- IFUNC_IMPL (i, name, bzero,
- IFUNC_IMPL_ADD (array, i, bzero, CPU_FEATURE_USABLE (SSE2),
- __bzero_sse2_rep)
- IFUNC_IMPL_ADD (array, i, bzero, CPU_FEATURE_USABLE (SSE2),
- __bzero_sse2)
- IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ia32))
-
/* Support sysdeps/i386/i686/multiarch/memchr.S. */
IFUNC_IMPL (i, name, memchr,
IFUNC_IMPL_ADD (array, i, memchr, CPU_FEATURE_USABLE (SSE2),
diff --git a/sysdeps/i386/i686/multiarch/memset-sse2-rep.S b/sysdeps/i386/i686/multiarch/memset-sse2-rep.S
index 37a10575e7..28df7836e0 100644
--- a/sysdeps/i386/i686/multiarch/memset-sse2-rep.S
+++ b/sysdeps/i386/i686/multiarch/memset-sse2-rep.S
@@ -32,16 +32,10 @@
#define PUSH(REG) pushl REG; CFI_PUSH (REG)
#define POP(REG) popl REG; CFI_POP (REG)
-#ifdef USE_AS_BZERO
-# define DEST PARMS
-# define LEN DEST+4
-# define SETRTNVAL
-#else
-# define DEST PARMS
-# define CHR DEST+4
-# define LEN CHR+4
-# define SETRTNVAL movl DEST(%esp), %eax
-#endif
+#define DEST PARMS
+#define CHR DEST+4
+#define LEN CHR+4
+#define SETRTNVAL movl DEST(%esp), %eax
#ifdef PIC
# define ENTRANCE PUSH (%ebx);
@@ -78,7 +72,7 @@
#endif
.section .text.sse2,"ax",@progbits
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
ENTRY (__memset_chk_sse2_rep)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -89,16 +83,12 @@ ENTRY (__memset_sse2_rep)
ENTRANCE
movl LEN(%esp), %ecx
-#ifdef USE_AS_BZERO
- xor %eax, %eax
-#else
movzbl CHR(%esp), %eax
movb %al, %ah
/* Fill the whole EAX with pattern. */
movl %eax, %edx
shl $16, %eax
or %edx, %eax
-#endif
movl DEST(%esp), %edx
cmp $32, %ecx
jae L(32bytesormore)
@@ -228,12 +218,8 @@ L(write_3bytes):
/* ECX > 32 and EDX is 4 byte aligned. */
L(32bytesormore):
/* Fill xmm0 with the pattern. */
-#ifdef USE_AS_BZERO
- pxor %xmm0, %xmm0
-#else
movd %eax, %xmm0
pshufd $0, %xmm0, %xmm0
-#endif
testl $0xf, %edx
jz L(aligned_16)
/* ECX > 32 and EDX is not 16 byte aligned. */
diff --git a/sysdeps/i386/i686/multiarch/memset-sse2.S b/sysdeps/i386/i686/multiarch/memset-sse2.S
index 455519c7ac..4e8414fd51 100644
--- a/sysdeps/i386/i686/multiarch/memset-sse2.S
+++ b/sysdeps/i386/i686/multiarch/memset-sse2.S
@@ -32,16 +32,10 @@
#define PUSH(REG) pushl REG; CFI_PUSH (REG)
#define POP(REG) popl REG; CFI_POP (REG)
-#ifdef USE_AS_BZERO
-# define DEST PARMS
-# define LEN DEST+4
-# define SETRTNVAL
-#else
-# define DEST PARMS
-# define CHR DEST+4
-# define LEN CHR+4
-# define SETRTNVAL movl DEST(%esp), %eax
-#endif
+#define DEST PARMS
+#define CHR DEST+4
+#define LEN CHR+4
+#define SETRTNVAL movl DEST(%esp), %eax
#ifdef PIC
# define ENTRANCE PUSH (%ebx);
@@ -78,7 +72,7 @@
#endif
.section .text.sse2,"ax",@progbits
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
ENTRY (__memset_chk_sse2)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -89,16 +83,12 @@ ENTRY (__memset_sse2)
ENTRANCE
movl LEN(%esp), %ecx
-#ifdef USE_AS_BZERO
- xor %eax, %eax
-#else
movzbl CHR(%esp), %eax
movb %al, %ah
/* Fill the whole EAX with pattern. */
movl %eax, %edx
shl $16, %eax
or %edx, %eax
-#endif
movl DEST(%esp), %edx
cmp $32, %ecx
jae L(32bytesormore)
@@ -228,12 +218,8 @@ L(write_3bytes):
/* ECX > 32 and EDX is 4 byte aligned. */
L(32bytesormore):
/* Fill xmm0 with the pattern. */
-#ifdef USE_AS_BZERO
- pxor %xmm0, %xmm0
-#else
movd %eax, %xmm0
pshufd $0, %xmm0, %xmm0
-#endif
testl $0xf, %edx
jz L(aligned_16)
/* ECX > 32 and EDX is not 16 byte aligned. */
diff --git a/sysdeps/i386/memset.S b/sysdeps/i386/memset.S
index f470511b64..db2753eb2f 100644
--- a/sysdeps/i386/memset.S
+++ b/sysdeps/i386/memset.S
@@ -30,15 +30,11 @@
#define POP(REG) popl REG; CFI_POP (REG)
#define STR1 8
-#ifdef USE_AS_BZERO
-#define N STR1+4
-#else
#define STR2 STR1+4
#define N STR2+4
-#endif
.text
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
ENTRY (__memset_chk)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -49,20 +45,12 @@ ENTRY (memset)
PUSH (%edi)
movl N(%esp), %ecx
movl STR1(%esp), %edi
-#ifdef USE_AS_BZERO
- xor %eax, %eax
-#else
movzbl STR2(%esp), %eax
mov %edi, %edx
-#endif
rep stosb
-#ifndef USE_AS_BZERO
mov %edx, %eax
-#endif
POP (%edi)
ret
END (memset)
-#ifndef USE_AS_BZERO
libc_hidden_builtin_def (memset)
-#endif
--
2.32.0
* [PATCH 12/12] x86_64: Remove bzero optimizations
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
` (10 preceding siblings ...)
2022-02-10 19:58 ` [PATCH 11/12] i686: " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
2022-02-14 14:29 ` [PATCH 00/12] Remove bcopy and " Florian Weimer
2022-02-21 16:39 ` Szabolcs Nagy
13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call.
---
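For readers skimming the deleted sysdeps/x86_64/multiarch/bzero.c, its selection order can be paraphrased in plain C roughly as follows. This is only a sketch: `struct cpu_flags`, `enum variant`, and `pick_variant` are hypothetical stand-ins for glibc's cpu_features machinery and the OPTIMIZE1 symbols, and the flags are assumed to be precomputed booleans.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the CPU feature bits the deleted
   selector consulted.  */
struct cpu_flags
{
  bool avx512f, avx512vl, avx512bw, bmi2, avx2, rtm, erms;
  bool prefer_no_avx512, prefer_no_vzeroupper;
};

/* Hypothetical tags standing in for the OPTIMIZE1 (...) symbols.  */
enum variant
{
  SSE2_UNALIGNED, SSE2_UNALIGNED_ERMS,
  AVX2_UNALIGNED, AVX2_UNALIGNED_ERMS,
  AVX2_UNALIGNED_RTM, AVX2_UNALIGNED_ERMS_RTM,
  EVEX_UNALIGNED, EVEX_UNALIGNED_ERMS,
  AVX512_UNALIGNED, AVX512_UNALIGNED_ERMS
};

/* Same priority order as the deleted IFUNC_SELECTOR: AVX512 first,
   then EVEX, then AVX2 (RTM or plain), and SSE2 as the fallback.  */
static enum variant
pick_variant (const struct cpu_flags *f)
{
  bool evex_ok = f->avx512vl && f->avx512bw && f->bmi2;

  if (f->avx512f && !f->prefer_no_avx512 && evex_ok)
    return f->erms ? AVX512_UNALIGNED_ERMS : AVX512_UNALIGNED;

  if (f->avx2)
    {
      if (evex_ok)
        return f->erms ? EVEX_UNALIGNED_ERMS : EVEX_UNALIGNED;
      if (f->rtm)
        return f->erms ? AVX2_UNALIGNED_ERMS_RTM : AVX2_UNALIGNED_RTM;
      if (!f->prefer_no_vzeroupper)
        return f->erms ? AVX2_UNALIGNED_ERMS : AVX2_UNALIGNED;
    }

  return f->erms ? SSE2_UNALIGNED_ERMS : SSE2_UNALIGNED;
}
```

With every flag cleared the sketch falls through to the SSE2 variant; setting `avx2` together with `prefer_no_vzeroupper` (and no RTM or EVEX support) also ends at SSE2, matching the fall-through in the deleted code.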
sysdeps/x86_64/bzero.S | 1 -
sysdeps/x86_64/memset.S | 10 +-
sysdeps/x86_64/multiarch/Makefile | 1 -
sysdeps/x86_64/multiarch/bzero.c | 106 ------------------
sysdeps/x86_64/multiarch/ifunc-impl-list.c | 42 -------
.../memset-avx2-unaligned-erms-rtm.S | 1 -
.../multiarch/memset-avx2-unaligned-erms.S | 6 -
.../multiarch/memset-avx512-unaligned-erms.S | 3 -
.../multiarch/memset-evex-unaligned-erms.S | 3 -
.../multiarch/memset-sse2-unaligned-erms.S | 5 -
.../multiarch/memset-vec-unaligned-erms.S | 56 +--------
11 files changed, 2 insertions(+), 232 deletions(-)
delete mode 100644 sysdeps/x86_64/bzero.S
delete mode 100644 sysdeps/x86_64/multiarch/bzero.c
diff --git a/sysdeps/x86_64/bzero.S b/sysdeps/x86_64/bzero.S
deleted file mode 100644
index f96d567fd8..0000000000
--- a/sysdeps/x86_64/bzero.S
+++ /dev/null
@@ -1 +0,0 @@
-/* Implemented in memset.S. */
diff --git a/sysdeps/x86_64/memset.S b/sysdeps/x86_64/memset.S
index af26e9cedc..a6eea61a4d 100644
--- a/sysdeps/x86_64/memset.S
+++ b/sysdeps/x86_64/memset.S
@@ -1,4 +1,4 @@
-/* memset/bzero -- set memory area to CH/0
+/* memset -- set memory area to CH/0
Optimized version for x86-64.
Copyright (C) 2002-2022 Free Software Foundation, Inc.
This file is part of the GNU C Library.
@@ -35,9 +35,6 @@
punpcklwd %xmm0, %xmm0; \
pshufd $0, %xmm0, %xmm0
-# define BZERO_ZERO_VEC0() \
- pxor %xmm0, %xmm0
-
# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
movd d, %xmm0; \
pshufd $0, %xmm0, %xmm0; \
@@ -56,10 +53,6 @@
# define MEMSET_SYMBOL(p,s) memset
#endif
-#ifndef BZERO_SYMBOL
-# define BZERO_SYMBOL(p,s) __bzero
-#endif
-
#ifndef WMEMSET_SYMBOL
# define WMEMSET_CHK_SYMBOL(p,s) p
# define WMEMSET_SYMBOL(p,s) __wmemset
@@ -70,7 +63,6 @@
libc_hidden_builtin_def (memset)
#if IS_IN (libc)
-weak_alias (__bzero, bzero)
libc_hidden_def (__wmemset)
weak_alias (__wmemset, wmemset)
libc_hidden_weak (wmemset)
diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
index e7b413edad..4274bfdd0d 100644
--- a/sysdeps/x86_64/multiarch/Makefile
+++ b/sysdeps/x86_64/multiarch/Makefile
@@ -1,7 +1,6 @@
ifeq ($(subdir),string)
sysdep_routines += \
- bzero \
memchr-avx2 \
memchr-avx2-rtm \
memchr-evex \
diff --git a/sysdeps/x86_64/multiarch/bzero.c b/sysdeps/x86_64/multiarch/bzero.c
deleted file mode 100644
index 58a14b2c33..0000000000
--- a/sysdeps/x86_64/multiarch/bzero.c
+++ /dev/null
@@ -1,106 +0,0 @@
-/* Multiple versions of bzero.
- All versions must be listed in ifunc-impl-list.c.
- Copyright (C) 2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Define multiple versions only for the definition in libc. */
-#if IS_IN (libc)
-# define __bzero __redirect___bzero
-# include <string.h>
-# undef __bzero
-
-# define SYMBOL_NAME __bzero
-# include <init-arch.h>
-
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (sse2_unaligned)
- attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (sse2_unaligned_erms)
- attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx2_unaligned) attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx2_unaligned_erms)
- attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx2_unaligned_rtm)
- attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx2_unaligned_erms_rtm)
- attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (evex_unaligned)
- attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (evex_unaligned_erms)
- attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx512_unaligned)
- attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx512_unaligned_erms)
- attribute_hidden;
-
-static inline void *
-IFUNC_SELECTOR (void)
-{
- const struct cpu_features* cpu_features = __get_cpu_features ();
-
- if (CPU_FEATURE_USABLE_P (cpu_features, AVX512F)
- && !CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_AVX512))
- {
- if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
- && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
- && CPU_FEATURE_USABLE_P (cpu_features, BMI2))
- {
- if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
- return OPTIMIZE1 (avx512_unaligned_erms);
-
- return OPTIMIZE1 (avx512_unaligned);
- }
- }
-
- if (CPU_FEATURE_USABLE_P (cpu_features, AVX2))
- {
- if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
- && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
- && CPU_FEATURE_USABLE_P (cpu_features, BMI2))
- {
- if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
- return OPTIMIZE1 (evex_unaligned_erms);
-
- return OPTIMIZE1 (evex_unaligned);
- }
-
- if (CPU_FEATURE_USABLE_P (cpu_features, RTM))
- {
- if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
- return OPTIMIZE1 (avx2_unaligned_erms_rtm);
-
- return OPTIMIZE1 (avx2_unaligned_rtm);
- }
-
- if (!CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_VZEROUPPER))
- {
- if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
- return OPTIMIZE1 (avx2_unaligned_erms);
-
- return OPTIMIZE1 (avx2_unaligned);
- }
- }
-
- if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
- return OPTIMIZE1 (sse2_unaligned_erms);
-
- return OPTIMIZE1 (sse2_unaligned);
-}
-
-libc_ifunc_redirected (__redirect___bzero, __bzero, IFUNC_SELECTOR ());
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
index a594f4176e..68a56797d4 100644
--- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
@@ -300,48 +300,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
__memset_avx512_no_vzeroupper)
)
- /* Support sysdeps/x86_64/multiarch/bzero.c. */
- IFUNC_IMPL (i, name, bzero,
- IFUNC_IMPL_ADD (array, i, bzero, 1,
- __bzero_sse2_unaligned)
- IFUNC_IMPL_ADD (array, i, bzero, 1,
- __bzero_sse2_unaligned_erms)
- IFUNC_IMPL_ADD (array, i, bzero,
- CPU_FEATURE_USABLE (AVX2),
- __bzero_avx2_unaligned)
- IFUNC_IMPL_ADD (array, i, bzero,
- CPU_FEATURE_USABLE (AVX2),
- __bzero_avx2_unaligned_erms)
- IFUNC_IMPL_ADD (array, i, bzero,
- (CPU_FEATURE_USABLE (AVX2)
- && CPU_FEATURE_USABLE (RTM)),
- __bzero_avx2_unaligned_rtm)
- IFUNC_IMPL_ADD (array, i, bzero,
- (CPU_FEATURE_USABLE (AVX2)
- && CPU_FEATURE_USABLE (RTM)),
- __bzero_avx2_unaligned_erms_rtm)
- IFUNC_IMPL_ADD (array, i, bzero,
- (CPU_FEATURE_USABLE (AVX512VL)
- && CPU_FEATURE_USABLE (AVX512BW)
- && CPU_FEATURE_USABLE (BMI2)),
- __bzero_evex_unaligned)
- IFUNC_IMPL_ADD (array, i, bzero,
- (CPU_FEATURE_USABLE (AVX512VL)
- && CPU_FEATURE_USABLE (AVX512BW)
- && CPU_FEATURE_USABLE (BMI2)),
- __bzero_evex_unaligned_erms)
- IFUNC_IMPL_ADD (array, i, bzero,
- (CPU_FEATURE_USABLE (AVX512VL)
- && CPU_FEATURE_USABLE (AVX512BW)
- && CPU_FEATURE_USABLE (BMI2)),
- __bzero_avx512_unaligned_erms)
- IFUNC_IMPL_ADD (array, i, bzero,
- (CPU_FEATURE_USABLE (AVX512VL)
- && CPU_FEATURE_USABLE (AVX512BW)
- && CPU_FEATURE_USABLE (BMI2)),
- __bzero_avx512_unaligned)
- )
-
/* Support sysdeps/x86_64/multiarch/rawmemchr.c. */
IFUNC_IMPL (i, name, rawmemchr,
IFUNC_IMPL_ADD (array, i, rawmemchr,
diff --git a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms-rtm.S b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms-rtm.S
index 5a5ee6f672..8ac3e479bb 100644
--- a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms-rtm.S
+++ b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms-rtm.S
@@ -5,7 +5,6 @@
#define SECTION(p) p##.avx.rtm
#define MEMSET_SYMBOL(p,s) p##_avx2_##s##_rtm
-#define BZERO_SYMBOL(p,s) p##_avx2_##s##_rtm
#define WMEMSET_SYMBOL(p,s) p##_avx2_##s##_rtm
#include "memset-avx2-unaligned-erms.S"
diff --git a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
index a093a2831f..c0bf2875d0 100644
--- a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
@@ -14,9 +14,6 @@
vmovd d, %xmm0; \
movq r, %rax;
-# define BZERO_ZERO_VEC0() \
- vpxor %xmm0, %xmm0, %xmm0
-
# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
MEMSET_SET_VEC0_AND_SET_RETURN(d, r)
@@ -32,9 +29,6 @@
# ifndef MEMSET_SYMBOL
# define MEMSET_SYMBOL(p,s) p##_avx2_##s
# endif
-# ifndef BZERO_SYMBOL
-# define BZERO_SYMBOL(p,s) p##_avx2_##s
-# endif
# ifndef WMEMSET_SYMBOL
# define WMEMSET_SYMBOL(p,s) p##_avx2_##s
# endif
diff --git a/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
index 727c92133a..5241216a77 100644
--- a/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
@@ -19,9 +19,6 @@
vpbroadcastb d, %VEC0; \
movq r, %rax
-# define BZERO_ZERO_VEC0() \
- vpxorq %XMM0, %XMM0, %XMM0
-
# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
vpbroadcastd d, %VEC0; \
movq r, %rax
diff --git a/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
index 5d8fa78f05..6370021506 100644
--- a/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
@@ -19,9 +19,6 @@
vpbroadcastb d, %VEC0; \
movq r, %rax
-# define BZERO_ZERO_VEC0() \
- vpxorq %XMM0, %XMM0, %XMM0
-
# define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
vpbroadcastd d, %VEC0; \
movq r, %rax
diff --git a/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
index 329c58ee46..684cc248d7 100644
--- a/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
@@ -22,7 +22,6 @@
#if IS_IN (libc)
# define MEMSET_SYMBOL(p,s) p##_sse2_##s
-# define BZERO_SYMBOL(p,s) MEMSET_SYMBOL (p, s)
# define WMEMSET_SYMBOL(p,s) p##_sse2_##s
# ifdef SHARED
@@ -30,10 +29,6 @@
# define libc_hidden_builtin_def(name)
# endif
-# undef weak_alias
-# define weak_alias(original, alias) \
- .weak bzero; bzero = __bzero
-
# undef strong_alias
# define strong_alias(ignored1, ignored2)
#endif
diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
index 7c94fcdae1..a018077df0 100644
--- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
@@ -1,4 +1,4 @@
-/* memset/bzero with unaligned store and rep stosb
+/* memset with unaligned store and rep stosb
Copyright (C) 2016-2022 Free Software Foundation, Inc.
This file is part of the GNU C Library.
@@ -26,10 +26,6 @@
#include <sysdep.h>
-#ifndef BZERO_SYMBOL
-# define BZERO_SYMBOL(p,s) MEMSET_SYMBOL (p, s)
-#endif
-
#ifndef MEMSET_CHK_SYMBOL
# define MEMSET_CHK_SYMBOL(p,s) MEMSET_SYMBOL(p, s)
#endif
@@ -133,31 +129,6 @@ ENTRY (WMEMSET_SYMBOL (__wmemset, unaligned))
END (WMEMSET_SYMBOL (__wmemset, unaligned))
#endif
-ENTRY (BZERO_SYMBOL(__bzero, unaligned))
-#if VEC_SIZE > 16
- BZERO_ZERO_VEC0 ()
-#endif
- mov %RDI_LP, %RAX_LP
- mov %RSI_LP, %RDX_LP
-#ifndef USE_LESS_VEC_MASK_STORE
- xorl %esi, %esi
-#endif
- cmp $VEC_SIZE, %RDX_LP
- jb L(less_vec_no_vdup)
-#ifdef USE_LESS_VEC_MASK_STORE
- xorl %esi, %esi
-#endif
-#if VEC_SIZE <= 16
- BZERO_ZERO_VEC0 ()
-#endif
- cmp $(VEC_SIZE * 2), %RDX_LP
- ja L(more_2x_vec)
- /* From VEC and to 2 * VEC. No branch when size == VEC_SIZE. */
- VMOVU %VEC(0), (%rdi)
- VMOVU %VEC(0), (VEC_SIZE * -1)(%rdi, %rdx)
- VZEROUPPER_RETURN
-END (BZERO_SYMBOL(__bzero, unaligned))
-
#if defined SHARED && IS_IN (libc)
ENTRY_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned))
cmp %RDX_LP, %RCX_LP
@@ -215,31 +186,6 @@ END (__memset_erms)
END (MEMSET_SYMBOL (__memset, erms))
# endif
-ENTRY_P2ALIGN (BZERO_SYMBOL(__bzero, unaligned_erms), 6)
-# if VEC_SIZE > 16
- BZERO_ZERO_VEC0 ()
-# endif
- mov %RDI_LP, %RAX_LP
- mov %RSI_LP, %RDX_LP
-# ifndef USE_LESS_VEC_MASK_STORE
- xorl %esi, %esi
-# endif
- cmp $VEC_SIZE, %RDX_LP
- jb L(less_vec_no_vdup)
-# ifdef USE_LESS_VEC_MASK_STORE
- xorl %esi, %esi
-# endif
-# if VEC_SIZE <= 16
- BZERO_ZERO_VEC0 ()
-# endif
- cmp $(VEC_SIZE * 2), %RDX_LP
- ja L(stosb_more_2x_vec)
- /* From VEC and to 2 * VEC. No branch when size == VEC_SIZE. */
- VMOVU %VEC(0), (%rdi)
- VMOVU %VEC(0), (VEC_SIZE * -1)(%rdi, %rdx)
- VZEROUPPER_RETURN
-END (BZERO_SYMBOL(__bzero, unaligned_erms))
-
# if defined SHARED && IS_IN (libc)
ENTRY_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned_erms))
cmp %RDX_LP, %RCX_LP
--
2.32.0
* Re: [PATCH 00/12] Remove bcopy and bzero optimizations
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
` (11 preceding siblings ...)
2022-02-10 19:58 ` [PATCH 12/12] x86_64: " Adhemerval Zanella
@ 2022-02-14 14:29 ` Florian Weimer
2022-02-14 14:41 ` Adhemerval Zanella
2022-02-21 16:39 ` Szabolcs Nagy
13 siblings, 1 reply; 17+ messages in thread
From: Florian Weimer @ 2022-02-14 14:29 UTC (permalink / raw)
To: Adhemerval Zanella via Libc-alpha
Cc: Wilco Dijkstra, H . J . Lu, Noah Goldstein, Adhemerval Zanella
* Adhemerval Zanella via Libc-alpha:
> On a recent Linux distro (Ubuntu 21.04), I see only 1 'bcmp' call
> (which is already aliased to memcmp):
Clang turns memcmp for equality (so basically __memcmpeq) into bcmp.
Thanks,
Florian
* Re: [PATCH 00/12] Remove bcopy and bzero optimizations
2022-02-14 14:29 ` [PATCH 00/12] Remove bcopy and " Florian Weimer
@ 2022-02-14 14:41 ` Adhemerval Zanella
0 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-14 14:41 UTC (permalink / raw)
To: Florian Weimer, Adhemerval Zanella via Libc-alpha
Cc: Wilco Dijkstra, H . J . Lu, Noah Goldstein
On 14/02/2022 11:29, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
>
>> On a recent Linux distro (Ubuntu 21.04), I see only 1 'bcmp' call
>> (which is already aliased to memcmp):
>
> Clang turns memcmp for equality (so basically __memcmpeq) into bcmp.
bcmp has the advantage that we can simply alias it to memcmp, so there
is no need to provide separate ifunc machinery for it.
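
As a rough illustration of the aliasing idea (not glibc's actual macros or
symbols), the sketch below uses the GCC/Clang `alias` attribute on an ELF
target. The names my_memcmp, my_bcmp, buffers_equal and check_alias are
hypothetical and exist only for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for whichever memcmp implementation was selected. */
int my_memcmp(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n);
}

/* my_bcmp is only a second symbol name for my_memcmp's code, so it needs
   no dispatch logic of its own.  Assumes GCC or Clang on an ELF target.  */
int my_bcmp(const void *a, const void *b, size_t n)
    __attribute__((alias("my_memcmp")));

/* An equality-only caller: it needs just zero/non-zero, which is all the
   bcmp contract promises and which any memcmp also satisfies.  */
int buffers_equal(const void *a, const void *b, size_t n)
{
    return my_bcmp(a, b, n) == 0;
}

/* Small self-check of the sketch.  */
void check_alias(void)
{
    assert(buffers_equal("abc", "abc", 3));
    assert(!buffers_equal("abc", "abd", 3));
}
```

Because the alias resolves to the same code, any improvement to the memcmp
side is picked up by the bcmp name without further work.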
* Re: [PATCH 00/12] Remove bcopy and bzero optimizations
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
` (12 preceding siblings ...)
2022-02-14 14:29 ` [PATCH 00/12] Remove bcopy and " Florian Weimer
@ 2022-02-21 16:39 ` Szabolcs Nagy
2022-02-22 15:40 ` Adhemerval Zanella
13 siblings, 1 reply; 17+ messages in thread
From: Szabolcs Nagy @ 2022-02-21 16:39 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
The 02/10/2022 16:58, Adhemerval Zanella via Libc-alpha wrote:
> Both symbols are marked as legacy in POSIX.1-2001 and removed in
> POSIX.1-2008, although the prototypes are defined for _GNU_SOURCE
> or _DEFAULT_SOURCE.
>
> Most architectures just route bcopy/bzero to the internal memmove/memset
> implementation; however, some do implement iFUNC variants when memset
> or memmove are also provided through iFUNC.
>
> However, gcc already replaces bcopy with memmove and bzero with memset
> in the default configuration (to actually get a bstring libc call, the
> code must omit the string.h inclusion and be built with -fno-builtin),
> so it is highly unlikely programs are actually calling the libc bcopy
> or bzero symbols.
...
> So there is no point in keeping such optimizations.
>
> Adhemerval Zanella (12):
> ia64: Remove bcopy
> powerpc: Remove bcopy optimizations
> i386: Remove bcopy optimizations
> x86_64: Remove bcopy optimizations
> alpha: Remove bzero optimization
> ia64: Remove bzero optimization
> Remove bzero optimization
> powerpc: Remove powerpc32 bzero optimizations
> powerpc: Remove powerpc64 bzero optimizations
> s390: Remove bzero optimizations
> i686: Remove bzero optimizations
> x86_64: Remove bzero optimizations
I see this does not affect aarch64, but I agree with the principle.
(There was a comment about the x86 bzero code: if __memsetzero is
accepted, then it is easier to rename the bzero optimization than to
remove it and re-add it.)
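
For reference, the routing described in the quoted cover letter can be
sketched in a few lines of C. This is only an illustration under assumed
names (generic_bcopy, generic_bzero and check_generic_wrappers are
hypothetical), not the glibc source:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* bcopy takes (src, dst, n) while memmove takes (dst, src, n), so the
   wrapper only swaps the first two arguments.  */
void generic_bcopy(const void *src, void *dst, size_t n)
{
    memmove(dst, src, n);
}

/* bzero (s, n) is memset (s, 0, n) with the return value dropped.  */
void generic_bzero(void *s, size_t n)
{
    memset(s, 0, n);
}

/* Small self-check: the wrappers match memmove/memset, including for an
   overlapping copy.  */
void check_generic_wrappers(void)
{
    char buf[8] = "abcdefg";
    char expect[8] = "abcdefg";

    generic_bcopy(buf, buf + 2, 4);
    memmove(expect + 2, expect, 4);
    assert(memcmp(buf, expect, sizeof buf) == 0);

    generic_bzero(buf, sizeof buf);
    memset(expect, 0, sizeof expect);
    assert(memcmp(buf, expect, sizeof buf) == 0);
}
```

Since the wrappers add nothing beyond argument shuffling, a dedicated
optimized entry point only pays off if programs really call the bcopy or
bzero symbols directly.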
* Re: [PATCH 00/12] Remove bcopy and bzero optimizations
2022-02-21 16:39 ` Szabolcs Nagy
@ 2022-02-22 15:40 ` Adhemerval Zanella
0 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-22 15:40 UTC (permalink / raw)
To: Szabolcs Nagy; +Cc: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein
On 21/02/2022 13:39, Szabolcs Nagy wrote:
> The 02/10/2022 16:58, Adhemerval Zanella via Libc-alpha wrote:
>> Both symbols are marked as legacy in POSIX.1-2001 and removed in
>> POSIX.1-2008, although the prototypes are defined for _GNU_SOURCE
>> or _DEFAULT_SOURCE.
>>
>> Most architectures just route bcopy/bzero to the internal memmove/memset
>> implementation; however, some do implement iFUNC variants when memset
>> or memmove are also provided through iFUNC.
>>
>> However, gcc already replaces bcopy with memmove and bzero with memset
>> in the default configuration (to actually get a bstring libc call, the
>> code must omit the string.h inclusion and be built with -fno-builtin),
>> so it is highly unlikely programs are actually calling the libc bcopy
>> or bzero symbols.
> ...
>> So there is no point in keeping such optimizations.
>>
>> Adhemerval Zanella (12):
>> ia64: Remove bcopy
>> powerpc: Remove bcopy optimizations
>> i386: Remove bcopy optimizations
>> x86_64: Remove bcopy optimizations
>> alpha: Remove bzero optimization
>> ia64: Remove bzero optimization
>> Remove bzero optimization
>> powerpc: Remove powerpc32 bzero optimizations
>> powerpc: Remove powerpc64 bzero optimizations
>> s390: Remove bzero optimizations
>> i686: Remove bzero optimizations
>> x86_64: Remove bzero optimizations
>
> I see this does not affect aarch64, but I agree with the principle.
>
> (There was a comment about the x86 bzero code: if __memsetzero is
> accepted, then it is easier to rename the bzero optimization than to
> remove it and re-add it.)
I will exclude the last patch, which touches x86_64, and commit the rest.
Based on the latest discussions on both the mailing list and the weekly
call, we still need to reach consensus on adding __memsetzero.