* [PATCH v2 00/11] Remove bcopy and bzero optimizations
From: Adhemerval Zanella @ 2022-02-23 14:09 UTC
To: libc-alpha
Both symbols are marked as legacy in POSIX.1-2001 and were removed in
POSIX.1-2008, although the prototypes are still defined for _GNU_SOURCE
or _DEFAULT_SOURCE.

Most architectures just route bcopy/bzero to the internal memmove/memset
implementation; however, some do implement IFUNC variants when memset
or memmove is also provided through IFUNC.

However, gcc already replaces bcopy with memmove and bzero with memset
in its default configuration (to actually get a bstring libc call, the
code must omit the string.h inclusion and be built with -fno-builtin),
so it is highly unlikely that programs are actually calling the libc
bcopy or bzero symbols.
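
As a rough illustration of the replacement described above, the sketch
below writes the two equivalences out by hand. The helper names
bcopy_as_memmove, bzero_as_memset and bstring_demo are hypothetical and
exist only for this example; they are not part of glibc or of this
series. Note that bcopy takes the source first and returns nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bcopy (src, dest, n) behaves like
   memmove (dest, src, n); only the argument order differs and
   nothing is returned.  */
static void
bcopy_as_memmove (const void *src, void *dest, size_t n)
{
  memmove (dest, src, n);
}

/* Hypothetical helper: bzero (s, n) behaves like memset (s, 0, n).  */
static void
bzero_as_memset (void *s, size_t n)
{
  memset (s, 0, n);
}

/* Self-check: an overlapping copy followed by a clear.  */
static int
bstring_demo (void)
{
  char buf[] = "abcdef";

  /* Source and destination overlap, which memmove must handle.  */
  bcopy_as_memmove (buf, buf + 2, 4);
  assert (memcmp (buf, "ababcd", 6) == 0);

  bzero_as_memset (buf, sizeof buf);
  assert (buf[0] == 0 && buf[5] == 0);
  return 1;
}
```

Because the mapping is this direct, a dedicated optimized bcopy or
bzero entry point adds little beyond what memmove and memset already
provide.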
On a recent Linux distro (Ubuntu 21.04), I see only 1 'bcmp' call
(which is already aliased to memcmp):
$ cat count_bstring.sh
#!/bin/bash
files=`IFS=':';for i in $PATH; do test -d "$i" && find "$i" -maxdepth 1 -executable -type f; done`
total=0
for file in $files; do
  symbols=`objdump -R $file 2>&1`
  if [ $? -eq 0 ]; then
    ncalls=`echo $symbols | grep -w $1 | wc -l`
    ((total=total+ncalls))
    if [ $ncalls -gt 0 ]; then
      echo "$file: $ncalls"
    fi
  fi
done
echo "TOTAL=$total"
$ ./count_bstring.sh bcmp
/usr/bin/rg: 1
TOTAL=1
$ ./count_bstring.sh bcopy
TOTAL=0
$ ./count_bstring.sh bzero
TOTAL=0
So there is no point in keeping such optimizations.
v2: Fix ia64 extra __bzero symbol, cleanup more i686 bzero definitions,
remove x86_64 bzero part.
Adhemerval Zanella (11):
ia64: Remove bcopy
powerpc: Remove bcopy optimizations
i386: Remove bcopy optimizations
x86_64: Remove bcopy optimizations
alpha: Remove bzero optimization
ia64: Remove bzero optimization
sparc: Remove bzero optimization
powerpc: Remove powerpc32 bzero optimizations
powerpc: Remove powerpc64 bzero optimizations
s390: Remove bzero optimizations
i686: Remove bzero optimizations
string/bzero.c | 4 +-
sysdeps/alpha/bzero.S | 109 ------
sysdeps/i386/bcopy.S | 4 -
sysdeps/i386/bzero.S | 5 -
sysdeps/i386/i586/bzero.S | 4 -
sysdeps/i386/i586/memset.S | 16 +-
sysdeps/i386/i686/bcopy.S | 3 -
sysdeps/i386/i686/bzero.S | 4 -
sysdeps/i386/i686/memmove.S | 22 +-
sysdeps/i386/i686/memset.S | 23 +-
sysdeps/i386/i686/multiarch/Makefile | 10 +-
sysdeps/i386/i686/multiarch/bcopy-ia32.S | 20 --
.../i686/multiarch/bcopy-sse2-unaligned.S | 4 -
sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S | 4 -
sysdeps/i386/i686/multiarch/bcopy-ssse3.S | 4 -
sysdeps/i386/i686/multiarch/bcopy.c | 30 --
sysdeps/i386/i686/multiarch/bzero-ia32.S | 37 ---
sysdeps/i386/i686/multiarch/bzero-sse2-rep.S | 3 -
sysdeps/i386/i686/multiarch/bzero-sse2.S | 3 -
sysdeps/i386/i686/multiarch/bzero.c | 32 --
sysdeps/i386/i686/multiarch/ifunc-impl-list.c | 18 -
.../i686/multiarch/memcpy-sse2-unaligned.S | 16 +-
.../i386/i686/multiarch/memcpy-ssse3-rep.S | 64 ++--
sysdeps/i386/i686/multiarch/memcpy-ssse3.S | 202 ++++--------
sysdeps/i386/i686/multiarch/memset-sse2-rep.S | 24 +-
sysdeps/i386/i686/multiarch/memset-sse2.S | 24 +-
sysdeps/i386/memcpy.S | 16 +-
sysdeps/i386/memset.S | 14 +-
sysdeps/ia64/bcopy.S | 10 -
sysdeps/ia64/bzero.S | 312 ------------------
sysdeps/ia64/bzero.c | 3 +
sysdeps/powerpc/powerpc32/bzero.S | 27 --
.../powerpc32/power4/multiarch/Makefile | 4 +-
.../powerpc32/power4/multiarch/bzero-power6.S | 25 --
.../powerpc32/power4/multiarch/bzero-power7.S | 25 --
.../powerpc32/power4/multiarch/bzero-ppc32.S | 34 --
.../powerpc32/power4/multiarch/bzero.c | 37 ---
.../power4/multiarch/ifunc-impl-list.c | 8 -
sysdeps/powerpc/powerpc64/bzero.S | 20 --
.../powerpc/powerpc64/le/power10/memmove.S | 13 -
sysdeps/powerpc/powerpc64/le/power10/memset.S | 12 -
sysdeps/powerpc/powerpc64/memset.S | 13 -
sysdeps/powerpc/powerpc64/multiarch/Makefile | 2 +-
.../powerpc/powerpc64/multiarch/bcopy-ppc64.c | 27 --
sysdeps/powerpc/powerpc64/multiarch/bcopy.c | 38 ---
sysdeps/powerpc/powerpc64/multiarch/bzero.c | 54 ---
.../powerpc64/multiarch/ifunc-impl-list.c | 34 --
.../powerpc64/multiarch/memmove-power10.S | 3 -
.../powerpc64/multiarch/memmove-power7.S | 3 -
.../powerpc64/multiarch/memset-power10.S | 3 -
.../powerpc64/multiarch/memset-power4.S | 3 -
.../powerpc64/multiarch/memset-power6.S | 3 -
.../powerpc64/multiarch/memset-power7.S | 2 -
.../powerpc64/multiarch/memset-power8.S | 3 -
.../powerpc64/multiarch/memset-ppc64.S | 16 +-
sysdeps/powerpc/powerpc64/power4/memset.S | 12 -
sysdeps/powerpc/powerpc64/power6/memset.S | 12 -
sysdeps/powerpc/powerpc64/power7/bcopy.c | 1 -
sysdeps/powerpc/powerpc64/power7/memmove.S | 14 -
sysdeps/powerpc/powerpc64/power7/memset.S | 12 -
sysdeps/powerpc/powerpc64/power8/memset.S | 12 -
sysdeps/s390/Makefile | 2 +-
sysdeps/s390/bzero.c | 47 ---
sysdeps/s390/ifunc-memset.h | 9 -
sysdeps/s390/memset-z900.S | 32 +-
sysdeps/s390/multiarch/ifunc-impl-list.c | 15 -
sysdeps/sparc/sparc32/bzero.c | 1 -
sysdeps/sparc/sparc32/memset.S | 37 +--
sysdeps/sparc/sparc32/sparcv9/bzero.c | 1 -
.../sparc/sparc32/sparcv9/multiarch/bzero.c | 1 -
.../sparc32/sparcv9/multiarch/memset-ultra1.S | 1 -
sysdeps/sparc/sparc64/bzero.c | 1 -
sysdeps/sparc/sparc64/memset.S | 30 +-
sysdeps/sparc/sparc64/multiarch/bzero.c | 33 --
.../sparc/sparc64/multiarch/ifunc-impl-list.c | 9 -
.../sparc/sparc64/multiarch/ifunc-memset.h | 2 +-
.../sparc/sparc64/multiarch/memset-niagara1.S | 5 +-
.../sparc/sparc64/multiarch/memset-niagara4.S | 6 +-
.../sparc/sparc64/multiarch/memset-niagara7.S | 7 -
.../sparc/sparc64/multiarch/memset-ultra1.S | 1 -
sysdeps/x86_64/multiarch/bcopy.S | 7 -
81 files changed, 162 insertions(+), 1601 deletions(-)
delete mode 100644 sysdeps/alpha/bzero.S
delete mode 100644 sysdeps/i386/bcopy.S
delete mode 100644 sysdeps/i386/bzero.S
delete mode 100644 sysdeps/i386/i586/bzero.S
delete mode 100644 sysdeps/i386/i686/bcopy.S
delete mode 100644 sysdeps/i386/i686/bzero.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ia32.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ssse3.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy.c
delete mode 100644 sysdeps/i386/i686/multiarch/bzero-ia32.S
delete mode 100644 sysdeps/i386/i686/multiarch/bzero-sse2-rep.S
delete mode 100644 sysdeps/i386/i686/multiarch/bzero-sse2.S
delete mode 100644 sysdeps/i386/i686/multiarch/bzero.c
delete mode 100644 sysdeps/ia64/bcopy.S
delete mode 100644 sysdeps/ia64/bzero.S
create mode 100644 sysdeps/ia64/bzero.c
delete mode 100644 sysdeps/powerpc/powerpc32/bzero.S
delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S
delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S
delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S
delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c
delete mode 100644 sysdeps/powerpc/powerpc64/bzero.S
delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c
delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bcopy.c
delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bzero.c
delete mode 100644 sysdeps/powerpc/powerpc64/power7/bcopy.c
delete mode 100644 sysdeps/s390/bzero.c
delete mode 100644 sysdeps/sparc/sparc32/bzero.c
delete mode 100644 sysdeps/sparc/sparc32/sparcv9/bzero.c
delete mode 100644 sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c
delete mode 100644 sysdeps/sparc/sparc64/bzero.c
delete mode 100644 sysdeps/sparc/sparc64/multiarch/bzero.c
delete mode 100644 sysdeps/x86_64/multiarch/bcopy.S
--
2.32.0
* [PATCH v2 01/11] ia64: Remove bcopy
From: Adhemerval Zanella @ 2022-02-23 14:09 UTC
To: libc-alpha
It just calls memmove, as the generic implementation does. The
arch-specific bzero implementation only exists to avoid creating the
__bzero symbol (which the ia64 ABI does not export).
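
The rename trick used for this can be sketched in a single file. This is
only an illustration under assumptions: internal_zero and public_zero
are hypothetical stand-ins for __bzero and bzero, the weak alias uses
the GCC/ELF attribute syntax rather than glibc's weak_alias macro, and
the "generic file" is inlined instead of being pulled in with #include.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the arch override: rename the internal symbol to the
   public one before the generic definition is seen.  Both names are
   hypothetical, chosen for this sketch only.  */
#define internal_zero public_zero

/* Stand-in for the generic file.  With the rename active this defines
   public_zero directly, so no separate internal symbol is emitted.  */
void
internal_zero (void *s, size_t len)
{
  memset (s, '\0', len);
}

#ifndef internal_zero
/* Only reached when nothing renamed internal_zero: expose the public
   name as a weak alias of the internal one.  */
void public_zero (void *s, size_t len)
  __attribute__ ((weak, alias ("internal_zero")));
#endif

/* Self-check: the public name is callable and clears the buffer.  */
static int
rename_demo (void)
{
  char buf[4] = { 1, 2, 3, 4 };

  public_zero (buf, sizeof buf);
  assert (buf[0] == 0 && buf[3] == 0);
  return 1;
}
```

Guarding the alias with #ifndef is what lets one generic source serve
both ABIs that export the internal name and ABIs that do not.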
---
string/bzero.c | 4 ++--
sysdeps/ia64/bcopy.S | 10 ----------
sysdeps/ia64/bzero.c | 3 +++
3 files changed, 5 insertions(+), 12 deletions(-)
delete mode 100644 sysdeps/ia64/bcopy.S
create mode 100644 sysdeps/ia64/bzero.c
diff --git a/string/bzero.c b/string/bzero.c
index eb2af49e9e..d8b79df1c7 100644
--- a/string/bzero.c
+++ b/string/bzero.c
@@ -17,12 +17,12 @@
#include <string.h>
-#undef __bzero
-
/* Set N bytes of S to 0. */
void
__bzero (void *s, size_t len)
{
memset (s, '\0', len);
}
+#ifndef __bzero
weak_alias (__bzero, bzero)
+#endif
diff --git a/sysdeps/ia64/bcopy.S b/sysdeps/ia64/bcopy.S
deleted file mode 100644
index bdabf5acdc..0000000000
--- a/sysdeps/ia64/bcopy.S
+++ /dev/null
@@ -1,10 +0,0 @@
-#include <sysdep.h>
-
-ENTRY(bcopy)
- .regstk 3, 0, 0, 0
- mov r8 = in0
- mov in0 = in1
- ;;
- mov in1 = r8
- br.cond.sptk.many HIDDEN_BUILTIN_JUMPTARGET(memmove)
-END(bcopy)
diff --git a/sysdeps/ia64/bzero.c b/sysdeps/ia64/bzero.c
new file mode 100644
index 0000000000..79771f3e91
--- /dev/null
+++ b/sysdeps/ia64/bzero.c
@@ -0,0 +1,3 @@
+/* ia64 does not export __bzero symbol. */
+#define __bzero bzero
+#include <string/bzero.c>
--
2.32.0
* [PATCH v2 02/11] powerpc: Remove bcopy optimizations
From: Adhemerval Zanella @ 2022-02-23 14:09 UTC
To: libc-alpha
The symbol is not present in the current POSIX specification and the
compiler already generates a memmove call.
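
For readers unfamiliar with what the removed multiarch bcopy.c did, the
sketch below mimics its selection order with an ordinary function
pointer. It is an assumption-laden illustration: the SKETCH_* feature
bits, the copy_* variants and select_copy are hypothetical, every
variant simply forwards to memmove, and the real code resolves the
symbol through libc_ifunc at relocation time rather than by an explicit
call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*copy_fn) (const void *src, void *dest, size_t n);

/* Hypothetical feature bits standing in for the hwcap/hwcap2 tests.  */
#define SKETCH_HAS_VSX      (1u << 0)
#define SKETCH_HAS_ARCH_3_1 (1u << 1)

/* Hypothetical variants; each just forwards to memmove here.  */
static void
copy_baseline (const void *src, void *dest, size_t n)
{
  memmove (dest, src, n);
}

static void
copy_vsx (const void *src, void *dest, size_t n)
{
  memmove (dest, src, n);
}

static void
copy_arch_3_1 (const void *src, void *dest, size_t n)
{
  memmove (dest, src, n);
}

/* Pick the most specific variant the feature bits allow, falling
   back to the baseline when nothing else matches.  */
static copy_fn
select_copy (unsigned int features)
{
  if ((features & SKETCH_HAS_ARCH_3_1) && (features & SKETCH_HAS_VSX))
    return copy_arch_3_1;
  if (features & SKETCH_HAS_VSX)
    return copy_vsx;
  return copy_baseline;
}

/* Self-check of the selection order and of one selected variant.  */
static int
dispatch_demo (void)
{
  char src[] = "xyz";
  char dst[4] = { 0 };

  assert (select_copy (0) == copy_baseline);
  assert (select_copy (SKETCH_HAS_VSX) == copy_vsx);
  assert (select_copy (SKETCH_HAS_VSX | SKETCH_HAS_ARCH_3_1)
          == copy_arch_3_1);

  select_copy (SKETCH_HAS_VSX) (src, dst, sizeof src);
  assert (memcmp (dst, "xyz", sizeof src) == 0);
  return 1;
}
```

Since memmove keeps its own selector of this kind, dropping the bcopy
copy of it removes duplication without losing the optimized paths.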
---
.../powerpc/powerpc64/le/power10/memmove.S | 13 -------
sysdeps/powerpc/powerpc64/multiarch/Makefile | 2 +-
.../powerpc/powerpc64/multiarch/bcopy-ppc64.c | 27 -------------
sysdeps/powerpc/powerpc64/multiarch/bcopy.c | 38 -------------------
.../powerpc64/multiarch/ifunc-impl-list.c | 13 -------
.../powerpc64/multiarch/memmove-power10.S | 3 --
.../powerpc64/multiarch/memmove-power7.S | 3 --
sysdeps/powerpc/powerpc64/power7/bcopy.c | 1 -
sysdeps/powerpc/powerpc64/power7/memmove.S | 14 -------
9 files changed, 1 insertion(+), 113 deletions(-)
delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c
delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bcopy.c
delete mode 100644 sysdeps/powerpc/powerpc64/power7/bcopy.c
diff --git a/sysdeps/powerpc/powerpc64/le/power10/memmove.S b/sysdeps/powerpc/powerpc64/le/power10/memmove.S
index eda86b194e..3024718fdf 100644
--- a/sysdeps/powerpc/powerpc64/le/power10/memmove.S
+++ b/sysdeps/powerpc/powerpc64/le/power10/memmove.S
@@ -305,16 +305,3 @@ L(tail1_bwd):
END_GEN_TB (MEMMOVE,TB_TOCLESS)
libc_hidden_builtin_def (memmove)
-
-/* void bcopy(const void *src [r3], void *dest [r4], size_t n [r5])
- Implemented in this file to avoid linker create a stub function call
- in the branch to '_memmove'. */
-ENTRY_TOCLESS (__bcopy)
- mr r6,r3
- mr r3,r4
- mr r4,r6
- b L(_memmove)
-END (__bcopy)
-#ifndef __bcopy
-weak_alias (__bcopy, bcopy)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index 626845a43c..6f2436b660 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -24,7 +24,7 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \
stpncpy-power8 stpncpy-power7 stpncpy-ppc64 \
strcmp-power8 strcmp-power7 strcmp-ppc64 \
strcat-power8 strcat-power7 strcat-ppc64 \
- memmove-power7 memmove-ppc64 wordcopy-ppc64 bcopy-ppc64 \
+ memmove-power7 memmove-ppc64 wordcopy-ppc64 \
strncpy-power8 strstr-power7 strstr-ppc64 \
strspn-power8 strspn-ppc64 strcspn-power8 strcspn-ppc64 \
strlen-power8 strcasestr-power8 strcasestr-ppc64 \
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c b/sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c
deleted file mode 100644
index fe68713ad7..0000000000
--- a/sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c
+++ /dev/null
@@ -1,27 +0,0 @@
-/* PowerPC64 default bcopy.
- Copyright (C) 2014-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <string.h>
-
-extern __typeof (bcopy) __bcopy_ppc attribute_hidden;
-extern __typeof (memmove) __memmove_ppc attribute_hidden;
-
-void __bcopy_ppc (const void *src, void *dest, size_t n)
-{
- __memmove_ppc (dest, src, n);
-}
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bcopy.c b/sysdeps/powerpc/powerpc64/multiarch/bcopy.c
deleted file mode 100644
index 84c6adfd6e..0000000000
--- a/sysdeps/powerpc/powerpc64/multiarch/bcopy.c
+++ /dev/null
@@ -1,38 +0,0 @@
-/* PowerPC64 multiarch bcopy.
- Copyright (C) 2014-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <string.h>
-#include "init-arch.h"
-
-extern __typeof (bcopy) __bcopy_ppc attribute_hidden;
-/* __bcopy_power7 symbol is implemented at memmove-power7.S */
-extern __typeof (bcopy) __bcopy_power7 attribute_hidden;
-#ifdef __LITTLE_ENDIAN__
-extern __typeof (bcopy) __bcopy_power10 attribute_hidden;
-#endif
-
-libc_ifunc (bcopy,
-#ifdef __LITTLE_ENDIAN__
- (hwcap2 & PPC_FEATURE2_ARCH_3_1
- && hwcap2 & PPC_FEATURE2_HAS_ISEL
- && hwcap & PPC_FEATURE_HAS_VSX)
- ? __bcopy_power10 :
-#endif
- (hwcap & PPC_FEATURE_HAS_VSX)
- ? __bcopy_power7
- : __bcopy_ppc);
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index a0f9fce25d..280b8616b2 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -244,19 +244,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
__bzero_power4)
IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ppc))
- /* Support sysdeps/powerpc/powerpc64/multiarch/bcopy.c. */
- IFUNC_IMPL (i, name, bcopy,
-#ifdef __LITTLE_ENDIAN__
- IFUNC_IMPL_ADD (array, i, bcopy,
- hwcap2 & PPC_FEATURE2_ARCH_3_1
- && hwcap2 & PPC_FEATURE2_HAS_ISEL
- && hwcap & PPC_FEATURE_HAS_VSX,
- __bcopy_power10)
-#endif
- IFUNC_IMPL_ADD (array, i, bcopy, hwcap & PPC_FEATURE_HAS_VSX,
- __bcopy_power7)
- IFUNC_IMPL_ADD (array, i, bcopy, 1, __bcopy_ppc))
-
/* Support sysdeps/powerpc/powerpc64/multiarch/mempcpy.c. */
IFUNC_IMPL (i, name, mempcpy,
IFUNC_IMPL_ADD (array, i, mempcpy,
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S b/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S
index e5df0851c0..a66d2892c4 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S
@@ -21,7 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bcopy
-#define __bcopy __bcopy_power10
-
#include <sysdeps/powerpc/powerpc64/le/power10/memmove.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S b/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S
index a7b05ebfa9..0a6c7cb96e 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S
@@ -21,7 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bcopy
-#define __bcopy __bcopy_power7
-
#include <sysdeps/powerpc/powerpc64/power7/memmove.S>
diff --git a/sysdeps/powerpc/powerpc64/power7/bcopy.c b/sysdeps/powerpc/powerpc64/power7/bcopy.c
deleted file mode 100644
index 4a6a400e7a..0000000000
--- a/sysdeps/powerpc/powerpc64/power7/bcopy.c
+++ /dev/null
@@ -1 +0,0 @@
-/* Implemented at memmove.S */
diff --git a/sysdeps/powerpc/powerpc64/power7/memmove.S b/sysdeps/powerpc/powerpc64/power7/memmove.S
index 1d10a3d593..5a1055c097 100644
--- a/sysdeps/powerpc/powerpc64/power7/memmove.S
+++ b/sysdeps/powerpc/powerpc64/power7/memmove.S
@@ -821,17 +821,3 @@ L(end_unaligned_loop_bwd):
blr
END_GEN_TB (MEMMOVE, TB_TOCLESS)
libc_hidden_builtin_def (memmove)
-
-
-/* void bcopy(const void *src [r3], void *dest [r4], size_t n [r5])
- Implemented in this file to avoid linker create a stub function call
- in the branch to '_memmove'. */
-ENTRY_TOCLESS (__bcopy)
- mr r6,r3
- mr r3,r4
- mr r4,r6
- b L(_memmove)
-END (__bcopy)
-#ifndef __bcopy
-weak_alias (__bcopy, bcopy)
-#endif
--
2.32.0
* [PATCH v2 03/11] i386: Remove bcopy optimizations
From: Adhemerval Zanella @ 2022-02-23 14:09 UTC
To: libc-alpha
The symbol is not present in the current POSIX specification and the
compiler already generates a memmove call.
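
The USE_AS_BCOPY branches removed below differ from the remaining code
in two ways: the source and destination stack slots are swapped, and
the return value is not loaded into %eax. A minimal C sketch of the
three calling conventions involved follows; the sketch_* names are
hypothetical, and the bodies forward to memmove/memcpy instead of
reproducing the assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* memmove-style entry: destination first, returns the destination.  */
static void *
sketch_memmove (void *dest, const void *src, size_t n)
{
  return memmove (dest, src, n);
}

/* mempcpy-style entry: returns one past the last byte written.  */
static void *
sketch_mempcpy (void *dest, const void *src, size_t n)
{
  return (char *) memcpy (dest, src, n) + n;
}

/* bcopy-style entry: source first, no return value.  */
static void
sketch_bcopy (const void *src, void *dest, size_t n)
{
  memmove (dest, src, n);
}

/* Self-check of the three return conventions.  */
static int
return_demo (void)
{
  char src[] = "abc";
  char dst[4] = { 0 };

  assert (sketch_memmove (dst, src, sizeof src) == dst);
  assert (sketch_mempcpy (dst, src, sizeof src) == dst + sizeof src);

  memset (dst, 0, sizeof dst);
  sketch_bcopy (src, dst, sizeof src);
  assert (memcmp (dst, "abc", sizeof src) == 0);
  return 1;
}
```

With the bcopy case gone, only the memmove/mempcpy distinction is left
for the preprocessor conditionals to handle.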
---
sysdeps/i386/bcopy.S | 4 -
sysdeps/i386/i686/bcopy.S | 3 -
sysdeps/i386/i686/memmove.S | 22 +-
sysdeps/i386/i686/multiarch/Makefile | 6 +-
sysdeps/i386/i686/multiarch/bcopy-ia32.S | 20 --
.../i686/multiarch/bcopy-sse2-unaligned.S | 4 -
sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S | 4 -
sysdeps/i386/i686/multiarch/bcopy-ssse3.S | 4 -
sysdeps/i386/i686/multiarch/bcopy.c | 30 ---
sysdeps/i386/i686/multiarch/ifunc-impl-list.c | 10 -
.../i686/multiarch/memcpy-sse2-unaligned.S | 16 +-
.../i386/i686/multiarch/memcpy-ssse3-rep.S | 64 ++----
sysdeps/i386/i686/multiarch/memcpy-ssse3.S | 202 ++++++------------
sysdeps/i386/memcpy.S | 16 +-
14 files changed, 100 insertions(+), 305 deletions(-)
delete mode 100644 sysdeps/i386/bcopy.S
delete mode 100644 sysdeps/i386/i686/bcopy.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ia32.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ssse3.S
delete mode 100644 sysdeps/i386/i686/multiarch/bcopy.c
diff --git a/sysdeps/i386/bcopy.S b/sysdeps/i386/bcopy.S
deleted file mode 100644
index 12b8ddb886..0000000000
--- a/sysdeps/i386/bcopy.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_MEMMOVE
-#define USE_AS_BCOPY
-#define MEMCPY bcopy
-#include "memcpy.S"
diff --git a/sysdeps/i386/i686/bcopy.S b/sysdeps/i386/i686/bcopy.S
deleted file mode 100644
index 15ef9419a4..0000000000
--- a/sysdeps/i386/i686/bcopy.S
+++ /dev/null
@@ -1,3 +0,0 @@
-#define USE_AS_BCOPY
-#define memmove bcopy
-#include <sysdeps/i386/i686/memmove.S>
diff --git a/sysdeps/i386/i686/memmove.S b/sysdeps/i386/i686/memmove.S
index 0301560bb8..bdc69d315a 100644
--- a/sysdeps/i386/i686/memmove.S
+++ b/sysdeps/i386/i686/memmove.S
@@ -25,22 +25,16 @@
.text
-#ifdef USE_AS_BCOPY
-# define SRC RTN
-# define DEST SRC+4
-# define LEN DEST+4
-#else
-# define DEST RTN
-# define SRC DEST+4
-# define LEN SRC+4
-
-# if defined PIC && IS_IN (libc)
+#define DEST RTN
+#define SRC DEST+4
+#define LEN SRC+4
+
+#if defined PIC && IS_IN (libc)
ENTRY_CHK (__memmove_chk)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
jb HIDDEN_JUMPTARGET (__chk_fail)
END_CHK (__memmove_chk)
-# endif
#endif
ENTRY (memmove)
@@ -71,9 +65,7 @@ ENTRY (memmove)
movsl
movl %edx, %esi
cfi_restore (esi)
-#ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-#endif
popl %edi
cfi_adjust_cfa_offset (-4)
@@ -103,9 +95,7 @@ ENTRY (memmove)
movsl
movl %edx, %esi
cfi_restore (esi)
-#ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-#endif
cld
popl %edi
@@ -114,6 +104,4 @@ ENTRY (memmove)
ret
END (memmove)
-#ifndef USE_AS_BCOPY
libc_hidden_builtin_def (memmove)
-#endif
diff --git a/sysdeps/i386/i686/multiarch/Makefile b/sysdeps/i386/i686/multiarch/Makefile
index c4897922d7..02fa02658e 100644
--- a/sysdeps/i386/i686/multiarch/Makefile
+++ b/sysdeps/i386/i686/multiarch/Makefile
@@ -2,7 +2,7 @@ ifeq ($(subdir),string)
gen-as-const-headers += locale-defines.sym
sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
memmove-ssse3 memcpy-ssse3-rep mempcpy-ssse3-rep \
- memmove-ssse3-rep bcopy-ssse3 bcopy-ssse3-rep \
+ memmove-ssse3-rep \
memset-sse2-rep bzero-sse2-rep strcmp-ssse3 \
strcmp-sse4 strncmp-c strncmp-ssse3 strncmp-sse4 \
memcmp-ssse3 memcmp-sse4 varshift \
@@ -18,10 +18,10 @@ sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
strcasecmp_l-c strcasecmp-c strcasecmp_l-ssse3 \
strncase_l-c strncase-c strncase_l-ssse3 \
strcasecmp_l-sse4 strncase_l-sse4 \
- bcopy-sse2-unaligned memcpy-sse2-unaligned \
+ memcpy-sse2-unaligned \
mempcpy-sse2-unaligned memmove-sse2-unaligned \
strcspn-c strpbrk-c strspn-c \
- bcopy-ia32 bzero-ia32 rawmemchr-ia32 \
+ bzero-ia32 rawmemchr-ia32 \
memchr-ia32 memcmp-ia32 memcpy-ia32 memmove-ia32 \
mempcpy-ia32 memset-ia32 strcat-ia32 strchr-ia32 \
strrchr-ia32 strcpy-ia32 strcmp-ia32 strcspn-ia32 \
diff --git a/sysdeps/i386/i686/multiarch/bcopy-ia32.S b/sysdeps/i386/i686/multiarch/bcopy-ia32.S
deleted file mode 100644
index e0fadc0f3f..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy-ia32.S
+++ /dev/null
@@ -1,20 +0,0 @@
-/* bcopy optimized for i686.
- Copyright (C) 2017-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#define bcopy __bcopy_ia32
-#include <sysdeps/i386/i686/bcopy.S>
diff --git a/sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S b/sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S
deleted file mode 100644
index efef2a10dd..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_MEMMOVE
-#define USE_AS_BCOPY
-#define MEMCPY __bcopy_sse2_unaligned
-#include "memcpy-sse2-unaligned.S"
diff --git a/sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S b/sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S
deleted file mode 100644
index cbc8b420e8..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_MEMMOVE
-#define USE_AS_BCOPY
-#define MEMCPY __bcopy_ssse3_rep
-#include "memcpy-ssse3-rep.S"
diff --git a/sysdeps/i386/i686/multiarch/bcopy-ssse3.S b/sysdeps/i386/i686/multiarch/bcopy-ssse3.S
deleted file mode 100644
index 36aac44b9c..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy-ssse3.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_MEMMOVE
-#define USE_AS_BCOPY
-#define MEMCPY __bcopy_ssse3
-#include "memcpy-ssse3.S"
diff --git a/sysdeps/i386/i686/multiarch/bcopy.c b/sysdeps/i386/i686/multiarch/bcopy.c
deleted file mode 100644
index bc2c2ac55d..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy.c
+++ /dev/null
@@ -1,30 +0,0 @@
-/* Multiple versions of bcopy.
- All versions must be listed in ifunc-impl-list.c.
- Copyright (C) 2017-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Define multiple versions only for the definition in libc. */
-#if IS_IN (libc)
-# define bcopy __redirect_bcopy
-# include <string.h>
-# undef bcopy
-
-# define SYMBOL_NAME bcopy
-# include "ifunc-memmove.h"
-
-libc_ifunc_redirected (__redirect_bcopy, bcopy, IFUNC_SELECTOR ());
-#endif
diff --git a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
index 6883b3d226..5c7a42dc97 100644
--- a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
+++ b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
@@ -36,16 +36,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
size_t i = 0;
- /* Support sysdeps/i386/i686/multiarch/bcopy.S. */
- IFUNC_IMPL (i, name, bcopy,
- IFUNC_IMPL_ADD (array, i, bcopy, CPU_FEATURE_USABLE (SSSE3),
- __bcopy_ssse3_rep)
- IFUNC_IMPL_ADD (array, i, bcopy, CPU_FEATURE_USABLE (SSSE3),
- __bcopy_ssse3)
- IFUNC_IMPL_ADD (array, i, bcopy, CPU_FEATURE_USABLE (SSE2),
- __bcopy_sse2_unaligned)
- IFUNC_IMPL_ADD (array, i, bcopy, 1, __bcopy_ia32))
-
/* Support sysdeps/i386/i686/multiarch/bzero.S. */
IFUNC_IMPL (i, name, bzero,
IFUNC_IMPL_ADD (array, i, bzero, CPU_FEATURE_USABLE (SSE2),
diff --git a/sysdeps/i386/i686/multiarch/memcpy-sse2-unaligned.S b/sysdeps/i386/i686/multiarch/memcpy-sse2-unaligned.S
index 72c97a9dbe..ed1f3836a6 100644
--- a/sysdeps/i386/i686/multiarch/memcpy-sse2-unaligned.S
+++ b/sysdeps/i386/i686/multiarch/memcpy-sse2-unaligned.S
@@ -29,15 +29,9 @@
# define MEMCPY_CHK __memcpy_chk_sse2_unaligned
# endif
-# ifdef USE_AS_BCOPY
-# define SRC PARMS
-# define DEST SRC+4
-# define LEN DEST+4
-# else
-# define DEST PARMS
-# define SRC DEST+4
-# define LEN SRC+4
-# endif
+# define DEST PARMS
+# define SRC DEST+4
+# define LEN SRC+4
# define CFI_PUSH(REG) \
cfi_adjust_cfa_offset (4); \
@@ -56,7 +50,7 @@
# define RETURN RETURN_END; CFI_PUSH (%ebx)
.section .text.sse2,"ax",@progbits
-# if !defined USE_AS_BCOPY && defined SHARED
+# if defined SHARED
ENTRY (MEMCPY_CHK)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -671,7 +665,7 @@ L(len_5_8_bytes):
L(return):
movl %edx, %eax
-# if !defined USE_AS_BCOPY && defined USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
# endif
diff --git a/sysdeps/i386/i686/multiarch/memcpy-ssse3-rep.S b/sysdeps/i386/i686/multiarch/memcpy-ssse3-rep.S
index 38fc56db71..8e3c67d8e1 100644
--- a/sysdeps/i386/i686/multiarch/memcpy-ssse3-rep.S
+++ b/sysdeps/i386/i686/multiarch/memcpy-ssse3-rep.S
@@ -30,15 +30,9 @@
# define MEMCPY_CHK __memcpy_chk_ssse3_rep
#endif
-#ifdef USE_AS_BCOPY
-# define SRC PARMS
-# define DEST SRC+4
-# define LEN DEST+4
-#else
-# define DEST PARMS
-# define SRC DEST+4
-# define LEN SRC+4
-#endif
+#define DEST PARMS
+#define SRC DEST+4
+#define LEN SRC+4
#define CFI_PUSH(REG) \
cfi_adjust_cfa_offset (4); \
@@ -99,7 +93,7 @@
#endif
.section .text.ssse3,"ax",@progbits
-#if !defined USE_AS_BCOPY && defined SHARED
+#ifdef SHARED
ENTRY (MEMCPY_CHK)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -1097,12 +1091,10 @@ L(fwd_write_4bytes):
movl -4(%eax), %ecx
movl %ecx, -4(%edx)
L(fwd_write_0bytes):
-#ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+#ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+#else
movl DEST(%esp), %eax
-# endif
#endif
RETURN
@@ -1112,12 +1104,10 @@ L(fwd_write_5bytes):
movl -4(%eax), %eax
movl %ecx, -5(%edx)
movl %eax, -4(%edx)
-#ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+#ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+#else
movl DEST(%esp), %eax
-# endif
#endif
RETURN
@@ -1157,12 +1147,10 @@ L(fwd_write_9bytes):
L(fwd_write_1bytes):
movzbl -1(%eax), %ecx
movb %cl, -1(%edx)
-#ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+#ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+#else
movl DEST(%esp), %eax
-# endif
#endif
RETURN
@@ -1203,12 +1191,10 @@ L(fwd_write_6bytes):
L(fwd_write_2bytes):
movzwl -2(%eax), %ecx
movw %cx, -2(%edx)
-#ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+#ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+#else
movl DEST(%esp), %eax
-# endif
#endif
RETURN
@@ -1251,12 +1237,10 @@ L(fwd_write_3bytes):
movzbl -1(%eax), %eax
movw %cx, -3(%edx)
movb %al, -1(%edx)
-#ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+#ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+#else
movl DEST(%esp), %eax
-# endif
#endif
RETURN_END
@@ -1357,12 +1341,10 @@ L(copy_page_by_rep_left_1):
L(copy_page_by_rep_exit):
POP (%esi)
POP (%edi)
-#ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+#ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
#endif
RETURN
@@ -1401,12 +1383,10 @@ L(bk_write_4bytes):
movl (%eax), %ecx
movl %ecx, (%edx)
L(bk_write_0bytes):
-#ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+#ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
#endif
RETURN
@@ -1447,12 +1427,10 @@ L(bk_write_5bytes):
L(bk_write_1bytes):
movzbl (%eax), %ecx
movb %cl, (%edx)
-#ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+#ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
#endif
RETURN
@@ -1493,12 +1471,10 @@ L(bk_write_6bytes):
L(bk_write_2bytes):
movzwl (%eax), %ecx
movw %cx, (%edx)
-#ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+#ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
#endif
RETURN
@@ -1541,12 +1517,10 @@ L(bk_write_3bytes):
movw %cx, 1(%edx)
movzbl (%eax), %eax
movb %al, (%edx)
-#ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+#ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
#endif
RETURN_END
diff --git a/sysdeps/i386/i686/multiarch/memcpy-ssse3.S b/sysdeps/i386/i686/multiarch/memcpy-ssse3.S
index 62ae4a8d65..18c0913e5d 100644
--- a/sysdeps/i386/i686/multiarch/memcpy-ssse3.S
+++ b/sysdeps/i386/i686/multiarch/memcpy-ssse3.S
@@ -29,15 +29,9 @@
# define MEMCPY_CHK __memcpy_chk_ssse3
# endif
-# ifdef USE_AS_BCOPY
-# define SRC PARMS
-# define DEST SRC+4
-# define LEN DEST+4
-# else
-# define DEST PARMS
-# define SRC DEST+4
-# define LEN SRC+4
-# endif
+# define DEST PARMS
+# define SRC DEST+4
+# define LEN SRC+4
# define CFI_PUSH(REG) \
cfi_adjust_cfa_offset (4); \
@@ -88,7 +82,7 @@
# endif
.section .text.ssse3,"ax",@progbits
-# if !defined USE_AS_BCOPY && defined SHARED
+# ifdef SHARED
ENTRY (MEMCPY_CHK)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -1979,12 +1973,10 @@ L(fwd_write_12bytes):
L(fwd_write_4bytes):
movl -4(%eax), %ecx
movl %ecx, -4(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2005,12 +1997,10 @@ L(fwd_write_8bytes):
movq -8(%eax), %xmm0
movq %xmm0, -8(%edx)
L(fwd_write_0bytes):
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2020,12 +2010,10 @@ L(fwd_write_5bytes):
movl -4(%eax), %eax
movl %ecx, -5(%edx)
movl %eax, -4(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2049,12 +2037,10 @@ L(fwd_write_13bytes):
movl %ecx, -5(%edx)
movzbl -1(%eax), %ecx
movb %cl, -1(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2077,12 +2063,10 @@ L(fwd_write_9bytes):
L(fwd_write_1bytes):
movzbl -1(%eax), %ecx
movb %cl, -1(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2107,12 +2091,10 @@ L(fwd_write_6bytes):
movl %ecx, -6(%edx)
movzwl -2(%eax), %ecx
movw %cx, -2(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2135,12 +2117,10 @@ L(fwd_write_10bytes):
L(fwd_write_2bytes):
movzwl -2(%eax), %ecx
movw %cx, -2(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2167,12 +2147,10 @@ L(fwd_write_7bytes):
movzbl -1(%eax), %eax
movw %cx, -3(%edx)
movb %al, -1(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2197,12 +2175,10 @@ L(fwd_write_3bytes):
movzbl -1(%eax), %eax
movw %cx, -3(%edx)
movb %al, -1(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2217,12 +2193,10 @@ L(fwd_write_8bytes_align):
movq -8(%eax), %xmm0
movq %xmm0, -8(%edx)
L(fwd_write_0bytes_align):
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2233,12 +2207,10 @@ L(fwd_write_32bytes_align):
L(fwd_write_16bytes_align):
movdqa -16(%eax), %xmm0
movdqa %xmm0, -16(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2248,12 +2220,10 @@ L(fwd_write_5bytes_align):
movl -4(%eax), %eax
movl %ecx, -5(%edx)
movl %eax, -4(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2271,12 +2241,10 @@ L(fwd_write_13bytes_align):
movl %ecx, -5(%edx)
movzbl -1(%eax), %ecx
movb %cl, -1(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2291,12 +2259,10 @@ L(fwd_write_21bytes_align):
movl %ecx, -5(%edx)
movzbl -1(%eax), %ecx
movb %cl, -1(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2313,12 +2279,10 @@ L(fwd_write_9bytes_align):
L(fwd_write_1bytes_align):
movzbl -1(%eax), %ecx
movb %cl, -1(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2331,12 +2295,10 @@ L(fwd_write_17bytes_align):
movdqa %xmm0, -17(%edx)
movzbl -1(%eax), %ecx
movb %cl, -1(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2355,12 +2317,10 @@ L(fwd_write_6bytes_align):
movl %ecx, -6(%edx)
movzwl -2(%eax), %ecx
movw %cx, -2(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2375,12 +2335,10 @@ L(fwd_write_22bytes_align):
movl %ecx, -6(%edx)
movzwl -2(%eax), %ecx
movw %cx, -2(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2397,12 +2355,10 @@ L(fwd_write_10bytes_align):
L(fwd_write_2bytes_align):
movzwl -2(%eax), %ecx
movw %cx, -2(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2415,12 +2371,10 @@ L(fwd_write_18bytes_align):
movdqa %xmm0, -18(%edx)
movzwl -2(%eax), %ecx
movw %cx, -2(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2441,12 +2395,10 @@ L(fwd_write_7bytes_align):
movzbl -1(%eax), %eax
movw %cx, -3(%edx)
movb %al, -1(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2463,12 +2415,10 @@ L(fwd_write_23bytes_align):
movzbl -1(%eax), %eax
movw %cx, -3(%edx)
movb %al, -1(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2487,12 +2437,10 @@ L(fwd_write_3bytes_align):
movzbl -1(%eax), %eax
movw %cx, -3(%edx)
movb %al, -1(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2507,12 +2455,10 @@ L(fwd_write_19bytes_align):
movzbl -1(%eax), %eax
movw %cx, -3(%edx)
movb %al, -1(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2529,12 +2475,10 @@ L(fwd_write_12bytes_align):
L(fwd_write_4bytes_align):
movl -4(%eax), %ecx
movl %ecx, -4(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN
@@ -2547,12 +2491,10 @@ L(fwd_write_20bytes_align):
movdqa %xmm0, -20(%edx)
movl -4(%eax), %ecx
movl %ecx, -4(%edx)
-# ifndef USE_AS_BCOPY
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl %edx, %eax
-# else
+# else
movl DEST(%esp), %eax
-# endif
# endif
RETURN_END
@@ -2646,12 +2588,10 @@ L(bk_write_4bytes):
movl (%eax), %ecx
movl %ecx, (%edx)
L(bk_write_0bytes):
-# ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
# endif
RETURN
@@ -2671,12 +2611,10 @@ L(bk_write_16bytes):
L(bk_write_8bytes):
movq (%eax), %xmm0
movq %xmm0, (%edx)
-# ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
# endif
RETURN
@@ -2702,12 +2640,10 @@ L(bk_write_5bytes):
L(bk_write_1bytes):
movzbl (%eax), %ecx
movb %cl, (%edx)
-# ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
# endif
RETURN
@@ -2729,12 +2665,10 @@ L(bk_write_9bytes):
movq %xmm0, 1(%edx)
movzbl (%eax), %ecx
movb %cl, (%edx)
-# ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
# endif
RETURN
@@ -2759,12 +2693,10 @@ L(bk_write_6bytes):
movl %ecx, 2(%edx)
movzwl (%eax), %ecx
movw %cx, (%edx)
-# ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
# endif
RETURN
@@ -2787,12 +2719,10 @@ L(bk_write_10bytes):
L(bk_write_2bytes):
movzwl (%eax), %ecx
movw %cx, (%edx)
-# ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
# endif
RETURN
@@ -2819,12 +2749,10 @@ L(bk_write_7bytes):
movw %cx, 1(%edx)
movzbl (%eax), %eax
movb %al, (%edx)
-# ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
# endif
RETURN
@@ -2849,12 +2777,10 @@ L(bk_write_3bytes):
movw %cx, 1(%edx)
movzbl (%eax), %eax
movb %al, (%edx)
-# ifndef USE_AS_BCOPY
movl DEST(%esp), %eax
-# ifdef USE_AS_MEMPCPY
+# ifdef USE_AS_MEMPCPY
movl LEN(%esp), %ecx
add %ecx, %eax
-# endif
# endif
RETURN_END
diff --git a/sysdeps/i386/memcpy.S b/sysdeps/i386/memcpy.S
index 0eca548e3e..fc197e92a8 100644
--- a/sysdeps/i386/memcpy.S
+++ b/sysdeps/i386/memcpy.S
@@ -24,15 +24,9 @@
# define MEMCPY_CHK __memcpy_chk
#endif
-#ifdef USE_AS_BCOPY
-# define STR2 12
-# define STR1 STR2+4
-# define N STR1+4
-#else
-# define STR1 12
-# define STR2 STR1+4
-# define N STR2+4
-#endif
+#define STR1 12
+#define STR2 STR1+4
+#define N STR2+4
#define CFI_PUSH(REG) \
cfi_adjust_cfa_offset (4); \
@@ -46,7 +40,7 @@
#define POP(REG) popl REG; CFI_POP (REG)
.text
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BCOPY
+#if defined SHARED && IS_IN (libc)
ENTRY (MEMCPY_CHK)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -90,6 +84,4 @@ L(bwd_write_0bytes):
END (MEMCPY)
-#ifndef USE_AS_BCOPY
libc_hidden_builtin_def (MEMCPY)
-#endif
--
2.32.0
* [PATCH v2 04/11] x86_64: Remove bcopy optimizations
2022-02-23 14:09 [PATCH v2 00/11] Remove bcopy and bzero optimizations Adhemerval Zanella
` (2 preceding siblings ...)
2022-02-23 14:09 ` [PATCH v2 03/11] i386: " Adhemerval Zanella
@ 2022-02-23 14:09 ` Adhemerval Zanella
2022-05-12 19:28 ` Sunil Pandey
2022-02-23 14:09 ` [PATCH v2 05/11] alpha: Remove bzero optimization Adhemerval Zanella
` (7 subsequent siblings)
11 siblings, 1 reply; 14+ messages in thread
From: Adhemerval Zanella @ 2022-02-23 14:09 UTC (permalink / raw)
To: libc-alpha
The symbol is not present in the current POSIX specification and the
compiler already generates a memmove call.
---
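A minimal sketch, in C rather than assembly, of the equivalence the
stub removed by this patch relied on: bcopy takes (src, dest, n) while
memmove takes (dest, src, n), so the stub only swapped the first two
arguments before jumping to memmove.  The names legacy_bcopy and
legacy_bcopy_check are hypothetical and are not glibc symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper, not a glibc symbol: bcopy semantics expressed
   through memmove.  Only the order of the first two arguments differs,
   and nothing is returned.  */
static void
legacy_bcopy (const void *src, void *dest, size_t n)
{
  memmove (dest, src, n);
}

/* Overlapping copy: memmove (and therefore the wrapper) must behave as
   if the source were read completely before the destination is
   written.  */
static void
legacy_bcopy_check (void)
{
  char buf[8] = "abcdef";
  legacy_bcopy (buf, buf + 2, 4);
  assert (memcmp (buf, "ababcd", 6) == 0);
}
```

Callers that still use bcopy get the same result by calling memmove
directly with the first two arguments swapped.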
sysdeps/x86_64/multiarch/bcopy.S | 7 -------
1 file changed, 7 deletions(-)
delete mode 100644 sysdeps/x86_64/multiarch/bcopy.S
diff --git a/sysdeps/x86_64/multiarch/bcopy.S b/sysdeps/x86_64/multiarch/bcopy.S
deleted file mode 100644
index 639f02bde3..0000000000
--- a/sysdeps/x86_64/multiarch/bcopy.S
+++ /dev/null
@@ -1,7 +0,0 @@
-#include <sysdep.h>
-
- .text
-ENTRY(bcopy)
- xchg %rdi, %rsi
- jmp __libc_memmove /* Branch to IFUNC memmove. */
-END(bcopy)
--
2.32.0
* [PATCH v2 05/11] alpha: Remove bzero optimization
2022-02-23 14:09 [PATCH v2 00/11] Remove bcopy and bzero optimizations Adhemerval Zanella
` (3 preceding siblings ...)
2022-02-23 14:09 ` [PATCH v2 04/11] x86_64: " Adhemerval Zanella
@ 2022-02-23 14:09 ` Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 06/11] ia64: " Adhemerval Zanella
` (6 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Adhemerval Zanella @ 2022-02-23 14:09 UTC (permalink / raw)
To: libc-alpha
The symbol is not present in the current POSIX specification and the
compiler already generates a memset call.
---
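A minimal sketch of why a generic fallback is sufficient: bzero (s, n)
has the same effect on memory as memset (s, 0, n), the only difference
being that bzero returns nothing.  The names legacy_bzero and
legacy_bzero_check are hypothetical and are not glibc symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper, not a glibc symbol: bzero semantics expressed
   through memset.  The return value of memset is discarded because
   bzero returns void.  */
static void
legacy_bzero (void *s, size_t n)
{
  memset (s, 0, n);
}

/* Only the requested range is cleared; the neighbouring bytes keep
   their values.  */
static void
legacy_bzero_check (void)
{
  unsigned char buf[4] = { 1, 2, 3, 4 };
  legacy_bzero (buf + 1, 2);
  assert (buf[0] == 1 && buf[1] == 0 && buf[2] == 0 && buf[3] == 4);
}
```

Code that still calls bzero can be converted mechanically to
memset (s, 0, n).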
sysdeps/alpha/bzero.S | 109 ------------------------------------------
1 file changed, 109 deletions(-)
delete mode 100644 sysdeps/alpha/bzero.S
diff --git a/sysdeps/alpha/bzero.S b/sysdeps/alpha/bzero.S
deleted file mode 100644
index 4821778622..0000000000
--- a/sysdeps/alpha/bzero.S
+++ /dev/null
@@ -1,109 +0,0 @@
-/* Copyright (C) 1996-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library. If not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Fill a block of memory with zeros. Optimized for the Alpha architecture:
-
- - memory accessed as aligned quadwords only
- - destination memory not read unless needed for good cache behaviour
- - basic blocks arranged to optimize branch prediction for full-quadword
- aligned memory blocks.
- - partial head and tail quadwords constructed with byte-mask instructions
-
- This is generally scheduled for the EV5 (got to look out for my own
- interests :-), but with EV4 needs in mind. There *should* be no more
- stalls for the EV4 than there are for the EV5.
-*/
-
-
-#include <sysdep.h>
-
- .set noat
- .set noreorder
-
- .text
- .type __bzero, @function
- .globl __bzero
- .usepv __bzero, USEPV_PROF
-
- cfi_startproc
-
- /* On entry to this basic block:
- t3 == loop counter
- t4 == bytes in partial final word
- a0 == possibly misaligned destination pointer */
-
- .align 3
-bzero_loop:
- beq t3, $tail #
- blbc t3, 0f # skip single store if count even
-
- stq_u zero, 0(a0) # e0 : store one word
- subq t3, 1, t3 # .. e1 :
- addq a0, 8, a0 # e0 :
- beq t3, $tail # .. e1 :
-
-0: stq_u zero, 0(a0) # e0 : store two words
- subq t3, 2, t3 # .. e1 :
- stq_u zero, 8(a0) # e0 :
- addq a0, 16, a0 # .. e1 :
- bne t3, 0b # e1 :
-
-$tail: bne t4, 1f # is there a tail to do?
- ret # no
-
-1: ldq_u t0, 0(a0) # yes, load original data
- mskqh t0, t4, t0 #
- stq_u t0, 0(a0) #
- ret #
-
-__bzero:
-#ifdef PROF
- ldgp gp, 0(pv)
- lda AT, _mcount
- jsr AT, (AT), _mcount
-#endif
-
- mov a0, v0 # e0 : move return value in place
- beq a1, $done # .. e1 : early exit for zero-length store
- and a0, 7, t1 # e0 :
- addq a1, t1, a1 # e1 : add dest misalignment to count
- srl a1, 3, t3 # e0 : loop = count >> 3
- and a1, 7, t4 # .. e1 : find number of bytes in tail
- unop # :
- beq t1, bzero_loop # e1 : aligned head, jump right in
-
- ldq_u t0, 0(a0) # e0 : load original data to mask into
- cmpult a1, 8, t2 # .. e1 : is this a sub-word set?
- bne t2, $oneq # e1 :
-
- mskql t0, a0, t0 # e0 : we span words. finish this partial
- subq t3, 1, t3 # e0 :
- addq a0, 8, a0 # .. e1 :
- stq_u t0, -8(a0) # e0 :
- br bzero_loop # .. e1 :
-
- .align 3
-$oneq:
- mskql t0, a0, t2 # e0 :
- mskqh t0, a1, t3 # e0 :
- or t2, t3, t0 # e1 :
- stq_u t0, 0(a0) # e0 :
-
-$done: ret
-
- cfi_endproc
-weak_alias (__bzero, bzero)
--
2.32.0
* [PATCH v2 06/11] ia64: Remove bzero optimization
2022-02-23 14:09 [PATCH v2 00/11] Remove bcopy and bzero optimizations Adhemerval Zanella
` (4 preceding siblings ...)
2022-02-23 14:09 ` [PATCH v2 05/11] alpha: Remove bzero optimization Adhemerval Zanella
@ 2022-02-23 14:09 ` Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 07/11] sparc: " Adhemerval Zanella
` (5 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Adhemerval Zanella @ 2022-02-23 14:09 UTC (permalink / raw)
To: libc-alpha
The symbol is not present in the current POSIX specification and the
compiler already generates a memset call.
---
sysdeps/ia64/bzero.S | 312 -------------------------------------------
1 file changed, 312 deletions(-)
delete mode 100644 sysdeps/ia64/bzero.S
diff --git a/sysdeps/ia64/bzero.S b/sysdeps/ia64/bzero.S
deleted file mode 100644
index cd01abb436..0000000000
--- a/sysdeps/ia64/bzero.S
+++ /dev/null
@@ -1,312 +0,0 @@
-/* Optimized version of the standard bzero() function.
- This file is part of the GNU C Library.
- Copyright (C) 2000-2022 Free Software Foundation, Inc.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Return: dest
-
- Inputs:
- in0: dest
- in1: count
-
- The algorithm is fairly straightforward: set byte by byte until we
- we get to a 16B-aligned address, then loop on 128 B chunks using an
- early store as prefetching, then loop on 32B chucks, then clear remaining
- words, finally clear remaining bytes.
- Since a stf.spill f0 can store 16B in one go, we use this instruction
- to get peak speed. */
-
-#include <sysdep.h>
-#undef ret
-
-#define dest in0
-#define cnt in1
-
-#define tmp r31
-#define save_lc r30
-#define ptr0 r29
-#define ptr1 r28
-#define ptr2 r27
-#define ptr3 r26
-#define ptr9 r24
-#define loopcnt r23
-#define linecnt r22
-#define bytecnt r21
-
-// This routine uses only scratch predicate registers (p6 - p15)
-#define p_scr p6 // default register for same-cycle branches
-#define p_unalgn p9
-#define p_y p11
-#define p_n p12
-#define p_yy p13
-#define p_nn p14
-
-#define movi0 mov
-
-#define MIN1 15
-#define MIN1P1HALF 8
-#define LINE_SIZE 128
-#define LSIZE_SH 7 // shift amount
-#define PREF_AHEAD 8
-
-#define USE_FLP
-#if defined(USE_INT)
-#define store st8
-#define myval r0
-#elif defined(USE_FLP)
-#define store stf8
-#define myval f0
-#endif
-
-.align 64
-ENTRY(bzero)
-{ .mmi
- .prologue
- alloc tmp = ar.pfs, 2, 0, 0, 0
- lfetch.nt1 [dest]
- .save ar.lc, save_lc
- movi0 save_lc = ar.lc
-} { .mmi
- .body
- mov ret0 = dest // return value
- nop.m 0
- cmp.eq p_scr, p0 = cnt, r0
-;; }
-{ .mmi
- and ptr2 = -(MIN1+1), dest // aligned address
- and tmp = MIN1, dest // prepare to check for alignment
- tbit.nz p_y, p_n = dest, 0 // Do we have an odd address? (M_B_U)
-} { .mib
- mov ptr1 = dest
- nop.i 0
-(p_scr) br.ret.dpnt.many rp // return immediately if count = 0
-;; }
-{ .mib
- cmp.ne p_unalgn, p0 = tmp, r0
-} { .mib // NB: # of bytes to move is 1
- sub bytecnt = (MIN1+1), tmp // higher than loopcnt
- cmp.gt p_scr, p0 = 16, cnt // is it a minimalistic task?
-(p_scr) br.cond.dptk.many .move_bytes_unaligned // go move just a few (M_B_U)
-;; }
-{ .mmi
-(p_unalgn) add ptr1 = (MIN1+1), ptr2 // after alignment
-(p_unalgn) add ptr2 = MIN1P1HALF, ptr2 // after alignment
-(p_unalgn) tbit.nz.unc p_y, p_n = bytecnt, 3 // should we do a st8 ?
-;; }
-{ .mib
-(p_y) add cnt = -8, cnt
-(p_unalgn) tbit.nz.unc p_yy, p_nn = bytecnt, 2 // should we do a st4 ?
-} { .mib
-(p_y) st8 [ptr2] = r0,-4
-(p_n) add ptr2 = 4, ptr2
-;; }
-{ .mib
-(p_yy) add cnt = -4, cnt
-(p_unalgn) tbit.nz.unc p_y, p_n = bytecnt, 1 // should we do a st2 ?
-} { .mib
-(p_yy) st4 [ptr2] = r0,-2
-(p_nn) add ptr2 = 2, ptr2
-;; }
-{ .mmi
- mov tmp = LINE_SIZE+1 // for compare
-(p_y) add cnt = -2, cnt
-(p_unalgn) tbit.nz.unc p_yy, p_nn = bytecnt, 0 // should we do a st1 ?
-} { .mmi
- nop.m 0
-(p_y) st2 [ptr2] = r0,-1
-(p_n) add ptr2 = 1, ptr2
-;; }
-
-{ .mmi
-(p_yy) st1 [ptr2] = r0
- cmp.gt p_scr, p0 = tmp, cnt // is it a minimalistic task?
-} { .mbb
-(p_yy) add cnt = -1, cnt
-(p_scr) br.cond.dpnt.many .fraction_of_line // go move just a few
-;; }
-{ .mib
- nop.m 0
- shr.u linecnt = cnt, LSIZE_SH
- nop.b 0
-;; }
-
- .align 32
-.l1b: // ------------------// L1B: store ahead into cache lines; fill later
-{ .mmi
- and tmp = -(LINE_SIZE), cnt // compute end of range
- mov ptr9 = ptr1 // used for prefetching
- and cnt = (LINE_SIZE-1), cnt // remainder
-} { .mmi
- mov loopcnt = PREF_AHEAD-1 // default prefetch loop
- cmp.gt p_scr, p0 = PREF_AHEAD, linecnt // check against actual value
-;; }
-{ .mmi
-(p_scr) add loopcnt = -1, linecnt
- add ptr2 = 16, ptr1 // start of stores (beyond prefetch stores)
- add ptr1 = tmp, ptr1 // first address beyond total range
-;; }
-{ .mmi
- add tmp = -1, linecnt // next loop count
- movi0 ar.lc = loopcnt
-;; }
-.pref_l1b:
-{ .mib
- stf.spill [ptr9] = f0, 128 // Do stores one cache line apart
- nop.i 0
- br.cloop.dptk.few .pref_l1b
-;; }
-{ .mmi
- add ptr0 = 16, ptr2 // Two stores in parallel
- movi0 ar.lc = tmp
-;; }
-.l1bx:
- { .mmi
- stf.spill [ptr2] = f0, 32
- stf.spill [ptr0] = f0, 32
- ;; }
- { .mmi
- stf.spill [ptr2] = f0, 32
- stf.spill [ptr0] = f0, 32
- ;; }
- { .mmi
- stf.spill [ptr2] = f0, 32
- stf.spill [ptr0] = f0, 64
- cmp.lt p_scr, p0 = ptr9, ptr1 // do we need more prefetching?
- ;; }
-{ .mmb
- stf.spill [ptr2] = f0, 32
-(p_scr) stf.spill [ptr9] = f0, 128
- br.cloop.dptk.few .l1bx
-;; }
-{ .mib
- cmp.gt p_scr, p0 = 8, cnt // just a few bytes left ?
-(p_scr) br.cond.dpnt.many .move_bytes_from_alignment
-;; }
-
-.fraction_of_line:
-{ .mib
- add ptr2 = 16, ptr1
- shr.u loopcnt = cnt, 5 // loopcnt = cnt / 32
-;; }
-{ .mib
- cmp.eq p_scr, p0 = loopcnt, r0
- add loopcnt = -1, loopcnt
-(p_scr) br.cond.dpnt.many .store_words
-;; }
-{ .mib
- and cnt = 0x1f, cnt // compute the remaining cnt
- movi0 ar.lc = loopcnt
-;; }
- .align 32
-.l2: // -----------------------------// L2A: store 32B in 2 cycles
-{ .mmb
- store [ptr1] = myval, 8
- store [ptr2] = myval, 8
-;; } { .mmb
- store [ptr1] = myval, 24
- store [ptr2] = myval, 24
- br.cloop.dptk.many .l2
-;; }
-.store_words:
-{ .mib
- cmp.gt p_scr, p0 = 8, cnt // just a few bytes left ?
-(p_scr) br.cond.dpnt.many .move_bytes_from_alignment // Branch
-;; }
-
-{ .mmi
- store [ptr1] = myval, 8 // store
- cmp.le p_y, p_n = 16, cnt //
- add cnt = -8, cnt // subtract
-;; }
-{ .mmi
-(p_y) store [ptr1] = myval, 8 // store
-(p_y) cmp.le.unc p_yy, p_nn = 16, cnt
-(p_y) add cnt = -8, cnt // subtract
-;; }
-{ .mmi // store
-(p_yy) store [ptr1] = myval, 8
-(p_yy) add cnt = -8, cnt // subtract
-;; }
-
-.move_bytes_from_alignment:
-{ .mib
- cmp.eq p_scr, p0 = cnt, r0
- tbit.nz.unc p_y, p0 = cnt, 2 // should we terminate with a st4 ?
-(p_scr) br.cond.dpnt.few .restore_and_exit
-;; }
-{ .mib
-(p_y) st4 [ptr1] = r0,4
- tbit.nz.unc p_yy, p0 = cnt, 1 // should we terminate with a st2 ?
-;; }
-{ .mib
-(p_yy) st2 [ptr1] = r0,2
- tbit.nz.unc p_y, p0 = cnt, 0 // should we terminate with a st1 ?
-;; }
-
-{ .mib
-(p_y) st1 [ptr1] = r0
-;; }
-.restore_and_exit:
-{ .mib
- nop.m 0
- movi0 ar.lc = save_lc
- br.ret.sptk.many rp
-;; }
-
-.move_bytes_unaligned:
-{ .mmi
- .pred.rel "mutex",p_y, p_n
- .pred.rel "mutex",p_yy, p_nn
-(p_n) cmp.le p_yy, p_nn = 4, cnt
-(p_y) cmp.le p_yy, p_nn = 5, cnt
-(p_n) add ptr2 = 2, ptr1
-} { .mmi
-(p_y) add ptr2 = 3, ptr1
-(p_y) st1 [ptr1] = r0, 1 // fill 1 (odd-aligned) byte
-(p_y) add cnt = -1, cnt // [15, 14 (or less) left]
-;; }
-{ .mmi
-(p_yy) cmp.le.unc p_y, p0 = 8, cnt
- add ptr3 = ptr1, cnt // prepare last store
- movi0 ar.lc = save_lc
-} { .mmi
-(p_yy) st2 [ptr1] = r0, 4 // fill 2 (aligned) bytes
-(p_yy) st2 [ptr2] = r0, 4 // fill 2 (aligned) bytes
-(p_yy) add cnt = -4, cnt // [11, 10 (o less) left]
-;; }
-{ .mmi
-(p_y) cmp.le.unc p_yy, p0 = 8, cnt
- add ptr3 = -1, ptr3 // last store
- tbit.nz p_scr, p0 = cnt, 1 // will there be a st2 at the end ?
-} { .mmi
-(p_y) st2 [ptr1] = r0, 4 // fill 2 (aligned) bytes
-(p_y) st2 [ptr2] = r0, 4 // fill 2 (aligned) bytes
-(p_y) add cnt = -4, cnt // [7, 6 (or less) left]
-;; }
-{ .mmi
-(p_yy) st2 [ptr1] = r0, 4 // fill 2 (aligned) bytes
-(p_yy) st2 [ptr2] = r0, 4 // fill 2 (aligned) bytes
- // [3, 2 (or less) left]
- tbit.nz p_y, p0 = cnt, 0 // will there be a st1 at the end ?
-} { .mmi
-(p_yy) add cnt = -4, cnt
-;; }
-{ .mmb
-(p_scr) st2 [ptr1] = r0 // fill 2 (aligned) bytes
-(p_y) st1 [ptr3] = r0 // fill last byte (using ptr3)
- br.ret.sptk.many rp
-;; }
-END(bzero)
--
2.32.0
* [PATCH v2 07/11] sparc: Remove bzero optimization
2022-02-23 14:09 [PATCH v2 00/11] Remove bcopy and bzero optimizations Adhemerval Zanella
` (5 preceding siblings ...)
2022-02-23 14:09 ` [PATCH v2 06/11] ia64: " Adhemerval Zanella
@ 2022-02-23 14:09 ` Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 08/11] powerpc: Remove powerpc32 bzero optimizations Adhemerval Zanella
` (4 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Adhemerval Zanella @ 2022-02-23 14:09 UTC (permalink / raw)
To: libc-alpha
The symbol is not present in the current POSIX specification and the
compiler already generates a memset call.
---
sysdeps/sparc/sparc32/bzero.c | 1 -
sysdeps/sparc/sparc32/memset.S | 37 ++++++++-----------
sysdeps/sparc/sparc32/sparcv9/bzero.c | 1 -
.../sparc/sparc32/sparcv9/multiarch/bzero.c | 1 -
.../sparc32/sparcv9/multiarch/memset-ultra1.S | 1 -
sysdeps/sparc/sparc64/bzero.c | 1 -
sysdeps/sparc/sparc64/memset.S | 30 ++++++---------
sysdeps/sparc/sparc64/multiarch/bzero.c | 33 -----------------
.../sparc/sparc64/multiarch/ifunc-impl-list.c | 9 -----
.../sparc/sparc64/multiarch/ifunc-memset.h | 2 +-
.../sparc/sparc64/multiarch/memset-niagara1.S | 5 +--
.../sparc/sparc64/multiarch/memset-niagara4.S | 6 +--
.../sparc/sparc64/multiarch/memset-niagara7.S | 7 ----
.../sparc/sparc64/multiarch/memset-ultra1.S | 1 -
14 files changed, 30 insertions(+), 105 deletions(-)
delete mode 100644 sysdeps/sparc/sparc32/bzero.c
delete mode 100644 sysdeps/sparc/sparc32/sparcv9/bzero.c
delete mode 100644 sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c
delete mode 100644 sysdeps/sparc/sparc64/bzero.c
delete mode 100644 sysdeps/sparc/sparc64/multiarch/bzero.c
diff --git a/sysdeps/sparc/sparc32/bzero.c b/sysdeps/sparc/sparc32/bzero.c
deleted file mode 100644
index 37f0f6f993..0000000000
--- a/sysdeps/sparc/sparc32/bzero.c
+++ /dev/null
@@ -1 +0,0 @@
-/* bzero is in memset.S */
diff --git a/sysdeps/sparc/sparc32/memset.S b/sysdeps/sparc/sparc32/memset.S
index d222fa7506..b1b67cb2d1 100644
--- a/sysdeps/sparc/sparc32/memset.S
+++ b/sysdeps/sparc/sparc32/memset.S
@@ -42,25 +42,6 @@
.text
.align 4
-ENTRY(__bzero)
- b 1f
- mov %g0, %g3
-
-3: cmp %o2, 3
- be 2f
- stb %g3, [%o0]
-
- cmp %o2, 2
- be 2f
- stb %g3, [%o0 + 0x01]
-
- stb %g3, [%o0 + 0x02]
-2: sub %o2, 4, %o2
- add %o1, %o2, %o1
- b 4f
- sub %o0, %o2, %o0
-END(__bzero)
-
ENTRY(memset)
and %o1, 0xff, %g3
sll %g3, 8, %g2
@@ -73,7 +54,7 @@ ENTRY(memset)
mov %o0, %g1
andcc %o0, 3, %o2
- bne 3b
+ bne 3f
4: andcc %o0, 4, %g0
be 2f
@@ -146,7 +127,19 @@ ENTRY(memset)
stb %g3, [%o0 + 6]
0: retl
nop
+
+3: cmp %o2, 3
+ be 2f
+ stb %g3, [%o0]
+
+ cmp %o2, 2
+ be 2f
+ stb %g3, [%o0 + 0x01]
+
+ stb %g3, [%o0 + 0x02]
+2: sub %o2, 4, %o2
+ add %o1, %o2, %o1
+ b 4b
+ sub %o0, %o2, %o0
END(memset)
libc_hidden_builtin_def (memset)
-
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/sparc/sparc32/sparcv9/bzero.c b/sysdeps/sparc/sparc32/sparcv9/bzero.c
deleted file mode 100644
index 37f0f6f993..0000000000
--- a/sysdeps/sparc/sparc32/sparcv9/bzero.c
+++ /dev/null
@@ -1 +0,0 @@
-/* bzero is in memset.S */
diff --git a/sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c b/sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c
deleted file mode 100644
index cf6803ef44..0000000000
--- a/sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c
+++ /dev/null
@@ -1 +0,0 @@
-#include <sysdeps/sparc/sparc64/multiarch/bzero.c>
diff --git a/sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S b/sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S
index 6038611134..2dda6f1ed6 100644
--- a/sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S
+++ b/sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S
@@ -25,6 +25,5 @@
# define weak_alias(x, y)
# define memset __memset_ultra1
-# define __bzero __bzero_ultra1
# include <sysdeps/sparc/sparc32/sparcv9/memset.S>
#endif
diff --git a/sysdeps/sparc/sparc64/bzero.c b/sysdeps/sparc/sparc64/bzero.c
deleted file mode 100644
index 37f0f6f993..0000000000
--- a/sysdeps/sparc/sparc64/bzero.c
+++ /dev/null
@@ -1 +0,0 @@
-/* bzero is in memset.S */
diff --git a/sysdeps/sparc/sparc64/memset.S b/sysdeps/sparc/sparc64/memset.S
index a7f8361fa3..33ecbc93fe 100644
--- a/sysdeps/sparc/sparc64/memset.S
+++ b/sysdeps/sparc/sparc64/memset.S
@@ -31,6 +31,16 @@
stx source, [base - offset - 0x08]; \
stx source, [base - offset - 0x00];
+#define ZERO_BLOCKS(base, offset, source) \
+ stx source, [base - offset - 0x38]; \
+ stx source, [base - offset - 0x30]; \
+ stx source, [base - offset - 0x28]; \
+ stx source, [base - offset - 0x20]; \
+ stx source, [base - offset - 0x18]; \
+ stx source, [base - offset - 0x10]; \
+ stx source, [base - offset - 0x08]; \
+ stx source, [base - offset - 0x00];
+
/* Well, memset is a lot easier to get right than bcopy... */
.text
.align 32
@@ -174,22 +184,7 @@ ENTRY(memset)
nop
ba,pt %xcc, 18b
ldd [%o0], %f0
-END(memset)
-libc_hidden_builtin_def (memset)
-#define ZERO_BLOCKS(base, offset, source) \
- stx source, [base - offset - 0x38]; \
- stx source, [base - offset - 0x30]; \
- stx source, [base - offset - 0x28]; \
- stx source, [base - offset - 0x20]; \
- stx source, [base - offset - 0x18]; \
- stx source, [base - offset - 0x10]; \
- stx source, [base - offset - 0x08]; \
- stx source, [base - offset - 0x00];
-
- .text
- .align 32
-ENTRY(__bzero)
#ifndef USE_BPR
srl %o1, 0, %o1
#endif
@@ -307,6 +302,5 @@ ENTRY(__bzero)
stb %g0, [%o0 - 1]
0: retl
mov %o5, %o0
-END(__bzero)
-
-weak_alias (__bzero, bzero)
+END(memset)
+libc_hidden_builtin_def (memset)
diff --git a/sysdeps/sparc/sparc64/multiarch/bzero.c b/sysdeps/sparc/sparc64/multiarch/bzero.c
deleted file mode 100644
index 409d66a864..0000000000
--- a/sysdeps/sparc/sparc64/multiarch/bzero.c
+++ /dev/null
@@ -1,33 +0,0 @@
-/* Multiple versions of bzero. SPARC64/Linux version.
- All versions must be listed in ifunc-impl-list.c.
- Copyright (C) 2017-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#if IS_IN (libc)
-# define bzero __redirect_bzero
-# include <string.h>
-# undef bzero
-
-# include <sparc-ifunc.h>
-
-# define SYMBOL_NAME bzero
-# include "ifunc-memset.h"
-
-sparc_libc_ifunc_redirected (__redirect_bzero, __bzero, IFUNC_SELECTOR)
-weak_alias (__bzero, bzero)
-
-#endif
diff --git a/sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c b/sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
index 05926e605b..9be12f9130 100644
--- a/sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
@@ -61,15 +61,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
__mempcpy_ultra3)
IFUNC_IMPL_ADD (array, i, mempcpy, 1, __mempcpy_ultra1));
- IFUNC_IMPL (i, name, bzero,
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & HWCAP_SPARC_ADP,
- __bzero_niagara7)
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & HWCAP_SPARC_CRYPTO,
- __bzero_niagara4)
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & HWCAP_SPARC_BLKINIT,
- __bzero_niagara1)
- IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ultra1));
-
IFUNC_IMPL (i, name, memset,
IFUNC_IMPL_ADD (array, i, memset, hwcap & HWCAP_SPARC_ADP,
__memset_niagara7)
diff --git a/sysdeps/sparc/sparc64/multiarch/ifunc-memset.h b/sysdeps/sparc/sparc64/multiarch/ifunc-memset.h
index 56893b6883..0a2f16b3f1 100644
--- a/sysdeps/sparc/sparc64/multiarch/ifunc-memset.h
+++ b/sysdeps/sparc/sparc64/multiarch/ifunc-memset.h
@@ -1,4 +1,4 @@
-/* Common definition for memset/bzero implementation.
+/* Common definition for memset implementation.
All versions must be listed in ifunc-impl-list.c.
Copyright (C) 2017-2022 Free Software Foundation, Inc.
This file is part of the GNU C Library.
diff --git a/sysdeps/sparc/sparc64/multiarch/memset-niagara1.S b/sysdeps/sparc/sparc64/multiarch/memset-niagara1.S
index 13432effc1..7865691eca 100644
--- a/sysdeps/sparc/sparc64/multiarch/memset-niagara1.S
+++ b/sysdeps/sparc/sparc64/multiarch/memset-niagara1.S
@@ -45,9 +45,6 @@ ENTRY(__memset_niagara1)
sllx %o2, 32, %g1
ba,pt %XCC, 1f
or %g1, %o2, %o2
-END(__memset_niagara1)
-
-ENTRY(__bzero_niagara1)
clr %o2
1:
# ifndef USE_BRP
@@ -171,6 +168,6 @@ ENTRY(__bzero_niagara1)
90:
retl
mov %o3, %o0
-END(__bzero_niagara1)
+END(__memset_niagara1)
#endif
diff --git a/sysdeps/sparc/sparc64/multiarch/memset-niagara4.S b/sysdeps/sparc/sparc64/multiarch/memset-niagara4.S
index 1ccf24e516..d6fbd83009 100644
--- a/sysdeps/sparc/sparc64/multiarch/memset-niagara4.S
+++ b/sysdeps/sparc/sparc64/multiarch/memset-niagara4.S
@@ -39,10 +39,6 @@ ENTRY(__memset_niagara4)
sllx %o2, 32, %g1
ba,pt %icc, 1f
or %g1, %o2, %o4
-END(__memset_niagara4)
-
- .align 32
-ENTRY(__bzero_niagara4)
clr %o4
1: cmp %o1, 16
ble %icc, .Ltiny
@@ -118,6 +114,6 @@ ENTRY(__bzero_niagara4)
bne,pt %icc, 1b
add %o0, 0x30, %o0
ba,a,pt %icc, .Lpostloop
-END(__bzero_niagara4)
+END(__memset_niagara4)
#endif
diff --git a/sysdeps/sparc/sparc64/multiarch/memset-niagara7.S b/sysdeps/sparc/sparc64/multiarch/memset-niagara7.S
index 491b203ff9..6fcbf56675 100644
--- a/sysdeps/sparc/sparc64/multiarch/memset-niagara7.S
+++ b/sysdeps/sparc/sparc64/multiarch/memset-niagara7.S
@@ -99,13 +99,6 @@
.text
.align 32
-ENTRY(__bzero_niagara7)
- /* bzero (dst, size) */
- mov %o1, %o2
- mov 0, %o1
- /* fall through into memset code */
-END(__bzero_niagara7)
-
ENTRY(__memset_niagara7)
/* memset (src, c, size) */
mov %o0, %o5 /* copy sp1 before using it */
diff --git a/sysdeps/sparc/sparc64/multiarch/memset-ultra1.S b/sysdeps/sparc/sparc64/multiarch/memset-ultra1.S
index e0d3424307..3c3add791e 100644
--- a/sysdeps/sparc/sparc64/multiarch/memset-ultra1.S
+++ b/sysdeps/sparc/sparc64/multiarch/memset-ultra1.S
@@ -25,6 +25,5 @@
# define weak_alias(x, y)
# define memset __memset_ultra1
-# define __bzero __bzero_ultra1
# include <sysdeps/sparc/sparc64/memset.S>
#endif
--
2.32.0
* [PATCH v2 08/11] powerpc: Remove powerpc32 bzero optimizations
2022-02-23 14:09 [PATCH v2 00/11] Remove bcopy and bzero optimizations Adhemerval Zanella
` (6 preceding siblings ...)
2022-02-23 14:09 ` [PATCH v2 07/11] sparc: " Adhemerval Zanella
@ 2022-02-23 14:09 ` Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 09/11] powerpc: Remove powerpc64 " Adhemerval Zanella
` (3 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Adhemerval Zanella @ 2022-02-23 14:09 UTC (permalink / raw)
To: libc-alpha
The symbol is not present in the current POSIX specification, and the
compiler already replaces it with a memset call.
---
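For reference, the stubs removed below only move the length into the
third argument register, load zero as the fill byte, and branch to
memset. A minimal C sketch of that behaviour follows; bzero_sketch is a
hypothetical name used for illustration and is not a glibc symbol.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical C equivalent of the removed assembly stubs:
   mr r5,r4   -> the length becomes memset's third argument
   li r4,0    -> the fill byte is zero
   b  memset  -> tail call into memset.  */
static void
bzero_sketch (void *s, size_t n)
{
  memset (s, 0, n);
}
```

Callers that still use bzero (p, n) can be written as memset (p, 0, n)
with the same effect.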
sysdeps/powerpc/powerpc32/bzero.S | 27 --------------
.../powerpc32/power4/multiarch/Makefile | 4 +-
.../powerpc32/power4/multiarch/bzero-power6.S | 25 -------------
.../powerpc32/power4/multiarch/bzero-power7.S | 25 -------------
.../powerpc32/power4/multiarch/bzero-ppc32.S | 34 -----------------
.../powerpc32/power4/multiarch/bzero.c | 37 -------------------
.../power4/multiarch/ifunc-impl-list.c | 8 ----
7 files changed, 2 insertions(+), 158 deletions(-)
delete mode 100644 sysdeps/powerpc/powerpc32/bzero.S
delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S
delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S
delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S
delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c
diff --git a/sysdeps/powerpc/powerpc32/bzero.S b/sysdeps/powerpc/powerpc32/bzero.S
deleted file mode 100644
index 9cc03c92df..0000000000
--- a/sysdeps/powerpc/powerpc32/bzero.S
+++ /dev/null
@@ -1,27 +0,0 @@
-/* Optimized bzero `implementation' for PowerPC.
- Copyright (C) 1997-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <sysdep.h>
-
-ENTRY (__bzero)
-
- mr r5,r4
- li r4,0
- b memset@local
-END (__bzero)
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/Makefile b/sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
index 5c68f07d19..b2f9deefb8 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
@@ -1,8 +1,8 @@
ifeq ($(subdir),string)
sysdep_routines += memcpy-power7 memcpy-a2 memcpy-power6 memcpy-cell \
memcpy-ppc32 memcmp-power7 memcmp-ppc32 memset-power7 \
- memset-power6 memset-ppc32 bzero-power7 bzero-power6 \
- bzero-ppc32 mempcpy-power7 mempcpy-ppc32 memchr-power7 \
+ memset-power6 memset-ppc32 \
+ mempcpy-power7 mempcpy-ppc32 memchr-power7 \
memchr-ppc32 memrchr-power7 memrchr-ppc32 rawmemchr-power7 \
rawmemchr-ppc32 strlen-power7 strlen-ppc32 strnlen-power7 \
strnlen-ppc32 strncmp-power7 strncmp-ppc32 \
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S b/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S
deleted file mode 100644
index b352433283..0000000000
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S
+++ /dev/null
@@ -1,25 +0,0 @@
-/* Optimized bzero implementation for PowerPC32/POWER6.
- Copyright (C) 2010-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <sysdep.h>
-
-ENTRY (__bzero_power6)
- mr r5,r4
- li r4,0
- b __memset_power6@local
-END (__bzero_power6)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S b/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S
deleted file mode 100644
index 80c8ffe55a..0000000000
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S
+++ /dev/null
@@ -1,25 +0,0 @@
-/* Optimized bzero implementation for PowerPC32/POWER7.
- Copyright (C) 2010-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <sysdep.h>
-
-ENTRY (__bzero_power7)
- mr r5,r4
- li r4,0
- b __memset_power7@local
-END (__bzero_power7)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S b/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S
deleted file mode 100644
index 86711e8e22..0000000000
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S
+++ /dev/null
@@ -1,34 +0,0 @@
-/* Optimized bzero implementation for PowerPC32/PPC32.
- Copyright (C) 2010-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <sysdep.h>
-
-/* memset ifunc selector is not built for static and memset@local
- for shared builds makes the linker point the call to the ifunc
- selector. */
-#ifdef SHARED
-# define MEMSET __memset_ppc
-#else
-# define MEMSET memset
-#endif
-
-ENTRY (__bzero_ppc)
- mr r5,r4
- li r4,0
- b MEMSET@local
-END (__bzero_ppc)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c b/sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c
deleted file mode 100644
index 5d9270289f..0000000000
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c
+++ /dev/null
@@ -1,37 +0,0 @@
-/* Multiple versions of bzero.
- Copyright (C) 2013-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Define multiple versions only for definition in libc. */
-#if IS_IN (libc)
-# include <string.h>
-# include <strings.h>
-# include "init-arch.h"
-
-extern __typeof (bzero) __bzero_ppc attribute_hidden;
-extern __typeof (bzero) __bzero_power6 attribute_hidden;
-extern __typeof (bzero) __bzero_power7 attribute_hidden;
-
-libc_ifunc (__bzero,
- (hwcap & PPC_FEATURE_HAS_VSX)
- ? __bzero_power7 :
- (hwcap & PPC_FEATURE_ARCH_2_05)
- ? __bzero_power6
- : __bzero_ppc);
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
index 9832f366bb..01890367a4 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
@@ -73,14 +73,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
__memset_power6)
IFUNC_IMPL_ADD (array, i, memset, 1, __memset_ppc))
- /* Support sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c. */
- IFUNC_IMPL (i, name, bzero,
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_HAS_VSX,
- __bzero_power7)
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_ARCH_2_05,
- __bzero_power6)
- IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ppc))
-
/* Support sysdeps/powerpc/powerpc32/power4/multiarch/strlen.c. */
IFUNC_IMPL (i, name, strlen,
IFUNC_IMPL_ADD (array, i, strlen, hwcap & PPC_FEATURE_HAS_VSX,
--
2.32.0
* [PATCH v2 09/11] powerpc: Remove powerpc64 bzero optimizations
2022-02-23 14:09 [PATCH v2 00/11] Remove bcopy and bzero optimizations Adhemerval Zanella
` (7 preceding siblings ...)
2022-02-23 14:09 ` [PATCH v2 08/11] powerpc: Remove powerpc32 bzero optimizations Adhemerval Zanella
@ 2022-02-23 14:09 ` Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 10/11] s390: Remove " Adhemerval Zanella
` (2 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Adhemerval Zanella @ 2022-02-23 14:09 UTC (permalink / raw)
To: libc-alpha
The symbol is not present in the current POSIX specification, and the
compiler already replaces it with a memset call.
---
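The removed bzero.c picks an implementation with a cascade of hwcap
checks, ordered from the most to the least capable CPU. The sketch
below shows the same pattern in plain C under simplifying assumptions:
the FEAT_* bits, the three variants, and select_memset are hypothetical
names, and the real selector uses the PPC_FEATURE_* masks with more
tiers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits; the real PPC_FEATURE_* values differ.  */
#define FEAT_HAS_VSX   (1UL << 0)
#define FEAT_ARCH_2_05 (1UL << 1)

typedef void *(*memset_fn) (void *, int, size_t);

/* Stand-ins for the per-CPU variants; each one just calls memset here.  */
static void *memset_power7 (void *s, int c, size_t n) { return memset (s, c, n); }
static void *memset_power6 (void *s, int c, size_t n) { return memset (s, c, n); }
static void *memset_generic (void *s, int c, size_t n) { return memset (s, c, n); }

/* Checks run from most to least capable; the first match wins and the
   generic variant is the fallback.  */
static memset_fn
select_memset (unsigned long hwcap)
{
  if (hwcap & FEAT_HAS_VSX)
    return memset_power7;
  if (hwcap & FEAT_ARCH_2_05)
    return memset_power6;
  return memset_generic;
}
```

With the bzero selector gone, only the memset selector of this shape
remains, so there is one fewer dispatch table to keep in sync with
ifunc-impl-list.c.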
sysdeps/powerpc/powerpc64/bzero.S | 20 -------
sysdeps/powerpc/powerpc64/le/power10/memset.S | 12 -----
sysdeps/powerpc/powerpc64/memset.S | 13 -----
sysdeps/powerpc/powerpc64/multiarch/bzero.c | 54 -------------------
.../powerpc64/multiarch/ifunc-impl-list.c | 21 --------
.../powerpc64/multiarch/memset-power10.S | 3 --
.../powerpc64/multiarch/memset-power4.S | 3 --
.../powerpc64/multiarch/memset-power6.S | 3 --
.../powerpc64/multiarch/memset-power7.S | 2 -
.../powerpc64/multiarch/memset-power8.S | 3 --
.../powerpc64/multiarch/memset-ppc64.S | 16 +-----
sysdeps/powerpc/powerpc64/power4/memset.S | 12 -----
sysdeps/powerpc/powerpc64/power6/memset.S | 12 -----
sysdeps/powerpc/powerpc64/power7/memset.S | 12 -----
sysdeps/powerpc/powerpc64/power8/memset.S | 12 -----
15 files changed, 1 insertion(+), 197 deletions(-)
delete mode 100644 sysdeps/powerpc/powerpc64/bzero.S
delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bzero.c
diff --git a/sysdeps/powerpc/powerpc64/bzero.S b/sysdeps/powerpc/powerpc64/bzero.S
deleted file mode 100644
index a7ca73cc39..0000000000
--- a/sysdeps/powerpc/powerpc64/bzero.S
+++ /dev/null
@@ -1,20 +0,0 @@
-/* Optimized bzero `implementation' for PowerPC64.
- Copyright (C) 1997-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* This code was moved into memset.S to solve a double stub call problem.
- @local would have worked but it is not supported in PowerPC64 asm. */
diff --git a/sysdeps/powerpc/powerpc64/le/power10/memset.S b/sysdeps/powerpc/powerpc64/le/power10/memset.S
index bee6d8b31b..0f43b002bf 100644
--- a/sysdeps/powerpc/powerpc64/le/power10/memset.S
+++ b/sysdeps/powerpc/powerpc64/le/power10/memset.S
@@ -242,15 +242,3 @@ L(bcdz_tail):
END_GEN_TB (MEMSET,TB_TOCLESS)
libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. */
-ENTRY_TOCLESS (__bzero)
- CALL_MCOUNT 2
- mr r5,r4
- li r4,0
- b L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/memset.S b/sysdeps/powerpc/powerpc64/memset.S
index 34ee8ffca4..b813cd3c6b 100644
--- a/sysdeps/powerpc/powerpc64/memset.S
+++ b/sysdeps/powerpc/powerpc64/memset.S
@@ -253,16 +253,3 @@ L(medium_28t):
blr
END_GEN_TB (MEMSET,TB_TOCLESS)
libc_hidden_builtin_def (memset)
-
-#ifndef NO_BZERO_IMPL
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. */
-ENTRY (__bzero)
- CALL_MCOUNT 3
- mr r5,r4
- li r4,0
- b L(_memset)
-END_GEN_TB (__bzero,TB_TOCLESS)
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bzero.c b/sysdeps/powerpc/powerpc64/multiarch/bzero.c
deleted file mode 100644
index f83d6da55b..0000000000
--- a/sysdeps/powerpc/powerpc64/multiarch/bzero.c
+++ /dev/null
@@ -1,54 +0,0 @@
-/* Multiple versions of bzero. PowerPC64 version.
- Copyright (C) 2013-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Define multiple versions only for definition in libc. */
-#if IS_IN (libc)
-# include <string.h>
-# include <strings.h>
-# include "init-arch.h"
-
-extern __typeof (bzero) __bzero_ppc attribute_hidden;
-extern __typeof (bzero) __bzero_power4 attribute_hidden;
-extern __typeof (bzero) __bzero_power6 attribute_hidden;
-extern __typeof (bzero) __bzero_power7 attribute_hidden;
-extern __typeof (bzero) __bzero_power8 attribute_hidden;
-# ifdef __LITTLE_ENDIAN__
-extern __typeof (bzero) __bzero_power10 attribute_hidden;
-# endif
-
-libc_ifunc (__bzero,
-# ifdef __LITTLE_ENDIAN__
- (hwcap2 & PPC_FEATURE2_ARCH_3_1
- && hwcap2 & PPC_FEATURE2_HAS_ISEL
- && hwcap & PPC_FEATURE_HAS_VSX)
- ? __bzero_power10 :
-# endif
- (hwcap2 & PPC_FEATURE2_ARCH_2_07
- && hwcap & PPC_FEATURE_HAS_ALTIVEC)
- ? __bzero_power8 :
- (hwcap & PPC_FEATURE_HAS_VSX)
- ? __bzero_power7 :
- (hwcap & PPC_FEATURE_ARCH_2_05
- && hwcap & PPC_FEATURE_HAS_ALTIVEC)
- ? __bzero_power6 :
- (hwcap & PPC_FEATURE_POWER4)
- ? __bzero_power4
- : __bzero_ppc);
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index 280b8616b2..ac533a9886 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -223,27 +223,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
__memcmp_power4)
IFUNC_IMPL_ADD (array, i, memcmp, 1, __memcmp_ppc))
- /* Support sysdeps/powerpc/powerpc64/multiarch/bzero.c. */
- IFUNC_IMPL (i, name, bzero,
-#ifdef __LITTLE_ENDIAN__
- IFUNC_IMPL_ADD (array, i, bzero,
- hwcap2 & PPC_FEATURE2_ARCH_3_1
- && hwcap2 & PPC_FEATURE2_HAS_ISEL
- && hwcap & PPC_FEATURE_HAS_VSX,
- __bzero_power10)
-#endif
- IFUNC_IMPL_ADD (array, i, bzero, hwcap2 & PPC_FEATURE2_ARCH_2_07
- && hwcap & PPC_FEATURE_HAS_ALTIVEC,
- __bzero_power8)
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_HAS_VSX,
- __bzero_power7)
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_ARCH_2_05
- && hwcap & PPC_FEATURE_HAS_ALTIVEC,
- __bzero_power6)
- IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_POWER4,
- __bzero_power4)
- IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ppc))
-
/* Support sysdeps/powerpc/powerpc64/multiarch/mempcpy.c. */
IFUNC_IMPL (i, name, mempcpy,
IFUNC_IMPL_ADD (array, i, mempcpy,
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S
index ead0b67926..ba5bee1c7a 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S
@@ -21,7 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bzero
-#define __bzero __bzero_power10
-
#include <sysdeps/powerpc/powerpc64/le/power10/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power4.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power4.S
index 6f5631d03d..4ee567c6f9 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power4.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power4.S
@@ -21,7 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bzero
-#define __bzero __bzero_power4
-
#include <sysdeps/powerpc/powerpc64/power4/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power6.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power6.S
index b81f4f0d64..9f5e7d1b37 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power6.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power6.S
@@ -21,7 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bzero
-#define __bzero __bzero_power6
-
#include <sysdeps/powerpc/powerpc64/power6/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power7.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power7.S
index a8ca12db83..6fd92d5afc 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power7.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power7.S
@@ -21,6 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bzero
-#define __bzero __bzero_power7
#include <sysdeps/powerpc/powerpc64/power7/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S
index b06587aa2d..43cc5c7339 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S
@@ -21,7 +21,4 @@
#undef libc_hidden_builtin_def
#define libc_hidden_builtin_def(name)
-#undef __bzero
-#define __bzero __bzero_power8
-
#include <sysdeps/powerpc/powerpc64/power8/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S b/sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S
index 876954d36b..30b25ef15f 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S
@@ -1,4 +1,4 @@
-/* Default memset/bzero implementation for PowerPC64.
+/* Default memset implementation for PowerPC64.
Copyright (C) 2013-2022 Free Software Foundation, Inc.
This file is part of the GNU C Library.
@@ -18,17 +18,6 @@
#include <sysdep.h>
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. NOTE: this code should be positioned
- before ENTRY/END_GEN_TB redefinition. */
-ENTRY (__bzero_ppc)
- CALL_MCOUNT 3
- mr r5,r4
- li r4,0
- b L(_memset)
-END_GEN_TB (__bzero_ppc,TB_TOCLESS)
-
-
#if defined SHARED && IS_IN (libc)
# define MEMSET __memset_ppc
@@ -36,7 +25,4 @@ END_GEN_TB (__bzero_ppc,TB_TOCLESS)
# define libc_hidden_builtin_def(name)
#endif
-/* Do not implement __bzero at powerpc64/memset.S. */
-#define NO_BZERO_IMPL
-
#include <sysdeps/powerpc/powerpc64/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/power4/memset.S b/sysdeps/powerpc/powerpc64/power4/memset.S
index dfc136261b..0f14a5198a 100644
--- a/sysdeps/powerpc/powerpc64/power4/memset.S
+++ b/sysdeps/powerpc/powerpc64/power4/memset.S
@@ -237,15 +237,3 @@ L(medium_28t):
blr
END_GEN_TB (MEMSET,TB_TOCLESS)
libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. */
-ENTRY_TOCLESS (__bzero)
- CALL_MCOUNT 3
- mr r5,r4
- li r4,0
- b L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/power6/memset.S b/sysdeps/powerpc/powerpc64/power6/memset.S
index 7ad82c38e6..140a756348 100644
--- a/sysdeps/powerpc/powerpc64/power6/memset.S
+++ b/sysdeps/powerpc/powerpc64/power6/memset.S
@@ -381,15 +381,3 @@ L(medium_28t):
blr
END_GEN_TB (MEMSET,TB_TOCLESS)
libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. */
-ENTRY_TOCLESS (__bzero)
- CALL_MCOUNT 3
- mr r5,r4
- li r4,0
- b L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/power7/memset.S b/sysdeps/powerpc/powerpc64/power7/memset.S
index 31aa0f91cf..358199a805 100644
--- a/sysdeps/powerpc/powerpc64/power7/memset.S
+++ b/sysdeps/powerpc/powerpc64/power7/memset.S
@@ -384,15 +384,3 @@ L(small):
END_GEN_TB (MEMSET,TB_TOCLESS)
libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. */
-ENTRY_TOCLESS (__bzero)
- CALL_MCOUNT 3
- mr r5,r4
- li r4,0
- b L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/power8/memset.S b/sysdeps/powerpc/powerpc64/power8/memset.S
index 9ecb6f3067..70cace14ef 100644
--- a/sysdeps/powerpc/powerpc64/power8/memset.S
+++ b/sysdeps/powerpc/powerpc64/power8/memset.S
@@ -504,15 +504,3 @@ L(LE7_tail5):
END_GEN_TB (MEMSET,TB_TOCLESS)
libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
- between bzero and memset. */
-ENTRY_TOCLESS (__bzero)
- CALL_MCOUNT 3
- mr r5,r4
- li r4,0
- b L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
--
2.32.0
* [PATCH v2 10/11] s390: Remove bzero optimizations
2022-02-23 14:09 [PATCH v2 00/11] Remove bcopy and bzero optimizations Adhemerval Zanella
` (8 preceding siblings ...)
2022-02-23 14:09 ` [PATCH v2 09/11] powerpc: Remove powerpc64 " Adhemerval Zanella
@ 2022-02-23 14:09 ` Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 11/11] i686: " Adhemerval Zanella
2022-02-23 14:13 ` [PATCH v2 00/11] Remove bcopy and " Adhemerval Zanella
11 siblings, 0 replies; 14+ messages in thread
From: Adhemerval Zanella @ 2022-02-23 14:09 UTC (permalink / raw)
To: libc-alpha
The symbol is not present in the current POSIX specification, and the
compiler already replaces it with a memset call.
---
sysdeps/s390/Makefile | 2 +-
sysdeps/s390/bzero.c | 47 ------------------------
sysdeps/s390/ifunc-memset.h | 9 -----
sysdeps/s390/memset-z900.S | 32 +---------------
sysdeps/s390/multiarch/ifunc-impl-list.c | 15 --------
5 files changed, 2 insertions(+), 103 deletions(-)
delete mode 100644 sysdeps/s390/bzero.c
diff --git a/sysdeps/s390/Makefile b/sysdeps/s390/Makefile
index ade8663218..5b6a96579c 100644
--- a/sysdeps/s390/Makefile
+++ b/sysdeps/s390/Makefile
@@ -66,7 +66,7 @@ endif
endif
ifeq ($(subdir),string)
-sysdep_routines += bzero memset memset-z900 \
+sysdep_routines += memset memset-z900 \
memcmp memcmp-z900 \
mempcpy memcpy memcpy-z900 \
memmove memmove-c \
diff --git a/sysdeps/s390/bzero.c b/sysdeps/s390/bzero.c
deleted file mode 100644
index 1f0a03e2ed..0000000000
--- a/sysdeps/s390/bzero.c
+++ /dev/null
@@ -1,47 +0,0 @@
-/* Multiple versions of bzero.
- Copyright (C) 2018-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <ifunc-memset.h>
-#if HAVE_MEMSET_IFUNC
-# include <string.h>
-# include <ifunc-resolve.h>
-
-# if HAVE_MEMSET_Z900_G5
-extern __typeof (__bzero) BZERO_Z900_G5 attribute_hidden;
-# endif
-
-# if HAVE_MEMSET_Z10
-extern __typeof (__bzero) BZERO_Z10 attribute_hidden;
-# endif
-
-# if HAVE_MEMSET_Z196
-extern __typeof (__bzero) BZERO_Z196 attribute_hidden;
-# endif
-
-s390_libc_ifunc_expr (__bzero, __bzero,
- ({
- s390_libc_ifunc_expr_stfle_init ();
- (HAVE_MEMSET_Z196 && S390_IS_Z196 (stfle_bits))
- ? BZERO_Z196
- : (HAVE_MEMSET_Z10 && S390_IS_Z10 (stfle_bits))
- ? BZERO_Z10
- : BZERO_DEFAULT;
- })
- )
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/s390/ifunc-memset.h b/sysdeps/s390/ifunc-memset.h
index db15df9bc1..7098332e92 100644
--- a/sysdeps/s390/ifunc-memset.h
+++ b/sysdeps/s390/ifunc-memset.h
@@ -25,19 +25,16 @@
#if defined HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT
# define MEMSET_DEFAULT MEMSET_Z196
-# define BZERO_DEFAULT BZERO_Z196
# define HAVE_MEMSET_Z900_G5 0
# define HAVE_MEMSET_Z10 0
# define HAVE_MEMSET_Z196 1
#elif defined HAVE_S390_MIN_Z10_ZARCH_ASM_SUPPORT
# define MEMSET_DEFAULT MEMSET_Z10
-# define BZERO_DEFAULT BZERO_Z10
# define HAVE_MEMSET_Z900_G5 0
# define HAVE_MEMSET_Z10 1
# define HAVE_MEMSET_Z196 HAVE_MEMSET_IFUNC
#else
# define MEMSET_DEFAULT MEMSET_Z900_G5
-# define BZERO_DEFAULT BZERO_Z900_G5
# define HAVE_MEMSET_Z900_G5 1
# define HAVE_MEMSET_Z10 HAVE_MEMSET_IFUNC
# define HAVE_MEMSET_Z196 HAVE_MEMSET_IFUNC
@@ -51,24 +48,18 @@
#if HAVE_MEMSET_Z900_G5
# define MEMSET_Z900_G5 __memset_default
-# define BZERO_Z900_G5 __bzero_default
#else
# define MEMSET_Z900_G5 NULL
-# define BZERO_Z900_G5 NULL
#endif
#if HAVE_MEMSET_Z10
# define MEMSET_Z10 __memset_z10
-# define BZERO_Z10 __bzero_z10
#else
# define MEMSET_Z10 NULL
-# define BZERO_Z10 NULL
#endif
#if HAVE_MEMSET_Z196
# define MEMSET_Z196 __memset_z196
-# define BZERO_Z196 __bzero_z196
#else
# define MEMSET_Z196 NULL
-# define BZERO_Z196 NULL
#endif
diff --git a/sysdeps/s390/memset-z900.S b/sysdeps/s390/memset-z900.S
index d454743f75..7adb466bb1 100644
--- a/sysdeps/s390/memset-z900.S
+++ b/sysdeps/s390/memset-z900.S
@@ -24,11 +24,7 @@
/* INPUT PARAMETERS - MEMSET
%r2 = address of memory area
%r3 = byte to fill memory with
- %r4 = number of bytes to fill.
-
- INPUT PARAMETERS - BZERO
- %r2 = address of memory area
- %r3 = number of bytes to fill. */
+ %r4 = number of bytes to fill. */
.text
@@ -47,12 +43,6 @@
# define BRCTG brct
# endif /* ! defined __s390x__ */
-ENTRY(BZERO_Z900_G5)
- LGR %r4,%r3
- xr %r3,%r3
- j .L_Z900_G5_start
-END(BZERO_Z900_G5)
-
ENTRY(MEMSET_Z900_G5)
.L_Z900_G5_start:
#if defined __s390x__
@@ -100,14 +90,6 @@ END(MEMSET_Z900_G5)
#endif /* HAVE_MEMSET_Z900_G5 */
#if HAVE_MEMSET_Z10
-ENTRY(BZERO_Z10)
- .machine "z10"
- .machinemode "zarch_nohighgprs"
- lgr %r4,%r3
- xr %r3,%r3
- j .L_Z10_start
-END(BZERO_Z10)
-
ENTRY(MEMSET_Z10)
.L_Z10_start:
.machine "z10"
@@ -141,14 +123,6 @@ END(MEMSET_Z10)
#endif /* HAVE_MEMSET_Z10 */
#if HAVE_MEMSET_Z196
-ENTRY(BZERO_Z196)
- .machine "z196"
- .machinemode "zarch_nohighgprs"
- lgr %r4,%r3
- xr %r3,%r3
- j .L_Z196_start
-END(BZERO_Z196)
-
ENTRY(MEMSET_Z196)
.L_Z196_start:
.machine "z196"
@@ -204,10 +178,6 @@ END(__memset_mvcle)
/* If we don't use ifunc, define an alias for memset here.
Otherwise see sysdeps/s390/memset.c. */
strong_alias (MEMSET_DEFAULT, memset)
-/* Same for bzero. If ifunc is used, see
- sysdeps/s390/bzero.c. */
-strong_alias (BZERO_DEFAULT, __bzero)
-weak_alias (__bzero, bzero)
#endif
#if defined SHARED && IS_IN (libc)
diff --git a/sysdeps/s390/multiarch/ifunc-impl-list.c b/sysdeps/s390/multiarch/ifunc-impl-list.c
index 29598c2a6e..c1902b2c26 100644
--- a/sysdeps/s390/multiarch/ifunc-impl-list.c
+++ b/sysdeps/s390/multiarch/ifunc-impl-list.c
@@ -102,21 +102,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
# endif
# if HAVE_MEMSET_Z900_G5
IFUNC_IMPL_ADD (array, i, memset, 1, MEMSET_Z900_G5)
-# endif
- )
-
- /* Note: bzero is implemented in memset. */
- IFUNC_IMPL (i, name, bzero,
-# if HAVE_MEMSET_Z196
- IFUNC_IMPL_ADD (array, i, bzero,
- S390_IS_Z196 (stfle_bits), BZERO_Z196)
-# endif
-# if HAVE_MEMSET_Z10
- IFUNC_IMPL_ADD (array, i, bzero,
- S390_IS_Z10 (stfle_bits), BZERO_Z10)
-# endif
-# if HAVE_MEMSET_Z900_G5
- IFUNC_IMPL_ADD (array, i, bzero, 1, BZERO_Z900_G5)
# endif
)
#endif /* HAVE_MEMSET_IFUNC */
--
2.32.0
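
For reference, the BZERO_* entry points removed above only rearranged the
arguments (length moved into %r4, fill byte in %r3 cleared) and then jumped
into the matching memset body. A minimal C sketch of the same relationship
follows; bzero_via_memset is a hypothetical name used only for illustration
and is not a glibc symbol.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of glibc: zero N bytes starting at S
   by forwarding to memset with a fill byte of 0.  Unlike memset, it
   returns nothing, matching the bzero prototype.  */
static void
bzero_via_memset (void *s, size_t n)
{
  memset (s, 0, n);
}
```

A caller that previously wrote bzero (buf, len) gets the same effect from
memset (buf, 0, len), which is presumably why a dedicated optimized entry
point adds little.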
* [PATCH v2 11/11] i686: Remove bzero optimizations
2022-02-23 14:09 [PATCH v2 00/11] Remove bcopy and bzero optimizations Adhemerval Zanella
` (9 preceding siblings ...)
2022-02-23 14:09 ` [PATCH v2 10/11] s390: Remove " Adhemerval Zanella
@ 2022-02-23 14:09 ` Adhemerval Zanella
2022-02-23 14:13 ` [PATCH v2 00/11] Remove bcopy and " Adhemerval Zanella
11 siblings, 0 replies; 14+ messages in thread
From: Adhemerval Zanella @ 2022-02-23 14:09 UTC (permalink / raw)
To: libc-alpha
The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call.
---
sysdeps/i386/bzero.S | 5 ---
sysdeps/i386/i586/bzero.S | 4 --
sysdeps/i386/i586/memset.S | 16 ++------
sysdeps/i386/i686/bzero.S | 4 --
sysdeps/i386/i686/memset.S | 23 +++---------
sysdeps/i386/i686/multiarch/Makefile | 6 +--
sysdeps/i386/i686/multiarch/bzero-ia32.S | 37 -------------------
sysdeps/i386/i686/multiarch/bzero-sse2-rep.S | 3 --
sysdeps/i386/i686/multiarch/bzero-sse2.S | 3 --
sysdeps/i386/i686/multiarch/bzero.c | 32 ----------------
sysdeps/i386/i686/multiarch/ifunc-impl-list.c | 8 ----
sysdeps/i386/i686/multiarch/memset-sse2-rep.S | 24 +++---------
sysdeps/i386/i686/multiarch/memset-sse2.S | 24 +++---------
sysdeps/i386/memset.S | 14 +------
14 files changed, 22 insertions(+), 181 deletions(-)
delete mode 100644 sysdeps/i386/bzero.S
delete mode 100644 sysdeps/i386/i586/bzero.S
delete mode 100644 sysdeps/i386/i686/bzero.S
delete mode 100644 sysdeps/i386/i686/multiarch/bzero-ia32.S
delete mode 100644 sysdeps/i386/i686/multiarch/bzero-sse2-rep.S
delete mode 100644 sysdeps/i386/i686/multiarch/bzero-sse2.S
delete mode 100644 sysdeps/i386/i686/multiarch/bzero.c
diff --git a/sysdeps/i386/bzero.S b/sysdeps/i386/bzero.S
deleted file mode 100644
index c8dd47b4da..0000000000
--- a/sysdeps/i386/bzero.S
+++ /dev/null
@@ -1,5 +0,0 @@
-#define USE_AS_BZERO
-#define memset __bzero
-#include "memset.S"
-
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/i386/i586/bzero.S b/sysdeps/i386/i586/bzero.S
deleted file mode 100644
index 2a106719a4..0000000000
--- a/sysdeps/i386/i586/bzero.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_BZERO
-#define memset __bzero
-#include <sysdeps/i386/i586/memset.S>
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/i386/i586/memset.S b/sysdeps/i386/i586/memset.S
index ae09c3b40a..672af41398 100644
--- a/sysdeps/i386/i586/memset.S
+++ b/sysdeps/i386/i586/memset.S
@@ -23,15 +23,11 @@
#define PARMS 4+4 /* space for 1 saved reg */
#define RTN PARMS
#define DEST RTN
-#ifdef USE_AS_BZERO
-# define LEN DEST+4
-#else
-# define CHR DEST+4
-# define LEN CHR+4
-#endif
+#define CHR DEST+4
+#define LEN CHR+4
.text
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
ENTRY (__memset_chk)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -46,15 +42,11 @@ ENTRY (memset)
movl DEST(%esp), %edi
cfi_rel_offset (edi, 0)
movl LEN(%esp), %edx
-#ifdef USE_AS_BZERO
- xorl %eax, %eax /* we fill with 0 */
-#else
movb CHR(%esp), %al
movb %al, %ah
movl %eax, %ecx
shll $16, %eax
movw %cx, %ax
-#endif
cld
/* If less than 36 bytes to write, skip tricky code (it wouldn't work). */
@@ -100,10 +92,8 @@ L(2): shrl $2, %ecx /* convert byte count to longword count */
rep
stosb
-#ifndef USE_AS_BZERO
/* Load result (only if used as memset). */
movl DEST(%esp), %eax /* start address of destination is result */
-#endif
popl %edi
cfi_adjust_cfa_offset (-4)
cfi_restore (edi)
diff --git a/sysdeps/i386/i686/bzero.S b/sysdeps/i386/i686/bzero.S
deleted file mode 100644
index c7898f18e0..0000000000
--- a/sysdeps/i386/i686/bzero.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_BZERO
-#define memset __bzero
-#include <sysdeps/i386/i686/memset.S>
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/i386/i686/memset.S b/sysdeps/i386/i686/memset.S
index fd5b26aeae..3cb86c016d 100644
--- a/sysdeps/i386/i686/memset.S
+++ b/sysdeps/i386/i686/memset.S
@@ -21,18 +21,13 @@
#include "asm-syntax.h"
#define PARMS 4+4 /* space for 1 saved reg */
-#ifdef USE_AS_BZERO
-# define DEST PARMS
-# define LEN DEST+4
-#else
-# define RTN PARMS
-# define DEST RTN
-# define CHR DEST+4
-# define LEN CHR+4
-#endif
+#define RTN PARMS
+#define DEST RTN
+#define CHR DEST+4
+#define LEN CHR+4
.text
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
ENTRY_CHK (__memset_chk)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -46,11 +41,7 @@ ENTRY (memset)
cfi_adjust_cfa_offset (4)
movl DEST(%esp), %edx
movl LEN(%esp), %ecx
-#ifdef USE_AS_BZERO
- xorl %eax, %eax /* fill with 0 */
-#else
movzbl CHR(%esp), %eax
-#endif
jecxz 1f
movl %edx, %edi
cfi_rel_offset (edi, 0)
@@ -70,9 +61,7 @@ ENTRY (memset)
2: movl %ecx, %edx
shrl $2, %ecx
andl $3, %edx
-#ifndef USE_AS_BZERO
imul $0x01010101, %eax
-#endif
rep
stosl
movl %edx, %ecx
@@ -80,9 +69,7 @@ ENTRY (memset)
stosb
1:
-#ifndef USE_AS_BZERO
movl DEST(%esp), %eax /* start address of destination is result */
-#endif
popl %edi
cfi_adjust_cfa_offset (-4)
cfi_restore (edi)
diff --git a/sysdeps/i386/i686/multiarch/Makefile b/sysdeps/i386/i686/multiarch/Makefile
index 02fa02658e..9fe5ea8639 100644
--- a/sysdeps/i386/i686/multiarch/Makefile
+++ b/sysdeps/i386/i686/multiarch/Makefile
@@ -1,9 +1,9 @@
ifeq ($(subdir),string)
gen-as-const-headers += locale-defines.sym
-sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
+sysdep_routines += memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
memmove-ssse3 memcpy-ssse3-rep mempcpy-ssse3-rep \
memmove-ssse3-rep \
- memset-sse2-rep bzero-sse2-rep strcmp-ssse3 \
+ memset-sse2-rep strcmp-ssse3 \
strcmp-sse4 strncmp-c strncmp-ssse3 strncmp-sse4 \
memcmp-ssse3 memcmp-sse4 varshift \
strlen-sse2 strlen-sse2-bsf strncpy-c strcpy-ssse3 \
@@ -21,7 +21,7 @@ sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
memcpy-sse2-unaligned \
mempcpy-sse2-unaligned memmove-sse2-unaligned \
strcspn-c strpbrk-c strspn-c \
- bzero-ia32 rawmemchr-ia32 \
+ rawmemchr-ia32 \
memchr-ia32 memcmp-ia32 memcpy-ia32 memmove-ia32 \
mempcpy-ia32 memset-ia32 strcat-ia32 strchr-ia32 \
strrchr-ia32 strcpy-ia32 strcmp-ia32 strcspn-ia32 \
diff --git a/sysdeps/i386/i686/multiarch/bzero-ia32.S b/sysdeps/i386/i686/multiarch/bzero-ia32.S
deleted file mode 100644
index 96afe9bad1..0000000000
--- a/sysdeps/i386/i686/multiarch/bzero-ia32.S
+++ /dev/null
@@ -1,37 +0,0 @@
-/* bzero optimized for i686.
- Copyright (C) 2017-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <sysdep.h>
-
-#if IS_IN (libc)
-# define __bzero __bzero_ia32
-
-# ifdef SHARED
-# undef libc_hidden_builtin_def
-/* IFUNC doesn't work with the hidden functions in shared library since
- they will be called without setting up EBX needed for PLT which is
- used by IFUNC. */
-# define libc_hidden_builtin_def(name) \
- .globl __GI___bzero; __GI___bzero = __bzero
-# endif
-
-# undef weak_alias
-# define weak_alias(original, alias)
-
-# include <sysdeps/i386/i686/bzero.S>
-#endif
diff --git a/sysdeps/i386/i686/multiarch/bzero-sse2-rep.S b/sysdeps/i386/i686/multiarch/bzero-sse2-rep.S
deleted file mode 100644
index 507b288bb3..0000000000
--- a/sysdeps/i386/i686/multiarch/bzero-sse2-rep.S
+++ /dev/null
@@ -1,3 +0,0 @@
-#define USE_AS_BZERO
-#define __memset_sse2_rep __bzero_sse2_rep
-#include "memset-sse2-rep.S"
diff --git a/sysdeps/i386/i686/multiarch/bzero-sse2.S b/sysdeps/i386/i686/multiarch/bzero-sse2.S
deleted file mode 100644
index 8d04512e4e..0000000000
--- a/sysdeps/i386/i686/multiarch/bzero-sse2.S
+++ /dev/null
@@ -1,3 +0,0 @@
-#define USE_AS_BZERO
-#define __memset_sse2 __bzero_sse2
-#include "memset-sse2.S"
diff --git a/sysdeps/i386/i686/multiarch/bzero.c b/sysdeps/i386/i686/multiarch/bzero.c
deleted file mode 100644
index 7fd0ddd576..0000000000
--- a/sysdeps/i386/i686/multiarch/bzero.c
+++ /dev/null
@@ -1,32 +0,0 @@
-/* Multiple versions of bzero.
- All versions must be listed in ifunc-impl-list.c.
- Copyright (C) 2017-2022 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-/* Define multiple versions only for the definition in libc. */
-#if IS_IN (libc)
-# define bzero __redirect_bzero
-# include <string.h>
-# undef bzero
-
-# define SYMBOL_NAME bzero
-# include "ifunc-memset.h"
-
-libc_ifunc_redirected (__redirect_bzero, __bzero, IFUNC_SELECTOR ());
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
index 5c7a42dc97..c014f52bf9 100644
--- a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
+++ b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
@@ -36,14 +36,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
size_t i = 0;
- /* Support sysdeps/i386/i686/multiarch/bzero.S. */
- IFUNC_IMPL (i, name, bzero,
- IFUNC_IMPL_ADD (array, i, bzero, CPU_FEATURE_USABLE (SSE2),
- __bzero_sse2_rep)
- IFUNC_IMPL_ADD (array, i, bzero, CPU_FEATURE_USABLE (SSE2),
- __bzero_sse2)
- IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ia32))
-
/* Support sysdeps/i386/i686/multiarch/memchr.S. */
IFUNC_IMPL (i, name, memchr,
IFUNC_IMPL_ADD (array, i, memchr, CPU_FEATURE_USABLE (SSE2),
diff --git a/sysdeps/i386/i686/multiarch/memset-sse2-rep.S b/sysdeps/i386/i686/multiarch/memset-sse2-rep.S
index 37a10575e7..28df7836e0 100644
--- a/sysdeps/i386/i686/multiarch/memset-sse2-rep.S
+++ b/sysdeps/i386/i686/multiarch/memset-sse2-rep.S
@@ -32,16 +32,10 @@
#define PUSH(REG) pushl REG; CFI_PUSH (REG)
#define POP(REG) popl REG; CFI_POP (REG)
-#ifdef USE_AS_BZERO
-# define DEST PARMS
-# define LEN DEST+4
-# define SETRTNVAL
-#else
-# define DEST PARMS
-# define CHR DEST+4
-# define LEN CHR+4
-# define SETRTNVAL movl DEST(%esp), %eax
-#endif
+#define DEST PARMS
+#define CHR DEST+4
+#define LEN CHR+4
+#define SETRTNVAL movl DEST(%esp), %eax
#ifdef PIC
# define ENTRANCE PUSH (%ebx);
@@ -78,7 +72,7 @@
#endif
.section .text.sse2,"ax",@progbits
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
ENTRY (__memset_chk_sse2_rep)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -89,16 +83,12 @@ ENTRY (__memset_sse2_rep)
ENTRANCE
movl LEN(%esp), %ecx
-#ifdef USE_AS_BZERO
- xor %eax, %eax
-#else
movzbl CHR(%esp), %eax
movb %al, %ah
/* Fill the whole EAX with pattern. */
movl %eax, %edx
shl $16, %eax
or %edx, %eax
-#endif
movl DEST(%esp), %edx
cmp $32, %ecx
jae L(32bytesormore)
@@ -228,12 +218,8 @@ L(write_3bytes):
/* ECX > 32 and EDX is 4 byte aligned. */
L(32bytesormore):
/* Fill xmm0 with the pattern. */
-#ifdef USE_AS_BZERO
- pxor %xmm0, %xmm0
-#else
movd %eax, %xmm0
pshufd $0, %xmm0, %xmm0
-#endif
testl $0xf, %edx
jz L(aligned_16)
/* ECX > 32 and EDX is not 16 byte aligned. */
diff --git a/sysdeps/i386/i686/multiarch/memset-sse2.S b/sysdeps/i386/i686/multiarch/memset-sse2.S
index 455519c7ac..4e8414fd51 100644
--- a/sysdeps/i386/i686/multiarch/memset-sse2.S
+++ b/sysdeps/i386/i686/multiarch/memset-sse2.S
@@ -32,16 +32,10 @@
#define PUSH(REG) pushl REG; CFI_PUSH (REG)
#define POP(REG) popl REG; CFI_POP (REG)
-#ifdef USE_AS_BZERO
-# define DEST PARMS
-# define LEN DEST+4
-# define SETRTNVAL
-#else
-# define DEST PARMS
-# define CHR DEST+4
-# define LEN CHR+4
-# define SETRTNVAL movl DEST(%esp), %eax
-#endif
+#define DEST PARMS
+#define CHR DEST+4
+#define LEN CHR+4
+#define SETRTNVAL movl DEST(%esp), %eax
#ifdef PIC
# define ENTRANCE PUSH (%ebx);
@@ -78,7 +72,7 @@
#endif
.section .text.sse2,"ax",@progbits
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
ENTRY (__memset_chk_sse2)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -89,16 +83,12 @@ ENTRY (__memset_sse2)
ENTRANCE
movl LEN(%esp), %ecx
-#ifdef USE_AS_BZERO
- xor %eax, %eax
-#else
movzbl CHR(%esp), %eax
movb %al, %ah
/* Fill the whole EAX with pattern. */
movl %eax, %edx
shl $16, %eax
or %edx, %eax
-#endif
movl DEST(%esp), %edx
cmp $32, %ecx
jae L(32bytesormore)
@@ -228,12 +218,8 @@ L(write_3bytes):
/* ECX > 32 and EDX is 4 byte aligned. */
L(32bytesormore):
/* Fill xmm0 with the pattern. */
-#ifdef USE_AS_BZERO
- pxor %xmm0, %xmm0
-#else
movd %eax, %xmm0
pshufd $0, %xmm0, %xmm0
-#endif
testl $0xf, %edx
jz L(aligned_16)
/* ECX > 32 and EDX is not 16 byte aligned. */
diff --git a/sysdeps/i386/memset.S b/sysdeps/i386/memset.S
index f470511b64..db2753eb2f 100644
--- a/sysdeps/i386/memset.S
+++ b/sysdeps/i386/memset.S
@@ -30,15 +30,11 @@
#define POP(REG) popl REG; CFI_POP (REG)
#define STR1 8
-#ifdef USE_AS_BZERO
-#define N STR1+4
-#else
#define STR2 STR1+4
#define N STR2+4
-#endif
.text
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
ENTRY (__memset_chk)
movl 12(%esp), %eax
cmpl %eax, 16(%esp)
@@ -49,20 +45,12 @@ ENTRY (memset)
PUSH (%edi)
movl N(%esp), %ecx
movl STR1(%esp), %edi
-#ifdef USE_AS_BZERO
- xor %eax, %eax
-#else
movzbl STR2(%esp), %eax
mov %edi, %edx
-#endif
rep stosb
-#ifndef USE_AS_BZERO
mov %edx, %eax
-#endif
POP (%edi)
ret
END (memset)
-#ifndef USE_AS_BZERO
libc_hidden_builtin_def (memset)
-#endif
--
2.32.0
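
With the USE_AS_BZERO branches gone, every remaining path in these files
builds the fill pattern from the CHR argument, either with
imul $0x01010101 (i686) or with a copy into %ah followed by a 16-bit shift
and merge (i586 and the SSE2 variants). A minimal C sketch of both ways of
replicating a byte across a 32-bit word follows; the function names are
hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

/* Replicate C into all four bytes of a 32-bit word by multiplication,
   mirroring imul $0x01010101, %eax.  */
static uint32_t
broadcast_mul (unsigned char c)
{
  return (uint32_t) c * UINT32_C (0x01010101);
}

/* Same result with shifts and ors: copy the low byte into the second
   byte, then copy the low 16 bits into the high 16 bits.  */
static uint32_t
broadcast_shift (unsigned char c)
{
  uint32_t w = c;
  w |= w << 8;
  w |= w << 16;
  return w;
}
```

For a fill byte of 0 both functions return 0, so the general memset path
covers the bzero case without a separate xor shortcut.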
* Re: [PATCH v2 00/11] Remove bcopy and bzero optimizations
2022-02-23 14:09 [PATCH v2 00/11] Remove bcopy and bzero optimizations Adhemerval Zanella
` (10 preceding siblings ...)
2022-02-23 14:09 ` [PATCH v2 11/11] i686: " Adhemerval Zanella
@ 2022-02-23 14:13 ` Adhemerval Zanella
11 siblings, 0 replies; 14+ messages in thread
From: Adhemerval Zanella @ 2022-02-23 14:13 UTC (permalink / raw)
To: libc-alpha
I intend to commit this patch set shortly. I removed the x86_64 bzero part
because we are still discussing the __memzeroset addition, but I think the
idea is to eventually remove the x86_64 optimization as well.
On 23/02/2022 11:09, Adhemerval Zanella wrote:
> Both symbols are marked as legacy in POSIX.1-2001 and removed on
> POSIX.1-2008, although the prototypes are defined for _GNU_SOURCE
> or _DEFAULT_SOURCE.
>
> Most architectures just route bcopy/bzero to internal memmove/memset
> implementation, however some do implement iFUNC variants when memset
> or memmove are also provided through iFUNC.
>
> However, gcc already replaces bcopy with memmove and bzero with memset
> in its default configuration (to actually get a bstring libc call, the code
> must omit the string.h inclusion and be built with -fno-builtin), so
> it is highly unlikely that programs are actually calling the libc bcopy or
> bzero symbols.
>
> On a recent Linux distro (Ubuntu 21.04), I see only 1 'bcmp' call
> (which is already aliased to memcmp):
>
> $ cat count_bstring.sh
> #!/bin/bash
>
> files=`IFS=':';for i in $PATH; do test -d "$i" && find "$i" -maxdepth 1 -executable -type f; done`
> total=0
> for file in $files; do
> symbols=`objdump -R $file 2>&1`
> if [ $? -eq 0 ]; then
> ncalls=`echo $symbols | grep -w $1 | wc -l`
> ((total=total+ncalls))
> if [ $ncalls -gt 0 ]; then
> echo "$file: $ncalls"
> fi
> fi
> done
> echo "TOTAL=$total"
> $ ./count_bstring.sh bcmp
> /usr/bin/rg: 1
> TOTAL=1
> $ ./count_bstring.sh bcopy
> TOTAL=0
> $ ./count_bstring.sh bzero
> TOTAL=0
>
> So there is no point in keeping such optimizations.
>
> v2: Fix ia64 extra __bzero symbol, cleanup more i686 bzero definitions,
> remove x86_64 bzero part.
>
> Adhemerval Zanella (11):
> ia64: Remove bcopy
> powerpc: Remove bcopy optimizations
> i386: Remove bcopy optimizations
> x86_64: Remove bcopy optimizations
> alpha: Remove bzero optimization
> ia64: Remove bzero optimization
> sparc: Remove bzero optimization
> powerpc: Remove powerpc32 bzero optimizations
> powerpc: Remove powerpc64 bzero optimizations
> s390: Remove bzero optimizations
> i686: Remove bzero optimizations
>
> string/bzero.c | 4 +-
> sysdeps/alpha/bzero.S | 109 ------
> sysdeps/i386/bcopy.S | 4 -
> sysdeps/i386/bzero.S | 5 -
> sysdeps/i386/i586/bzero.S | 4 -
> sysdeps/i386/i586/memset.S | 16 +-
> sysdeps/i386/i686/bcopy.S | 3 -
> sysdeps/i386/i686/bzero.S | 4 -
> sysdeps/i386/i686/memmove.S | 22 +-
> sysdeps/i386/i686/memset.S | 23 +-
> sysdeps/i386/i686/multiarch/Makefile | 10 +-
> sysdeps/i386/i686/multiarch/bcopy-ia32.S | 20 --
> .../i686/multiarch/bcopy-sse2-unaligned.S | 4 -
> sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S | 4 -
> sysdeps/i386/i686/multiarch/bcopy-ssse3.S | 4 -
> sysdeps/i386/i686/multiarch/bcopy.c | 30 --
> sysdeps/i386/i686/multiarch/bzero-ia32.S | 37 ---
> sysdeps/i386/i686/multiarch/bzero-sse2-rep.S | 3 -
> sysdeps/i386/i686/multiarch/bzero-sse2.S | 3 -
> sysdeps/i386/i686/multiarch/bzero.c | 32 --
> sysdeps/i386/i686/multiarch/ifunc-impl-list.c | 18 -
> .../i686/multiarch/memcpy-sse2-unaligned.S | 16 +-
> .../i386/i686/multiarch/memcpy-ssse3-rep.S | 64 ++--
> sysdeps/i386/i686/multiarch/memcpy-ssse3.S | 202 ++++--------
> sysdeps/i386/i686/multiarch/memset-sse2-rep.S | 24 +-
> sysdeps/i386/i686/multiarch/memset-sse2.S | 24 +-
> sysdeps/i386/memcpy.S | 16 +-
> sysdeps/i386/memset.S | 14 +-
> sysdeps/ia64/bcopy.S | 10 -
> sysdeps/ia64/bzero.S | 312 ------------------
> sysdeps/ia64/bzero.c | 3 +
> sysdeps/powerpc/powerpc32/bzero.S | 27 --
> .../powerpc32/power4/multiarch/Makefile | 4 +-
> .../powerpc32/power4/multiarch/bzero-power6.S | 25 --
> .../powerpc32/power4/multiarch/bzero-power7.S | 25 --
> .../powerpc32/power4/multiarch/bzero-ppc32.S | 34 --
> .../powerpc32/power4/multiarch/bzero.c | 37 ---
> .../power4/multiarch/ifunc-impl-list.c | 8 -
> sysdeps/powerpc/powerpc64/bzero.S | 20 --
> .../powerpc/powerpc64/le/power10/memmove.S | 13 -
> sysdeps/powerpc/powerpc64/le/power10/memset.S | 12 -
> sysdeps/powerpc/powerpc64/memset.S | 13 -
> sysdeps/powerpc/powerpc64/multiarch/Makefile | 2 +-
> .../powerpc/powerpc64/multiarch/bcopy-ppc64.c | 27 --
> sysdeps/powerpc/powerpc64/multiarch/bcopy.c | 38 ---
> sysdeps/powerpc/powerpc64/multiarch/bzero.c | 54 ---
> .../powerpc64/multiarch/ifunc-impl-list.c | 34 --
> .../powerpc64/multiarch/memmove-power10.S | 3 -
> .../powerpc64/multiarch/memmove-power7.S | 3 -
> .../powerpc64/multiarch/memset-power10.S | 3 -
> .../powerpc64/multiarch/memset-power4.S | 3 -
> .../powerpc64/multiarch/memset-power6.S | 3 -
> .../powerpc64/multiarch/memset-power7.S | 2 -
> .../powerpc64/multiarch/memset-power8.S | 3 -
> .../powerpc64/multiarch/memset-ppc64.S | 16 +-
> sysdeps/powerpc/powerpc64/power4/memset.S | 12 -
> sysdeps/powerpc/powerpc64/power6/memset.S | 12 -
> sysdeps/powerpc/powerpc64/power7/bcopy.c | 1 -
> sysdeps/powerpc/powerpc64/power7/memmove.S | 14 -
> sysdeps/powerpc/powerpc64/power7/memset.S | 12 -
> sysdeps/powerpc/powerpc64/power8/memset.S | 12 -
> sysdeps/s390/Makefile | 2 +-
> sysdeps/s390/bzero.c | 47 ---
> sysdeps/s390/ifunc-memset.h | 9 -
> sysdeps/s390/memset-z900.S | 32 +-
> sysdeps/s390/multiarch/ifunc-impl-list.c | 15 -
> sysdeps/sparc/sparc32/bzero.c | 1 -
> sysdeps/sparc/sparc32/memset.S | 37 +--
> sysdeps/sparc/sparc32/sparcv9/bzero.c | 1 -
> .../sparc/sparc32/sparcv9/multiarch/bzero.c | 1 -
> .../sparc32/sparcv9/multiarch/memset-ultra1.S | 1 -
> sysdeps/sparc/sparc64/bzero.c | 1 -
> sysdeps/sparc/sparc64/memset.S | 30 +-
> sysdeps/sparc/sparc64/multiarch/bzero.c | 33 --
> .../sparc/sparc64/multiarch/ifunc-impl-list.c | 9 -
> .../sparc/sparc64/multiarch/ifunc-memset.h | 2 +-
> .../sparc/sparc64/multiarch/memset-niagara1.S | 5 +-
> .../sparc/sparc64/multiarch/memset-niagara4.S | 6 +-
> .../sparc/sparc64/multiarch/memset-niagara7.S | 7 -
> .../sparc/sparc64/multiarch/memset-ultra1.S | 1 -
> sysdeps/x86_64/multiarch/bcopy.S | 7 -
> 81 files changed, 162 insertions(+), 1601 deletions(-)
> delete mode 100644 sysdeps/alpha/bzero.S
> delete mode 100644 sysdeps/i386/bcopy.S
> delete mode 100644 sysdeps/i386/bzero.S
> delete mode 100644 sysdeps/i386/i586/bzero.S
> delete mode 100644 sysdeps/i386/i686/bcopy.S
> delete mode 100644 sysdeps/i386/i686/bzero.S
> delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ia32.S
> delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S
> delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S
> delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ssse3.S
> delete mode 100644 sysdeps/i386/i686/multiarch/bcopy.c
> delete mode 100644 sysdeps/i386/i686/multiarch/bzero-ia32.S
> delete mode 100644 sysdeps/i386/i686/multiarch/bzero-sse2-rep.S
> delete mode 100644 sysdeps/i386/i686/multiarch/bzero-sse2.S
> delete mode 100644 sysdeps/i386/i686/multiarch/bzero.c
> delete mode 100644 sysdeps/ia64/bcopy.S
> delete mode 100644 sysdeps/ia64/bzero.S
> create mode 100644 sysdeps/ia64/bzero.c
> delete mode 100644 sysdeps/powerpc/powerpc32/bzero.S
> delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S
> delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S
> delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S
> delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c
> delete mode 100644 sysdeps/powerpc/powerpc64/bzero.S
> delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c
> delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bcopy.c
> delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bzero.c
> delete mode 100644 sysdeps/powerpc/powerpc64/power7/bcopy.c
> delete mode 100644 sysdeps/s390/bzero.c
> delete mode 100644 sysdeps/sparc/sparc32/bzero.c
> delete mode 100644 sysdeps/sparc/sparc32/sparcv9/bzero.c
> delete mode 100644 sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c
> delete mode 100644 sysdeps/sparc/sparc64/bzero.c
> delete mode 100644 sysdeps/sparc/sparc64/multiarch/bzero.c
> delete mode 100644 sysdeps/x86_64/multiarch/bcopy.S
>
* Re: [PATCH v2 04/11] x86_64: Remove bcopy optimizations
2022-02-23 14:09 ` [PATCH v2 04/11] x86_64: " Adhemerval Zanella
@ 2022-05-12 19:28 ` Sunil Pandey
0 siblings, 0 replies; 14+ messages in thread
From: Sunil Pandey @ 2022-05-12 19:28 UTC (permalink / raw)
To: Adhemerval Zanella, Libc-stable Mailing List; +Cc: GNU C Library
On Wed, Feb 23, 2022 at 6:12 AM Adhemerval Zanella via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> The symbol is not present in the current POSIX specification, and the
> compiler already generates a memmove call.
> ---
> sysdeps/x86_64/multiarch/bcopy.S | 7 -------
> 1 file changed, 7 deletions(-)
> delete mode 100644 sysdeps/x86_64/multiarch/bcopy.S
>
> diff --git a/sysdeps/x86_64/multiarch/bcopy.S b/sysdeps/x86_64/multiarch/bcopy.S
> deleted file mode 100644
> index 639f02bde3..0000000000
> --- a/sysdeps/x86_64/multiarch/bcopy.S
> +++ /dev/null
> @@ -1,7 +0,0 @@
> -#include <sysdep.h>
> -
> - .text
> -ENTRY(bcopy)
> - xchg %rdi, %rsi
> - jmp __libc_memmove /* Branch to IFUNC memmove. */
> -END(bcopy)
> --
> 2.32.0
>
I would like to backport this patch to release branches.
Any comments or objections?
--Sunil
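
For readers skimming the quoted patch: the removed stub only swapped the
first two argument registers before jumping to memmove, because bcopy takes
(src, dest, n) while memmove takes (dest, src, n). A minimal C sketch of
that relationship follows; bcopy_via_memmove is a hypothetical name used
only for illustration and is not a glibc symbol.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of glibc: copy N bytes from SRC to
   DEST with bcopy's argument order.  memmove is used rather than
   memcpy because bcopy must handle overlapping regions.  */
static void
bcopy_via_memmove (const void *src, void *dest, size_t n)
{
  memmove (dest, src, n);
}
```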
Thread overview: 14+ messages
2022-02-23 14:09 [PATCH v2 00/11] Remove bcopy and bzero optimizations Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 01/11] ia64: Remove bcopy Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 02/11] powerpc: Remove bcopy optimizations Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 03/11] i386: " Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 04/11] x86_64: " Adhemerval Zanella
2022-05-12 19:28 ` Sunil Pandey
2022-02-23 14:09 ` [PATCH v2 05/11] alpha: Remove bzero optimization Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 06/11] ia64: " Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 07/11] sparc: " Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 08/11] powerpc: Remove powerpc32 bzero optimizations Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 09/11] powerpc: Remove powerpc64 " Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 10/11] s390: Remove " Adhemerval Zanella
2022-02-23 14:09 ` [PATCH v2 11/11] i686: " Adhemerval Zanella
2022-02-23 14:13 ` [PATCH v2 00/11] Remove bcopy and " Adhemerval Zanella