public inbox for libc-alpha@sourceware.org
* [PATCH 00/12] Remove bcopy and bzero optimizations
@ 2022-02-10 19:58 Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 01/12] ia64: Remove bcopy Adhemerval Zanella
                   ` (13 more replies)
  0 siblings, 14 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

Both symbols are marked as legacy in POSIX.1-2001 and were removed in
POSIX.1-2008, although the prototypes are still declared for _GNU_SOURCE
or _DEFAULT_SOURCE.

Most architectures just route bcopy/bzero to the internal memmove/memset
implementation; however, some implement IFUNC variants when memset
or memmove is also provided through IFUNC.
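
As a rough illustration of why a separate bzero selector is redundant, the
sketch below uses a plain function pointer as a simplified stand-in for
IFUNC resolution (real IFUNC symbols are resolved by the dynamic loader,
not by user code).  All names here (memset_generic, memset_fast,
select_memset, bzero_via) and the has_fast_path flag are hypothetical and
do not exist in glibc:

```c
#include <stddef.h>
#include <string.h>

typedef void *(*memset_fn) (void *, int, size_t);

/* Hypothetical baseline variant: byte-at-a-time fill.  */
static void *
memset_generic (void *s, int c, size_t n)
{
  unsigned char *p = s;
  while (n-- > 0)
    *p++ = (unsigned char) c;
  return s;
}

/* Hypothetical "optimized" variant; for the sketch it simply defers
   to the libc memset.  */
static void *
memset_fast (void *s, int c, size_t n)
{
  return memset (s, c, n);
}

/* Stand-in for a resolver: has_fast_path plays the role of a
   hwcap/CPU-feature test (an assumption made for this example).  */
static memset_fn
select_memset (int has_fast_path)
{
  return has_fast_path ? memset_fast : memset_generic;
}

/* bzero needs no selector of its own: it can reuse whichever memset
   variant was chosen, with a zero fill byte.  */
static void
bzero_via (memset_fn impl, void *s, size_t n)
{
  impl (s, 0, n);
}
```

Calling bzero_via (select_memset (1), buf, len) clears buf through the
selected memset variant, which is the routing most architectures already
use.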

However, gcc already replaces bcopy with memmove and bzero with memset
in its default configuration (to actually get a bstring libc call, the
code must omit the string.h inclusion and be built with -fno-builtin),
so it is highly unlikely that programs are actually calling the libc
bcopy or bzero symbols.
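
For reference, a minimal sketch of the mapping between the legacy calls
and their ISO C counterparts.  The wrapper names legacy_bcopy and
legacy_bzero are hypothetical and used only for illustration; they are
not glibc symbols.  Note that bcopy takes its source first, the reverse
of memmove:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: bcopy takes (src, dest, n) while memmove takes
   (dest, src, n).  Both must handle overlapping regions.  */
static void
legacy_bcopy (const void *src, void *dest, size_t n)
{
  memmove (dest, src, n);
}

/* Hypothetical wrapper: bzero is memset with a zero fill byte and no
   return value.  */
static void
legacy_bzero (void *s, size_t n)
{
  memset (s, 0, n);
}
```

For example, legacy_bcopy (buf, buf + 2, 4) shifts the first four bytes
of buf forward by two positions, exactly as memmove (buf + 2, buf, 4)
would.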

On a recent Linux distro (Ubuntu 21.04), I see only 1 'bcmp' call
(which is already aliased to memcmp):

  $ cat count_bstring.sh 
  #!/bin/bash

  files=`IFS=':';for i in $PATH; do test -d "$i" && find "$i" -maxdepth 1 -executable -type f; done`
  total=0
  for file in $files; do
    symbols=`objdump -R "$file" 2>&1`
    if [ $? -eq 0 ]; then
      ncalls=`echo "$symbols" | grep -cw "$1"`
      ((total=total+ncalls))
      if [ $ncalls -gt 0 ]; then
        echo "$file: $ncalls"
      fi
    fi
  done
  echo "TOTAL=$total"
  $ ./count_bstring.sh bcmp
  /usr/bin/rg: 1
  TOTAL=1
  $ ./count_bstring.sh bcopy
  TOTAL=0
  $ ./count_bstring.sh bzero
  TOTAL=0

So there is no point in keeping such optimizations.

Adhemerval Zanella (12):
  ia64: Remove bcopy
  powerpc: Remove bcopy optimizations
  i386: Remove bcopy optimizations
  x86_64: Remove bcopy optimizations
  alpha: Remove bzero optimization
  ia64: Remove bzero optimization
  Remove bzero optimization
  powerpc: Remove powerpc32 bzero optimizations
  powerpc: Remove powerpc64 bzero optimizations
  s390: Remove bzero optimizations
  i686: Remove bzero optimizations
  x86_64: Remove bzero optimizations

 sysdeps/alpha/bzero.S                         | 109 ------
 sysdeps/i386/bcopy.S                          |   4 -
 sysdeps/i386/bzero.S                          |   5 -
 sysdeps/i386/i586/bzero.S                     |   4 -
 sysdeps/i386/i586/memset.S                    |  16 +-
 sysdeps/i386/i686/bcopy.S                     |   3 -
 sysdeps/i386/i686/bzero.S                     |   4 -
 sysdeps/i386/i686/memset.S                    |  23 +-
 sysdeps/i386/i686/multiarch/Makefile          |  10 +-
 sysdeps/i386/i686/multiarch/bcopy-ia32.S      |  20 --
 .../i686/multiarch/bcopy-sse2-unaligned.S     |   4 -
 sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S |   4 -
 sysdeps/i386/i686/multiarch/bcopy-ssse3.S     |   4 -
 sysdeps/i386/i686/multiarch/bcopy.c           |  30 --
 sysdeps/i386/i686/multiarch/bzero-ia32.S      |  37 ---
 sysdeps/i386/i686/multiarch/bzero-sse2-rep.S  |   3 -
 sysdeps/i386/i686/multiarch/bzero-sse2.S      |   3 -
 sysdeps/i386/i686/multiarch/bzero.c           |  32 --
 sysdeps/i386/i686/multiarch/ifunc-impl-list.c |  18 -
 sysdeps/i386/i686/multiarch/memset-sse2-rep.S |  24 +-
 sysdeps/i386/i686/multiarch/memset-sse2.S     |  24 +-
 sysdeps/i386/memset.S                         |  14 +-
 sysdeps/ia64/bcopy.S                          |  10 -
 sysdeps/ia64/bzero.S                          | 312 ------------------
 sysdeps/powerpc/powerpc32/bzero.S             |  27 --
 .../powerpc32/power4/multiarch/Makefile       |   4 +-
 .../powerpc32/power4/multiarch/bzero-power6.S |  25 --
 .../powerpc32/power4/multiarch/bzero-power7.S |  25 --
 .../powerpc32/power4/multiarch/bzero-ppc32.S  |  34 --
 .../powerpc32/power4/multiarch/bzero.c        |  37 ---
 .../power4/multiarch/ifunc-impl-list.c        |   8 -
 sysdeps/powerpc/powerpc64/bzero.S             |  20 --
 .../powerpc/powerpc64/le/power10/memmove.S    |  13 -
 sysdeps/powerpc/powerpc64/le/power10/memset.S |  12 -
 sysdeps/powerpc/powerpc64/memset.S            |  13 -
 sysdeps/powerpc/powerpc64/multiarch/Makefile  |   2 +-
 .../powerpc/powerpc64/multiarch/bcopy-ppc64.c |  27 --
 sysdeps/powerpc/powerpc64/multiarch/bcopy.c   |  38 ---
 sysdeps/powerpc/powerpc64/multiarch/bzero.c   |  54 ---
 .../powerpc64/multiarch/ifunc-impl-list.c     |  34 --
 .../powerpc64/multiarch/memmove-power10.S     |   3 -
 .../powerpc64/multiarch/memmove-power7.S      |   3 -
 .../powerpc64/multiarch/memset-power10.S      |   3 -
 .../powerpc64/multiarch/memset-power4.S       |   3 -
 .../powerpc64/multiarch/memset-power6.S       |   3 -
 .../powerpc64/multiarch/memset-power7.S       |   2 -
 .../powerpc64/multiarch/memset-power8.S       |   3 -
 .../powerpc64/multiarch/memset-ppc64.S        |  16 +-
 sysdeps/powerpc/powerpc64/power4/memset.S     |  12 -
 sysdeps/powerpc/powerpc64/power6/memset.S     |  12 -
 sysdeps/powerpc/powerpc64/power7/bcopy.c      |   1 -
 sysdeps/powerpc/powerpc64/power7/memmove.S    |  14 -
 sysdeps/powerpc/powerpc64/power7/memset.S     |  12 -
 sysdeps/powerpc/powerpc64/power8/memset.S     |  12 -
 sysdeps/s390/Makefile                         |   2 +-
 sysdeps/s390/bzero.c                          |  47 ---
 sysdeps/s390/ifunc-memset.h                   |   9 -
 sysdeps/s390/memset-z900.S                    |  32 +-
 sysdeps/s390/multiarch/ifunc-impl-list.c      |  15 -
 sysdeps/sparc/sparc32/bzero.c                 |   1 -
 sysdeps/sparc/sparc32/memset.S                |  37 +--
 sysdeps/sparc/sparc32/sparcv9/bzero.c         |   1 -
 .../sparc/sparc32/sparcv9/multiarch/bzero.c   |   1 -
 .../sparc32/sparcv9/multiarch/memset-ultra1.S |   1 -
 sysdeps/sparc/sparc64/bzero.c                 |   1 -
 sysdeps/sparc/sparc64/memset.S                |  30 +-
 sysdeps/sparc/sparc64/multiarch/bzero.c       |  33 --
 .../sparc/sparc64/multiarch/ifunc-impl-list.c |   9 -
 .../sparc/sparc64/multiarch/ifunc-memset.h    |   2 +-
 .../sparc/sparc64/multiarch/memset-niagara1.S |   5 +-
 .../sparc/sparc64/multiarch/memset-niagara4.S |   6 +-
 .../sparc/sparc64/multiarch/memset-niagara7.S |   7 -
 .../sparc/sparc64/multiarch/memset-ultra1.S   |   1 -
 sysdeps/x86_64/bzero.S                        |   1 -
 sysdeps/x86_64/memset.S                       |  10 +-
 sysdeps/x86_64/multiarch/Makefile             |   1 -
 sysdeps/x86_64/multiarch/bcopy.S              |   7 -
 sysdeps/x86_64/multiarch/bzero.c              | 106 ------
 sysdeps/x86_64/multiarch/ifunc-impl-list.c    |  42 ---
 .../memset-avx2-unaligned-erms-rtm.S          |   1 -
 .../multiarch/memset-avx2-unaligned-erms.S    |   6 -
 .../multiarch/memset-avx512-unaligned-erms.S  |   3 -
 .../multiarch/memset-evex-unaligned-erms.S    |   3 -
 .../multiarch/memset-sse2-unaligned-erms.S    |   5 -
 .../multiarch/memset-vec-unaligned-erms.S     |  56 +---
 85 files changed, 62 insertions(+), 1608 deletions(-)
 delete mode 100644 sysdeps/alpha/bzero.S
 delete mode 100644 sysdeps/i386/bcopy.S
 delete mode 100644 sysdeps/i386/bzero.S
 delete mode 100644 sysdeps/i386/i586/bzero.S
 delete mode 100644 sysdeps/i386/i686/bcopy.S
 delete mode 100644 sysdeps/i386/i686/bzero.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ia32.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ssse3.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bcopy.c
 delete mode 100644 sysdeps/i386/i686/multiarch/bzero-ia32.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bzero-sse2-rep.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bzero-sse2.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bzero.c
 delete mode 100644 sysdeps/ia64/bcopy.S
 delete mode 100644 sysdeps/ia64/bzero.S
 delete mode 100644 sysdeps/powerpc/powerpc32/bzero.S
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c
 delete mode 100644 sysdeps/powerpc/powerpc64/bzero.S
 delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c
 delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bcopy.c
 delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bzero.c
 delete mode 100644 sysdeps/powerpc/powerpc64/power7/bcopy.c
 delete mode 100644 sysdeps/s390/bzero.c
 delete mode 100644 sysdeps/sparc/sparc32/bzero.c
 delete mode 100644 sysdeps/sparc/sparc32/sparcv9/bzero.c
 delete mode 100644 sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c
 delete mode 100644 sysdeps/sparc/sparc64/bzero.c
 delete mode 100644 sysdeps/sparc/sparc64/multiarch/bzero.c
 delete mode 100644 sysdeps/x86_64/bzero.S
 delete mode 100644 sysdeps/x86_64/multiarch/bcopy.S
 delete mode 100644 sysdeps/x86_64/multiarch/bzero.c

-- 
2.32.0



* [PATCH 01/12] ia64: Remove bcopy
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 02/12] powerpc: Remove bcopy optimizations Adhemerval Zanella
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

It just calls memmove, as the generic implementation does.
---
 sysdeps/ia64/bcopy.S | 10 ----------
 1 file changed, 10 deletions(-)
 delete mode 100644 sysdeps/ia64/bcopy.S

diff --git a/sysdeps/ia64/bcopy.S b/sysdeps/ia64/bcopy.S
deleted file mode 100644
index bdabf5acdc..0000000000
--- a/sysdeps/ia64/bcopy.S
+++ /dev/null
@@ -1,10 +0,0 @@
-#include <sysdep.h>
-
-ENTRY(bcopy)
-	.regstk 3, 0, 0, 0
-	mov r8 = in0
-	mov in0 = in1
-	;;
-	mov in1 = r8
-	br.cond.sptk.many HIDDEN_BUILTIN_JUMPTARGET(memmove)
-END(bcopy)
-- 
2.32.0



* [PATCH 02/12] powerpc: Remove bcopy optimizations
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 01/12] ia64: Remove bcopy Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 03/12] i386: " Adhemerval Zanella
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

The symbol is not present in the current POSIX specification and the
compiler already generates a memmove call.
---
 .../powerpc/powerpc64/le/power10/memmove.S    | 13 -------
 sysdeps/powerpc/powerpc64/multiarch/Makefile  |  2 +-
 .../powerpc/powerpc64/multiarch/bcopy-ppc64.c | 27 -------------
 sysdeps/powerpc/powerpc64/multiarch/bcopy.c   | 38 -------------------
 .../powerpc64/multiarch/ifunc-impl-list.c     | 13 -------
 .../powerpc64/multiarch/memmove-power10.S     |  3 --
 .../powerpc64/multiarch/memmove-power7.S      |  3 --
 sysdeps/powerpc/powerpc64/power7/bcopy.c      |  1 -
 sysdeps/powerpc/powerpc64/power7/memmove.S    | 14 -------
 9 files changed, 1 insertion(+), 113 deletions(-)
 delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c
 delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bcopy.c
 delete mode 100644 sysdeps/powerpc/powerpc64/power7/bcopy.c

diff --git a/sysdeps/powerpc/powerpc64/le/power10/memmove.S b/sysdeps/powerpc/powerpc64/le/power10/memmove.S
index eda86b194e..3024718fdf 100644
--- a/sysdeps/powerpc/powerpc64/le/power10/memmove.S
+++ b/sysdeps/powerpc/powerpc64/le/power10/memmove.S
@@ -305,16 +305,3 @@ L(tail1_bwd):
 
 END_GEN_TB (MEMMOVE,TB_TOCLESS)
 libc_hidden_builtin_def (memmove)
-
-/* void bcopy(const void *src [r3], void *dest [r4], size_t n [r5])
-   Implemented in this file to avoid linker create a stub function call
-   in the branch to '_memmove'.  */
-ENTRY_TOCLESS (__bcopy)
-	mr	r6,r3
-	mr	r3,r4
-	mr	r4,r6
-	b	L(_memmove)
-END (__bcopy)
-#ifndef __bcopy
-weak_alias (__bcopy, bcopy)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
index 626845a43c..6f2436b660 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
@@ -24,7 +24,7 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \
 		   stpncpy-power8 stpncpy-power7 stpncpy-ppc64 \
 		   strcmp-power8 strcmp-power7 strcmp-ppc64 \
 		   strcat-power8 strcat-power7 strcat-ppc64 \
-		   memmove-power7 memmove-ppc64 wordcopy-ppc64 bcopy-ppc64 \
+		   memmove-power7 memmove-ppc64 wordcopy-ppc64 \
 		   strncpy-power8 strstr-power7 strstr-ppc64 \
 		   strspn-power8 strspn-ppc64 strcspn-power8 strcspn-ppc64 \
 		   strlen-power8 strcasestr-power8 strcasestr-ppc64 \
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c b/sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c
deleted file mode 100644
index fe68713ad7..0000000000
--- a/sysdeps/powerpc/powerpc64/multiarch/bcopy-ppc64.c
+++ /dev/null
@@ -1,27 +0,0 @@
-/* PowerPC64 default bcopy.
-   Copyright (C) 2014-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <string.h>
-
-extern __typeof (bcopy)   __bcopy_ppc attribute_hidden;
-extern __typeof (memmove) __memmove_ppc attribute_hidden;
-
-void __bcopy_ppc (const void *src, void *dest, size_t n)
-{
-  __memmove_ppc (dest, src, n);
-}
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bcopy.c b/sysdeps/powerpc/powerpc64/multiarch/bcopy.c
deleted file mode 100644
index 84c6adfd6e..0000000000
--- a/sysdeps/powerpc/powerpc64/multiarch/bcopy.c
+++ /dev/null
@@ -1,38 +0,0 @@
-/* PowerPC64 multiarch bcopy.
-   Copyright (C) 2014-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <string.h>
-#include "init-arch.h"
-
-extern __typeof (bcopy) __bcopy_ppc attribute_hidden;
-/* __bcopy_power7 symbol is implemented at memmove-power7.S  */
-extern __typeof (bcopy) __bcopy_power7 attribute_hidden;
-#ifdef __LITTLE_ENDIAN__
-extern __typeof (bcopy) __bcopy_power10 attribute_hidden;
-#endif
-
-libc_ifunc (bcopy,
-#ifdef __LITTLE_ENDIAN__
-	    (hwcap2 & PPC_FEATURE2_ARCH_3_1
-	     && hwcap2 & PPC_FEATURE2_HAS_ISEL
-	     && hwcap & PPC_FEATURE_HAS_VSX)
-	    ? __bcopy_power10 :
-#endif
-            (hwcap & PPC_FEATURE_HAS_VSX)
-            ? __bcopy_power7
-            : __bcopy_ppc);
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index a0f9fce25d..280b8616b2 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -244,19 +244,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 			      __bzero_power4)
 	      IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ppc))
 
-  /* Support sysdeps/powerpc/powerpc64/multiarch/bcopy.c.  */
-  IFUNC_IMPL (i, name, bcopy,
-#ifdef __LITTLE_ENDIAN__
-	      IFUNC_IMPL_ADD (array, i, bcopy,
-			      hwcap2 & PPC_FEATURE2_ARCH_3_1
-			      && hwcap2 & PPC_FEATURE2_HAS_ISEL
-			      && hwcap & PPC_FEATURE_HAS_VSX,
-			      __bcopy_power10)
-#endif
-	      IFUNC_IMPL_ADD (array, i, bcopy, hwcap & PPC_FEATURE_HAS_VSX,
-			      __bcopy_power7)
-	      IFUNC_IMPL_ADD (array, i, bcopy, 1, __bcopy_ppc))
-
   /* Support sysdeps/powerpc/powerpc64/multiarch/mempcpy.c.  */
   IFUNC_IMPL (i, name, mempcpy,
 	      IFUNC_IMPL_ADD (array, i, mempcpy,
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S b/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S
index e5df0851c0..a66d2892c4 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memmove-power10.S
@@ -21,7 +21,4 @@
 #undef libc_hidden_builtin_def
 #define libc_hidden_builtin_def(name)
 
-#undef __bcopy
-#define __bcopy __bcopy_power10
-
 #include <sysdeps/powerpc/powerpc64/le/power10/memmove.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S b/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S
index a7b05ebfa9..0a6c7cb96e 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S
@@ -21,7 +21,4 @@
 #undef libc_hidden_builtin_def
 #define libc_hidden_builtin_def(name)
 
-#undef __bcopy
-#define __bcopy __bcopy_power7
-
 #include <sysdeps/powerpc/powerpc64/power7/memmove.S>
diff --git a/sysdeps/powerpc/powerpc64/power7/bcopy.c b/sysdeps/powerpc/powerpc64/power7/bcopy.c
deleted file mode 100644
index 4a6a400e7a..0000000000
--- a/sysdeps/powerpc/powerpc64/power7/bcopy.c
+++ /dev/null
@@ -1 +0,0 @@
-/* Implemented at memmove.S  */
diff --git a/sysdeps/powerpc/powerpc64/power7/memmove.S b/sysdeps/powerpc/powerpc64/power7/memmove.S
index 1d10a3d593..5a1055c097 100644
--- a/sysdeps/powerpc/powerpc64/power7/memmove.S
+++ b/sysdeps/powerpc/powerpc64/power7/memmove.S
@@ -821,17 +821,3 @@ L(end_unaligned_loop_bwd):
 	blr
 END_GEN_TB (MEMMOVE, TB_TOCLESS)
 libc_hidden_builtin_def (memmove)
-
-
-/* void bcopy(const void *src [r3], void *dest [r4], size_t n [r5])
-   Implemented in this file to avoid linker create a stub function call
-   in the branch to '_memmove'.  */
-ENTRY_TOCLESS (__bcopy)
-	mr	r6,r3
-	mr	r3,r4
-	mr	r4,r6
-	b	L(_memmove)
-END (__bcopy)
-#ifndef __bcopy
-weak_alias (__bcopy, bcopy)
-#endif
-- 
2.32.0



* [PATCH 03/12] i386: Remove bcopy optimizations
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 01/12] ia64: Remove bcopy Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 02/12] powerpc: Remove bcopy optimizations Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 04/12] x86_64: " Adhemerval Zanella
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

The symbol is not present in the current POSIX specification and the
compiler already generates a memmove call.
---
 sysdeps/i386/bcopy.S                          |  4 ---
 sysdeps/i386/i686/bcopy.S                     |  3 --
 sysdeps/i386/i686/multiarch/Makefile          |  6 ++--
 sysdeps/i386/i686/multiarch/bcopy-ia32.S      | 20 -------------
 .../i686/multiarch/bcopy-sse2-unaligned.S     |  4 ---
 sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S |  4 ---
 sysdeps/i386/i686/multiarch/bcopy-ssse3.S     |  4 ---
 sysdeps/i386/i686/multiarch/bcopy.c           | 30 -------------------
 sysdeps/i386/i686/multiarch/ifunc-impl-list.c | 10 -------
 9 files changed, 3 insertions(+), 82 deletions(-)
 delete mode 100644 sysdeps/i386/bcopy.S
 delete mode 100644 sysdeps/i386/i686/bcopy.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ia32.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bcopy-ssse3.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bcopy.c

diff --git a/sysdeps/i386/bcopy.S b/sysdeps/i386/bcopy.S
deleted file mode 100644
index 12b8ddb886..0000000000
--- a/sysdeps/i386/bcopy.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_MEMMOVE
-#define USE_AS_BCOPY
-#define MEMCPY		bcopy
-#include "memcpy.S"
diff --git a/sysdeps/i386/i686/bcopy.S b/sysdeps/i386/i686/bcopy.S
deleted file mode 100644
index 15ef9419a4..0000000000
--- a/sysdeps/i386/i686/bcopy.S
+++ /dev/null
@@ -1,3 +0,0 @@
-#define USE_AS_BCOPY
-#define memmove bcopy
-#include <sysdeps/i386/i686/memmove.S>
diff --git a/sysdeps/i386/i686/multiarch/Makefile b/sysdeps/i386/i686/multiarch/Makefile
index c4897922d7..02fa02658e 100644
--- a/sysdeps/i386/i686/multiarch/Makefile
+++ b/sysdeps/i386/i686/multiarch/Makefile
@@ -2,7 +2,7 @@ ifeq ($(subdir),string)
 gen-as-const-headers += locale-defines.sym
 sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
 		   memmove-ssse3 memcpy-ssse3-rep mempcpy-ssse3-rep \
-		   memmove-ssse3-rep bcopy-ssse3 bcopy-ssse3-rep \
+		   memmove-ssse3-rep \
 		   memset-sse2-rep bzero-sse2-rep strcmp-ssse3 \
 		   strcmp-sse4 strncmp-c strncmp-ssse3 strncmp-sse4 \
 		   memcmp-ssse3 memcmp-sse4 varshift \
@@ -18,10 +18,10 @@ sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
 		   strcasecmp_l-c strcasecmp-c strcasecmp_l-ssse3 \
 		   strncase_l-c strncase-c strncase_l-ssse3 \
 		   strcasecmp_l-sse4 strncase_l-sse4 \
-		   bcopy-sse2-unaligned memcpy-sse2-unaligned \
+		   memcpy-sse2-unaligned \
 		   mempcpy-sse2-unaligned memmove-sse2-unaligned \
 		   strcspn-c strpbrk-c strspn-c \
-		   bcopy-ia32 bzero-ia32 rawmemchr-ia32 \
+		   bzero-ia32 rawmemchr-ia32 \
 		   memchr-ia32 memcmp-ia32 memcpy-ia32 memmove-ia32 \
 		   mempcpy-ia32 memset-ia32 strcat-ia32 strchr-ia32 \
 		   strrchr-ia32 strcpy-ia32 strcmp-ia32 strcspn-ia32 \
diff --git a/sysdeps/i386/i686/multiarch/bcopy-ia32.S b/sysdeps/i386/i686/multiarch/bcopy-ia32.S
deleted file mode 100644
index e0fadc0f3f..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy-ia32.S
+++ /dev/null
@@ -1,20 +0,0 @@
-/* bcopy optimized for i686.
-   Copyright (C) 2017-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#define bcopy __bcopy_ia32
-#include <sysdeps/i386/i686/bcopy.S>
diff --git a/sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S b/sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S
deleted file mode 100644
index efef2a10dd..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy-sse2-unaligned.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_MEMMOVE
-#define USE_AS_BCOPY
-#define MEMCPY		__bcopy_sse2_unaligned
-#include "memcpy-sse2-unaligned.S"
diff --git a/sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S b/sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S
deleted file mode 100644
index cbc8b420e8..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy-ssse3-rep.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_MEMMOVE
-#define USE_AS_BCOPY
-#define MEMCPY		__bcopy_ssse3_rep
-#include "memcpy-ssse3-rep.S"
diff --git a/sysdeps/i386/i686/multiarch/bcopy-ssse3.S b/sysdeps/i386/i686/multiarch/bcopy-ssse3.S
deleted file mode 100644
index 36aac44b9c..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy-ssse3.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_MEMMOVE
-#define USE_AS_BCOPY
-#define MEMCPY		__bcopy_ssse3
-#include "memcpy-ssse3.S"
diff --git a/sysdeps/i386/i686/multiarch/bcopy.c b/sysdeps/i386/i686/multiarch/bcopy.c
deleted file mode 100644
index bc2c2ac55d..0000000000
--- a/sysdeps/i386/i686/multiarch/bcopy.c
+++ /dev/null
@@ -1,30 +0,0 @@
-/* Multiple versions of bcopy.
-   All versions must be listed in ifunc-impl-list.c.
-   Copyright (C) 2017-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-/* Define multiple versions only for the definition in libc.  */
-#if IS_IN (libc)
-# define bcopy __redirect_bcopy
-# include <string.h>
-# undef bcopy
-
-# define SYMBOL_NAME bcopy
-# include "ifunc-memmove.h"
-
-libc_ifunc_redirected (__redirect_bcopy, bcopy, IFUNC_SELECTOR ());
-#endif
diff --git a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
index 6883b3d226..5c7a42dc97 100644
--- a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
+++ b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
@@ -36,16 +36,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 
   size_t i = 0;
 
-  /* Support sysdeps/i386/i686/multiarch/bcopy.S.  */
-  IFUNC_IMPL (i, name, bcopy,
-	      IFUNC_IMPL_ADD (array, i, bcopy, CPU_FEATURE_USABLE (SSSE3),
-			      __bcopy_ssse3_rep)
-	      IFUNC_IMPL_ADD (array, i, bcopy, CPU_FEATURE_USABLE (SSSE3),
-			      __bcopy_ssse3)
-	      IFUNC_IMPL_ADD (array, i, bcopy, CPU_FEATURE_USABLE (SSE2),
-			      __bcopy_sse2_unaligned)
-	      IFUNC_IMPL_ADD (array, i, bcopy, 1, __bcopy_ia32))
-
   /* Support sysdeps/i386/i686/multiarch/bzero.S.  */
   IFUNC_IMPL (i, name, bzero,
 	      IFUNC_IMPL_ADD (array, i, bzero, CPU_FEATURE_USABLE (SSE2),
-- 
2.32.0



* [PATCH 04/12] x86_64: Remove bcopy optimizations
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
                   ` (2 preceding siblings ...)
  2022-02-10 19:58 ` [PATCH 03/12] i386: " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 05/12] alpha: Remove bzero optimization Adhemerval Zanella
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

The symbol is not present in the current POSIX specification and the
compiler already generates a memmove call.
---
 sysdeps/x86_64/multiarch/bcopy.S | 7 -------
 1 file changed, 7 deletions(-)
 delete mode 100644 sysdeps/x86_64/multiarch/bcopy.S

diff --git a/sysdeps/x86_64/multiarch/bcopy.S b/sysdeps/x86_64/multiarch/bcopy.S
deleted file mode 100644
index 639f02bde3..0000000000
--- a/sysdeps/x86_64/multiarch/bcopy.S
+++ /dev/null
@@ -1,7 +0,0 @@
-#include <sysdep.h>
-
-	.text
-ENTRY(bcopy)
-	xchg	%rdi, %rsi
-	jmp	__libc_memmove	/* Branch to IFUNC memmove.  */
-END(bcopy)
-- 
2.32.0



* [PATCH 05/12] alpha: Remove bzero optimization
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
                   ` (3 preceding siblings ...)
  2022-02-10 19:58 ` [PATCH 04/12] x86_64: " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 06/12] ia64: " Adhemerval Zanella
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

The symbol is not present in the current POSIX specification and the
compiler already generates a memset call.
---
 sysdeps/alpha/bzero.S | 109 ------------------------------------------
 1 file changed, 109 deletions(-)
 delete mode 100644 sysdeps/alpha/bzero.S

diff --git a/sysdeps/alpha/bzero.S b/sysdeps/alpha/bzero.S
deleted file mode 100644
index 4821778622..0000000000
--- a/sysdeps/alpha/bzero.S
+++ /dev/null
@@ -1,109 +0,0 @@
-/* Copyright (C) 1996-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library.  If not, see
-   <https://www.gnu.org/licenses/>.  */
-
-/* Fill a block of memory with zeros.  Optimized for the Alpha architecture:
-
-   - memory accessed as aligned quadwords only
-   - destination memory not read unless needed for good cache behaviour
-   - basic blocks arranged to optimize branch prediction for full-quadword
-     aligned memory blocks.
-   - partial head and tail quadwords constructed with byte-mask instructions
-
-   This is generally scheduled for the EV5 (got to look out for my own
-   interests :-), but with EV4 needs in mind.  There *should* be no more
-   stalls for the EV4 than there are for the EV5.
-*/
-
-
-#include <sysdep.h>
-
-	.set noat
-	.set noreorder
-
-	.text
-	.type	__bzero, @function
-	.globl	__bzero
-	.usepv	__bzero, USEPV_PROF
-
-	cfi_startproc
-
-	/* On entry to this basic block:
-	   t3 == loop counter
-	   t4 == bytes in partial final word
-	   a0 == possibly misaligned destination pointer  */
-
-	.align 3
-bzero_loop:
-	beq	t3, $tail	#
-	blbc	t3, 0f		# skip single store if count even
-
-	stq_u	zero, 0(a0)	# e0    : store one word
-	subq	t3, 1, t3	# .. e1 :
-	addq	a0, 8, a0	# e0    :
-	beq	t3, $tail	# .. e1 :
-
-0:	stq_u	zero, 0(a0)	# e0    : store two words
-	subq	t3, 2, t3	# .. e1 :
-	stq_u	zero, 8(a0)	# e0    :
-	addq	a0, 16, a0	# .. e1 :
-	bne	t3, 0b		# e1    :
-
-$tail:	bne	t4, 1f		# is there a tail to do?
-	ret			# no
-
-1:	ldq_u	t0, 0(a0)	# yes, load original data
-	mskqh	t0, t4, t0	#
-	stq_u	t0, 0(a0)	#
-	ret			#
-
-__bzero:
-#ifdef PROF
-	ldgp	gp, 0(pv)
-	lda	AT, _mcount
-	jsr	AT, (AT), _mcount
-#endif
-
-	mov	a0, v0		# e0    : move return value in place
-	beq	a1, $done	# .. e1 : early exit for zero-length store
-	and	a0, 7, t1	# e0    :
-	addq	a1, t1, a1	# e1    : add dest misalignment to count
-	srl	a1, 3, t3	# e0    : loop = count >> 3
-	and	a1, 7, t4	# .. e1 : find number of bytes in tail
-	unop			#       :
-	beq	t1, bzero_loop	# e1    : aligned head, jump right in
-
-	ldq_u	t0, 0(a0)	# e0    : load original data to mask into
-	cmpult	a1, 8, t2	# .. e1 : is this a sub-word set?
-	bne	t2, $oneq	# e1    :
-
-	mskql	t0, a0, t0	# e0    : we span words.  finish this partial
-	subq	t3, 1, t3	# e0    :
-	addq	a0, 8, a0	# .. e1 :
-	stq_u	t0, -8(a0)	# e0    :
-	br 	bzero_loop	# .. e1 :
-
-	.align 3
-$oneq:
-	mskql	t0, a0, t2	# e0    :
-	mskqh	t0, a1, t3	# e0    :
-	or	t2, t3, t0	# e1    :
-	stq_u	t0, 0(a0)	# e0    :
-
-$done:	ret
-
-	cfi_endproc
-weak_alias (__bzero, bzero)
-- 
2.32.0


* [PATCH 06/12] ia64: Remove bzero optimization
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
                   ` (4 preceding siblings ...)
  2022-02-10 19:58 ` [PATCH 05/12] alpha: Remove bzero optimization Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 07/12] " Adhemerval Zanella
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call.
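
For illustration only (a minimal sketch, not glibc code): bzero has the
same effect as memset with a zero fill byte, which is why a separately
tuned routine adds little.  The name zero_fill is hypothetical and stands
in for the legacy symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the legacy bzero (s, n): zero n bytes at s
   by forwarding to memset with a fill byte of 0.  */
static void
zero_fill (void *s, size_t n)
{
  /* A null pointer is only acceptable for an empty range.  */
  assert (s != NULL || n == 0);
  if (n != 0)
    memset (s, 0, n);
}
```

Calling zero_fill (buf + 4, 8) on a 16-byte buffer clears bytes 4 to 11
and leaves the remaining bytes unchanged.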
---
 sysdeps/ia64/bzero.S | 312 -------------------------------------------
 1 file changed, 312 deletions(-)
 delete mode 100644 sysdeps/ia64/bzero.S

diff --git a/sysdeps/ia64/bzero.S b/sysdeps/ia64/bzero.S
deleted file mode 100644
index cd01abb436..0000000000
--- a/sysdeps/ia64/bzero.S
+++ /dev/null
@@ -1,312 +0,0 @@
-/* Optimized version of the standard bzero() function.
-   This file is part of the GNU C Library.
-   Copyright (C) 2000-2022 Free Software Foundation, Inc.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-/* Return: dest
-
-   Inputs:
-        in0:    dest
-        in1:    count
-
-   The algorithm is fairly straightforward: set byte by byte until we
-   we get to a 16B-aligned address, then loop on 128 B chunks using an
-   early store as prefetching, then loop on 32B chucks, then clear remaining
-   words, finally clear remaining bytes.
-   Since a stf.spill f0 can store 16B in one go, we use this instruction
-   to get peak speed.  */
-
-#include <sysdep.h>
-#undef ret
-
-#define dest		in0
-#define	cnt		in1
-
-#define tmp		r31
-#define save_lc		r30
-#define ptr0		r29
-#define ptr1		r28
-#define ptr2		r27
-#define ptr3		r26
-#define ptr9 		r24
-#define	loopcnt		r23
-#define linecnt		r22
-#define bytecnt		r21
-
-// This routine uses only scratch predicate registers (p6 - p15)
-#define p_scr		p6	// default register for same-cycle branches
-#define p_unalgn	p9
-#define p_y		p11
-#define p_n		p12
-#define p_yy		p13
-#define p_nn		p14
-
-#define movi0		mov
-
-#define MIN1		15
-#define MIN1P1HALF	8
-#define LINE_SIZE	128
-#define LSIZE_SH        7			// shift amount
-#define PREF_AHEAD	8
-
-#define USE_FLP
-#if defined(USE_INT)
-#define store		st8
-#define myval		r0
-#elif defined(USE_FLP)
-#define store		stf8
-#define myval		f0
-#endif
-
-.align	64
-ENTRY(bzero)
-{ .mmi
-	.prologue
-	alloc	tmp = ar.pfs, 2, 0, 0, 0
-	lfetch.nt1 [dest]
-	.save   ar.lc, save_lc
-	movi0	save_lc = ar.lc
-} { .mmi
-	.body
-	mov	ret0 = dest		// return value
-	nop.m	0
-	cmp.eq	p_scr, p0 = cnt, r0
-;; }
-{ .mmi
-	and	ptr2 = -(MIN1+1), dest	// aligned address
-	and	tmp = MIN1, dest	// prepare to check for alignment
-	tbit.nz p_y, p_n = dest, 0	// Do we have an odd address? (M_B_U)
-} { .mib
-	mov	ptr1 = dest
-	nop.i	0
-(p_scr)	br.ret.dpnt.many rp		// return immediately if count = 0
-;; }
-{ .mib
-	cmp.ne	p_unalgn, p0 = tmp, r0
-} { .mib					// NB: # of bytes to move is 1
-	sub	bytecnt = (MIN1+1), tmp		//     higher than loopcnt
-	cmp.gt	p_scr, p0 = 16, cnt		// is it a minimalistic task?
-(p_scr)	br.cond.dptk.many .move_bytes_unaligned	// go move just a few (M_B_U)
-;; }
-{ .mmi
-(p_unalgn) add	ptr1 = (MIN1+1), ptr2		// after alignment
-(p_unalgn) add	ptr2 = MIN1P1HALF, ptr2		// after alignment
-(p_unalgn) tbit.nz.unc p_y, p_n = bytecnt, 3	// should we do a st8 ?
-;; }
-{ .mib
-(p_y)	add	cnt = -8, cnt
-(p_unalgn) tbit.nz.unc p_yy, p_nn = bytecnt, 2	// should we do a st4 ?
-} { .mib
-(p_y)	st8	[ptr2] = r0,-4
-(p_n)	add	ptr2 = 4, ptr2
-;; }
-{ .mib
-(p_yy)	add	cnt = -4, cnt
-(p_unalgn) tbit.nz.unc p_y, p_n = bytecnt, 1	// should we do a st2 ?
-} { .mib
-(p_yy)	st4	[ptr2] = r0,-2
-(p_nn)	add	ptr2 = 2, ptr2
-;; }
-{ .mmi
-	mov	tmp = LINE_SIZE+1		// for compare
-(p_y)	add	cnt = -2, cnt
-(p_unalgn) tbit.nz.unc p_yy, p_nn = bytecnt, 0	// should we do a st1 ?
-} { .mmi
-	nop.m	0
-(p_y)	st2	[ptr2] = r0,-1
-(p_n)	add	ptr2 = 1, ptr2
-;; }
-
-{ .mmi
-(p_yy)	st1	[ptr2] = r0
-	cmp.gt	p_scr, p0 = tmp, cnt		// is it a minimalistic task?
-} { .mbb
-(p_yy)	add	cnt = -1, cnt
-(p_scr)	br.cond.dpnt.many .fraction_of_line	// go move just a few
-;; }
-{ .mib
-	nop.m 	0
-	shr.u	linecnt = cnt, LSIZE_SH
-	nop.b	0
-;; }
-
-	.align 32
-.l1b:	// ------------------//  L1B: store ahead into cache lines; fill later
-{ .mmi
-	and	tmp = -(LINE_SIZE), cnt		// compute end of range
-	mov	ptr9 = ptr1			// used for prefetching
-	and	cnt = (LINE_SIZE-1), cnt	// remainder
-} { .mmi
-	mov	loopcnt = PREF_AHEAD-1		// default prefetch loop
-	cmp.gt	p_scr, p0 = PREF_AHEAD, linecnt	// check against actual value
-;; }
-{ .mmi
-(p_scr)	add	loopcnt = -1, linecnt
-	add	ptr2 = 16, ptr1	// start of stores (beyond prefetch stores)
-	add	ptr1 = tmp, ptr1	// first address beyond total range
-;; }
-{ .mmi
-	add	tmp = -1, linecnt	// next loop count
-	movi0	ar.lc = loopcnt
-;; }
-.pref_l1b:
-{ .mib
-	stf.spill [ptr9] = f0, 128	// Do stores one cache line apart
-	nop.i   0
-	br.cloop.dptk.few .pref_l1b
-;; }
-{ .mmi
-	add	ptr0 = 16, ptr2		// Two stores in parallel
-	movi0	ar.lc = tmp
-;; }
-.l1bx:
- { .mmi
-	stf.spill [ptr2] = f0, 32
-	stf.spill [ptr0] = f0, 32
- ;; }
- { .mmi
-	stf.spill [ptr2] = f0, 32
-	stf.spill [ptr0] = f0, 32
- ;; }
- { .mmi
-	stf.spill [ptr2] = f0, 32
-	stf.spill [ptr0] = f0, 64
-	cmp.lt	p_scr, p0 = ptr9, ptr1	// do we need more prefetching?
- ;; }
-{ .mmb
-	stf.spill [ptr2] = f0, 32
-(p_scr)	stf.spill [ptr9] = f0, 128
-	br.cloop.dptk.few .l1bx
-;; }
-{ .mib
-	cmp.gt  p_scr, p0 = 8, cnt	// just a few bytes left ?
-(p_scr)	br.cond.dpnt.many  .move_bytes_from_alignment
-;; }
-
-.fraction_of_line:
-{ .mib
-	add	ptr2 = 16, ptr1
-	shr.u	loopcnt = cnt, 5   	// loopcnt = cnt / 32
-;; }
-{ .mib
-	cmp.eq	p_scr, p0 = loopcnt, r0
-	add	loopcnt = -1, loopcnt
-(p_scr)	br.cond.dpnt.many .store_words
-;; }
-{ .mib
-	and	cnt = 0x1f, cnt		// compute the remaining cnt
-	movi0   ar.lc = loopcnt
-;; }
-	.align 32
-.l2:	// -----------------------------//  L2A:  store 32B in 2 cycles
-{ .mmb
-	store	[ptr1] = myval, 8
-	store	[ptr2] = myval, 8
-;; } { .mmb
-	store	[ptr1] = myval, 24
-	store	[ptr2] = myval, 24
-	br.cloop.dptk.many .l2
-;; }
-.store_words:
-{ .mib
-	cmp.gt	p_scr, p0 = 8, cnt	// just a few bytes left ?
-(p_scr)	br.cond.dpnt.many .move_bytes_from_alignment	// Branch
-;; }
-
-{ .mmi
-	store	[ptr1] = myval, 8	// store
-	cmp.le	p_y, p_n = 16, cnt	//
-	add	cnt = -8, cnt		// subtract
-;; }
-{ .mmi
-(p_y)	store	[ptr1] = myval, 8	// store
-(p_y)	cmp.le.unc p_yy, p_nn = 16, cnt
-(p_y)	add	cnt = -8, cnt		// subtract
-;; }
-{ .mmi					// store
-(p_yy)	store	[ptr1] = myval, 8
-(p_yy)	add	cnt = -8, cnt		// subtract
-;; }
-
-.move_bytes_from_alignment:
-{ .mib
-	cmp.eq	p_scr, p0 = cnt, r0
-	tbit.nz.unc p_y, p0 = cnt, 2	// should we terminate with a st4 ?
-(p_scr)	br.cond.dpnt.few .restore_and_exit
-;; }
-{ .mib
-(p_y)	st4	[ptr1] = r0,4
-	tbit.nz.unc p_yy, p0 = cnt, 1	// should we terminate with a st2 ?
-;; }
-{ .mib
-(p_yy)	st2	[ptr1] = r0,2
-	tbit.nz.unc p_y, p0 = cnt, 0	// should we terminate with a st1 ?
-;; }
-
-{ .mib
-(p_y)	st1	[ptr1] = r0
-;; }
-.restore_and_exit:
-{ .mib
-	nop.m	0
-	movi0	ar.lc = save_lc
-	br.ret.sptk.many rp
-;; }
-
-.move_bytes_unaligned:
-{ .mmi
-       .pred.rel "mutex",p_y, p_n
-       .pred.rel "mutex",p_yy, p_nn
-(p_n)	cmp.le  p_yy, p_nn = 4, cnt
-(p_y)	cmp.le  p_yy, p_nn = 5, cnt
-(p_n)	add	ptr2 = 2, ptr1
-} { .mmi
-(p_y)	add	ptr2 = 3, ptr1
-(p_y)	st1	[ptr1] = r0, 1		// fill 1 (odd-aligned) byte
-(p_y)	add	cnt = -1, cnt		// [15, 14 (or less) left]
-;; }
-{ .mmi
-(p_yy)	cmp.le.unc p_y, p0 = 8, cnt
-	add	ptr3 = ptr1, cnt	// prepare last store
-	movi0	ar.lc = save_lc
-} { .mmi
-(p_yy)	st2	[ptr1] = r0, 4		// fill 2 (aligned) bytes
-(p_yy)	st2	[ptr2] = r0, 4		// fill 2 (aligned) bytes
-(p_yy)	add	cnt = -4, cnt		// [11, 10 (o less) left]
-;; }
-{ .mmi
-(p_y)	cmp.le.unc p_yy, p0 = 8, cnt
-	add	ptr3 = -1, ptr3		// last store
-	tbit.nz p_scr, p0 = cnt, 1	// will there be a st2 at the end ?
-} { .mmi
-(p_y)	st2	[ptr1] = r0, 4		// fill 2 (aligned) bytes
-(p_y)	st2	[ptr2] = r0, 4		// fill 2 (aligned) bytes
-(p_y)	add	cnt = -4, cnt		// [7, 6 (or less) left]
-;; }
-{ .mmi
-(p_yy)	st2	[ptr1] = r0, 4		// fill 2 (aligned) bytes
-(p_yy)	st2	[ptr2] = r0, 4		// fill 2 (aligned) bytes
-					// [3, 2 (or less) left]
-	tbit.nz p_y, p0 = cnt, 0	// will there be a st1 at the end ?
-} { .mmi
-(p_yy)	add	cnt = -4, cnt
-;; }
-{ .mmb
-(p_scr)	st2	[ptr1] = r0		// fill 2 (aligned) bytes
-(p_y)	st1	[ptr3] = r0		// fill last byte (using ptr3)
-	br.ret.sptk.many rp
-;; }
-END(bzero)
-- 
2.32.0


* [PATCH 07/12] Remove bzero optimization
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
                   ` (5 preceding siblings ...)
  2022-02-10 19:58 ` [PATCH 06/12] ia64: " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 08/12] powerpc: Remove powerpc32 bzero optimizations Adhemerval Zanella
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call.
---
 sysdeps/sparc/sparc32/bzero.c                 |  1 -
 sysdeps/sparc/sparc32/memset.S                | 37 ++++++++-----------
 sysdeps/sparc/sparc32/sparcv9/bzero.c         |  1 -
 .../sparc/sparc32/sparcv9/multiarch/bzero.c   |  1 -
 .../sparc32/sparcv9/multiarch/memset-ultra1.S |  1 -
 sysdeps/sparc/sparc64/bzero.c                 |  1 -
 sysdeps/sparc/sparc64/memset.S                | 30 ++++++---------
 sysdeps/sparc/sparc64/multiarch/bzero.c       | 33 -----------------
 .../sparc/sparc64/multiarch/ifunc-impl-list.c |  9 -----
 .../sparc/sparc64/multiarch/ifunc-memset.h    |  2 +-
 .../sparc/sparc64/multiarch/memset-niagara1.S |  5 +--
 .../sparc/sparc64/multiarch/memset-niagara4.S |  6 +--
 .../sparc/sparc64/multiarch/memset-niagara7.S |  7 ----
 .../sparc/sparc64/multiarch/memset-ultra1.S   |  1 -
 14 files changed, 30 insertions(+), 105 deletions(-)
 delete mode 100644 sysdeps/sparc/sparc32/bzero.c
 delete mode 100644 sysdeps/sparc/sparc32/sparcv9/bzero.c
 delete mode 100644 sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c
 delete mode 100644 sysdeps/sparc/sparc64/bzero.c
 delete mode 100644 sysdeps/sparc/sparc64/multiarch/bzero.c

diff --git a/sysdeps/sparc/sparc32/bzero.c b/sysdeps/sparc/sparc32/bzero.c
deleted file mode 100644
index 37f0f6f993..0000000000
--- a/sysdeps/sparc/sparc32/bzero.c
+++ /dev/null
@@ -1 +0,0 @@
-/* bzero is in memset.S */
diff --git a/sysdeps/sparc/sparc32/memset.S b/sysdeps/sparc/sparc32/memset.S
index d222fa7506..b1b67cb2d1 100644
--- a/sysdeps/sparc/sparc32/memset.S
+++ b/sysdeps/sparc/sparc32/memset.S
@@ -42,25 +42,6 @@
 
 	.text
 	.align 4
-ENTRY(__bzero)
-	b		1f
-	 mov		%g0, %g3
-
-3:	cmp		%o2, 3
-	be		2f
-	 stb		%g3, [%o0]
-
-	cmp		%o2, 2
-	be		2f
-	 stb		%g3, [%o0 + 0x01]
-
-	stb		%g3, [%o0 + 0x02]
-2:	sub		%o2, 4, %o2
-	add		%o1, %o2, %o1
-	b		4f
-	 sub		%o0, %o2, %o0
-END(__bzero)
-
 ENTRY(memset)
 	and		%o1, 0xff, %g3
 	sll		%g3, 8, %g2
@@ -73,7 +54,7 @@ ENTRY(memset)
 	 mov		%o0, %g1
 
 	andcc		%o0, 3, %o2
-	bne		3b
+	bne		3f
 4:	 andcc		%o0, 4, %g0
 
 	be		2f
@@ -146,7 +127,19 @@ ENTRY(memset)
 	stb		%g3, [%o0 + 6]
 0:	retl
 	 nop
+
+3:	cmp		%o2, 3
+	be		2f
+	 stb		%g3, [%o0]
+
+	cmp		%o2, 2
+	be		2f
+	 stb		%g3, [%o0 + 0x01]
+
+	stb		%g3, [%o0 + 0x02]
+2:	sub		%o2, 4, %o2
+	add		%o1, %o2, %o1
+	b		4b
+	 sub		%o0, %o2, %o0
 END(memset)
 libc_hidden_builtin_def (memset)
-
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/sparc/sparc32/sparcv9/bzero.c b/sysdeps/sparc/sparc32/sparcv9/bzero.c
deleted file mode 100644
index 37f0f6f993..0000000000
--- a/sysdeps/sparc/sparc32/sparcv9/bzero.c
+++ /dev/null
@@ -1 +0,0 @@
-/* bzero is in memset.S */
diff --git a/sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c b/sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c
deleted file mode 100644
index cf6803ef44..0000000000
--- a/sysdeps/sparc/sparc32/sparcv9/multiarch/bzero.c
+++ /dev/null
@@ -1 +0,0 @@
-#include <sysdeps/sparc/sparc64/multiarch/bzero.c>
diff --git a/sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S b/sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S
index 6038611134..2dda6f1ed6 100644
--- a/sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S
+++ b/sysdeps/sparc/sparc32/sparcv9/multiarch/memset-ultra1.S
@@ -25,6 +25,5 @@
 # define weak_alias(x, y)
 
 # define memset  __memset_ultra1
-# define __bzero __bzero_ultra1
 # include <sysdeps/sparc/sparc32/sparcv9/memset.S>
 #endif
diff --git a/sysdeps/sparc/sparc64/bzero.c b/sysdeps/sparc/sparc64/bzero.c
deleted file mode 100644
index 37f0f6f993..0000000000
--- a/sysdeps/sparc/sparc64/bzero.c
+++ /dev/null
@@ -1 +0,0 @@
-/* bzero is in memset.S */
diff --git a/sysdeps/sparc/sparc64/memset.S b/sysdeps/sparc/sparc64/memset.S
index a7f8361fa3..33ecbc93fe 100644
--- a/sysdeps/sparc/sparc64/memset.S
+++ b/sysdeps/sparc/sparc64/memset.S
@@ -31,6 +31,16 @@
 	stx		source, [base - offset - 0x08];	\
 	stx		source, [base - offset - 0x00];
 
+#define ZERO_BLOCKS(base, offset, source)		\
+	stx		source, [base - offset - 0x38];	\
+	stx		source, [base - offset - 0x30];	\
+	stx		source, [base - offset - 0x28];	\
+	stx		source, [base - offset - 0x20];	\
+	stx		source, [base - offset - 0x18];	\
+	stx		source, [base - offset - 0x10];	\
+	stx		source, [base - offset - 0x08];	\
+	stx		source, [base - offset - 0x00];
+
 	/* Well, memset is a lot easier to get right than bcopy... */
 	.text
 	.align		32
@@ -174,22 +184,7 @@ ENTRY(memset)
 	 nop
 	ba,pt		%xcc, 18b
 	 ldd		[%o0], %f0
-END(memset)
-libc_hidden_builtin_def (memset)
 
-#define ZERO_BLOCKS(base, offset, source)		\
-	stx		source, [base - offset - 0x38];	\
-	stx		source, [base - offset - 0x30];	\
-	stx		source, [base - offset - 0x28];	\
-	stx		source, [base - offset - 0x20];	\
-	stx		source, [base - offset - 0x18];	\
-	stx		source, [base - offset - 0x10];	\
-	stx		source, [base - offset - 0x08];	\
-	stx		source, [base - offset - 0x00];
-
-	.text
-	.align		32
-ENTRY(__bzero)
 #ifndef USE_BPR
 	srl		%o1, 0, %o1
 #endif
@@ -307,6 +302,5 @@ ENTRY(__bzero)
 	 stb		%g0, [%o0 - 1]
 0:	retl
 	 mov		%o5, %o0
-END(__bzero)
-
-weak_alias (__bzero, bzero)
+END(memset)
+libc_hidden_builtin_def (memset)
diff --git a/sysdeps/sparc/sparc64/multiarch/bzero.c b/sysdeps/sparc/sparc64/multiarch/bzero.c
deleted file mode 100644
index 409d66a864..0000000000
--- a/sysdeps/sparc/sparc64/multiarch/bzero.c
+++ /dev/null
@@ -1,33 +0,0 @@
-/* Multiple versions of bzero.  SPARC64/Linux version.
-   All versions must be listed in ifunc-impl-list.c.
-   Copyright (C) 2017-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#if IS_IN (libc)
-# define bzero __redirect_bzero
-# include <string.h>
-# undef bzero
-
-# include <sparc-ifunc.h>
-
-# define SYMBOL_NAME bzero
-# include "ifunc-memset.h"
-
-sparc_libc_ifunc_redirected (__redirect_bzero, __bzero, IFUNC_SELECTOR)
-weak_alias (__bzero, bzero)
-
-#endif
diff --git a/sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c b/sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
index 05926e605b..9be12f9130 100644
--- a/sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/sparc/sparc64/multiarch/ifunc-impl-list.c
@@ -61,15 +61,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 			      __mempcpy_ultra3)
 	      IFUNC_IMPL_ADD (array, i, mempcpy, 1, __mempcpy_ultra1));
 
-  IFUNC_IMPL (i, name, bzero,
-	      IFUNC_IMPL_ADD (array, i, bzero, hwcap & HWCAP_SPARC_ADP,
-			      __bzero_niagara7)
-	      IFUNC_IMPL_ADD (array, i, bzero, hwcap & HWCAP_SPARC_CRYPTO,
-			      __bzero_niagara4)
-	      IFUNC_IMPL_ADD (array, i, bzero, hwcap & HWCAP_SPARC_BLKINIT,
-			      __bzero_niagara1)
-	      IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ultra1));
-
   IFUNC_IMPL (i, name, memset,
 	      IFUNC_IMPL_ADD (array, i, memset, hwcap & HWCAP_SPARC_ADP,
 			      __memset_niagara7)
diff --git a/sysdeps/sparc/sparc64/multiarch/ifunc-memset.h b/sysdeps/sparc/sparc64/multiarch/ifunc-memset.h
index 56893b6883..0a2f16b3f1 100644
--- a/sysdeps/sparc/sparc64/multiarch/ifunc-memset.h
+++ b/sysdeps/sparc/sparc64/multiarch/ifunc-memset.h
@@ -1,4 +1,4 @@
-/* Common definition for memset/bzero implementation.
+/* Common definition for memset implementation.
    All versions must be listed in ifunc-impl-list.c.
    Copyright (C) 2017-2022 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
diff --git a/sysdeps/sparc/sparc64/multiarch/memset-niagara1.S b/sysdeps/sparc/sparc64/multiarch/memset-niagara1.S
index 13432effc1..7865691eca 100644
--- a/sysdeps/sparc/sparc64/multiarch/memset-niagara1.S
+++ b/sysdeps/sparc/sparc64/multiarch/memset-niagara1.S
@@ -45,9 +45,6 @@ ENTRY(__memset_niagara1)
 	sllx		%o2, 32, %g1
 	ba,pt		%XCC, 1f
 	 or		%g1, %o2, %o2
-END(__memset_niagara1)
-
-ENTRY(__bzero_niagara1)
 	clr		%o2
 1:
 # ifndef USE_BRP
@@ -171,6 +168,6 @@ ENTRY(__bzero_niagara1)
 90:
 	retl
 	 mov		%o3, %o0
-END(__bzero_niagara1)
+END(__memset_niagara1)
 
 #endif
diff --git a/sysdeps/sparc/sparc64/multiarch/memset-niagara4.S b/sysdeps/sparc/sparc64/multiarch/memset-niagara4.S
index 1ccf24e516..d6fbd83009 100644
--- a/sysdeps/sparc/sparc64/multiarch/memset-niagara4.S
+++ b/sysdeps/sparc/sparc64/multiarch/memset-niagara4.S
@@ -39,10 +39,6 @@ ENTRY(__memset_niagara4)
 	sllx		%o2, 32, %g1
 	ba,pt		%icc, 1f
 	 or		%g1, %o2, %o4
-END(__memset_niagara4)
-
-	.align		32
-ENTRY(__bzero_niagara4)
 	clr		%o4
 1:	cmp		%o1, 16
 	ble		%icc, .Ltiny
@@ -118,6 +114,6 @@ ENTRY(__bzero_niagara4)
 	bne,pt		%icc, 1b
 	 add		%o0, 0x30, %o0
 	ba,a,pt		%icc, .Lpostloop
-END(__bzero_niagara4)
+END(__memset_niagara4)
 
 #endif
diff --git a/sysdeps/sparc/sparc64/multiarch/memset-niagara7.S b/sysdeps/sparc/sparc64/multiarch/memset-niagara7.S
index 491b203ff9..6fcbf56675 100644
--- a/sysdeps/sparc/sparc64/multiarch/memset-niagara7.S
+++ b/sysdeps/sparc/sparc64/multiarch/memset-niagara7.S
@@ -99,13 +99,6 @@
 	.text
 	.align		32
 
-ENTRY(__bzero_niagara7)
-	/* bzero (dst, size)  */
-	mov	%o1, %o2
-	mov	0, %o1
-	/* fall through into memset code */
-END(__bzero_niagara7)
-
 ENTRY(__memset_niagara7)
 	/* memset (src, c, size)  */
 	mov	%o0, %o5		/* copy sp1 before using it  */
diff --git a/sysdeps/sparc/sparc64/multiarch/memset-ultra1.S b/sysdeps/sparc/sparc64/multiarch/memset-ultra1.S
index e0d3424307..3c3add791e 100644
--- a/sysdeps/sparc/sparc64/multiarch/memset-ultra1.S
+++ b/sysdeps/sparc/sparc64/multiarch/memset-ultra1.S
@@ -25,6 +25,5 @@
 # define weak_alias(x, y)
 
 # define memset  __memset_ultra1
-# define __bzero __bzero_ultra1
 # include <sysdeps/sparc/sparc64/memset.S>
 #endif
-- 
2.32.0


* [PATCH 08/12] powerpc: Remove powerpc32 bzero optimizations
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
                   ` (6 preceding siblings ...)
  2022-02-10 19:58 ` [PATCH 07/12] " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 09/12] powerpc: Remove powerpc64 " Adhemerval Zanella
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call.
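
For readers unfamiliar with the multiarch selector being removed, the
following is a rough sketch of the dispatch pattern, written as plain C
rather than with glibc's libc_ifunc machinery.  The DEMO_* bits, the
zero_* variants, and select_zero are all hypothetical names; the real
selector tests PPC_FEATURE_HAS_VSX and PPC_FEATURE_ARCH_2_05 in hwcap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder capability bits, NOT the real PPC_FEATURE_* values.  */
#define DEMO_HAS_VSX   (1UL << 0)
#define DEMO_ARCH_2_05 (1UL << 1)

typedef void (*zero_fn) (void *, size_t);

/* Each variant stands in for a CPU-specific routine; in this sketch
   they all simply forward to memset.  */
static void zero_generic (void *s, size_t n) { memset (s, 0, n); }
static void zero_power6 (void *s, size_t n) { memset (s, 0, n); }
static void zero_power7 (void *s, size_t n) { memset (s, 0, n); }

/* Pick the most specific variant the capability mask allows, checking
   the newest feature first and falling back to the generic routine.  */
static zero_fn
select_zero (unsigned long hwcap)
{
  if (hwcap & DEMO_HAS_VSX)
    return zero_power7;
  if (hwcap & DEMO_ARCH_2_05)
    return zero_power6;
  return zero_generic;
}

/* Resolve the variant for hwcap and use it to zero n bytes at s.  */
static void
zero_with_dispatch (unsigned long hwcap, void *s, size_t n)
{
  zero_fn fn = select_zero (hwcap);
  assert (fn != NULL);
  fn (s, n);
}
```

With bzero gone, only the memset selection of this kind remains to be
maintained.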
---
 sysdeps/powerpc/powerpc32/bzero.S             | 27 --------------
 .../powerpc32/power4/multiarch/Makefile       |  4 +-
 .../powerpc32/power4/multiarch/bzero-power6.S | 25 -------------
 .../powerpc32/power4/multiarch/bzero-power7.S | 25 -------------
 .../powerpc32/power4/multiarch/bzero-ppc32.S  | 34 -----------------
 .../powerpc32/power4/multiarch/bzero.c        | 37 -------------------
 .../power4/multiarch/ifunc-impl-list.c        |  8 ----
 7 files changed, 2 insertions(+), 158 deletions(-)
 delete mode 100644 sysdeps/powerpc/powerpc32/bzero.S
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S
 delete mode 100644 sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c

diff --git a/sysdeps/powerpc/powerpc32/bzero.S b/sysdeps/powerpc/powerpc32/bzero.S
deleted file mode 100644
index 9cc03c92df..0000000000
--- a/sysdeps/powerpc/powerpc32/bzero.S
+++ /dev/null
@@ -1,27 +0,0 @@
-/* Optimized bzero `implementation' for PowerPC.
-   Copyright (C) 1997-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-
-ENTRY (__bzero)
-
-	mr	r5,r4
-	li	r4,0
-	b	memset@local
-END (__bzero)
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/Makefile b/sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
index 5c68f07d19..b2f9deefb8 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
@@ -1,8 +1,8 @@
 ifeq ($(subdir),string)
 sysdep_routines += memcpy-power7 memcpy-a2 memcpy-power6 memcpy-cell \
 		   memcpy-ppc32 memcmp-power7 memcmp-ppc32 memset-power7 \
-		   memset-power6 memset-ppc32 bzero-power7 bzero-power6 \
-		   bzero-ppc32 mempcpy-power7 mempcpy-ppc32 memchr-power7 \
+		   memset-power6 memset-ppc32 \
+		   mempcpy-power7 mempcpy-ppc32 memchr-power7 \
 		   memchr-ppc32 memrchr-power7 memrchr-ppc32 rawmemchr-power7 \
 		   rawmemchr-ppc32 strlen-power7 strlen-ppc32 strnlen-power7 \
 		   strnlen-ppc32 strncmp-power7 strncmp-ppc32 \
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S b/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S
deleted file mode 100644
index b352433283..0000000000
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power6.S
+++ /dev/null
@@ -1,25 +0,0 @@
-/* Optimized bzero implementation for PowerPC32/POWER6.
-   Copyright (C) 2010-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-
-ENTRY (__bzero_power6)
-        mr      r5,r4
-        li      r4,0
-        b       __memset_power6@local
-END (__bzero_power6)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S b/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S
deleted file mode 100644
index 80c8ffe55a..0000000000
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-power7.S
+++ /dev/null
@@ -1,25 +0,0 @@
-/* Optimized bzero implementation for PowerPC32/POWER7.
-   Copyright (C) 2010-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-
-ENTRY (__bzero_power7)
-        mr      r5,r4
-        li      r4,0
-        b       __memset_power7@local
-END (__bzero_power7)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S b/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S
deleted file mode 100644
index 86711e8e22..0000000000
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero-ppc32.S
+++ /dev/null
@@ -1,34 +0,0 @@
-/* Optimized bzero implementation for PowerPC32/PPC32.
-   Copyright (C) 2010-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-
-/* memset ifunc selector is not built for static and memset@local
-   for shared builds makes the linker point the call to the ifunc
-   selector.  */
-#ifdef SHARED
-# define MEMSET __memset_ppc
-#else
-# define MEMSET memset
-#endif
-
-ENTRY (__bzero_ppc)
-        mr      r5,r4
-        li      r4,0
-        b       MEMSET@local
-END (__bzero_ppc)
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c b/sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c
deleted file mode 100644
index 5d9270289f..0000000000
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c
+++ /dev/null
@@ -1,37 +0,0 @@
-/* Multiple versions of bzero.
-   Copyright (C) 2013-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-/* Define multiple versions only for definition in libc.  */
-#if IS_IN (libc)
-# include <string.h>
-# include <strings.h>
-# include "init-arch.h"
-
-extern __typeof (bzero) __bzero_ppc attribute_hidden;
-extern __typeof (bzero) __bzero_power6 attribute_hidden;
-extern __typeof (bzero) __bzero_power7 attribute_hidden;
-
-libc_ifunc (__bzero,
-            (hwcap & PPC_FEATURE_HAS_VSX)
-            ? __bzero_power7 :
-	      (hwcap & PPC_FEATURE_ARCH_2_05)
-		? __bzero_power6
-            : __bzero_ppc);
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
index 9832f366bb..01890367a4 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c
@@ -73,14 +73,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 			      __memset_power6)
 	      IFUNC_IMPL_ADD (array, i, memset, 1, __memset_ppc))
 
-  /* Support sysdeps/powerpc/powerpc32/power4/multiarch/bzero.c.  */
-  IFUNC_IMPL (i, name, bzero,
-	      IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_HAS_VSX,
-			      __bzero_power7)
-	      IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_ARCH_2_05,
-			      __bzero_power6)
-	      IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ppc))
-
   /* Support sysdeps/powerpc/powerpc32/power4/multiarch/strlen.c.  */
   IFUNC_IMPL (i, name, strlen,
 	      IFUNC_IMPL_ADD (array, i, strlen, hwcap & PPC_FEATURE_HAS_VSX,
-- 
2.32.0


* [PATCH 09/12] powerpc: Remove powerpc64 bzero optimizations
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
                   ` (7 preceding siblings ...)
  2022-02-10 19:58 ` [PATCH 08/12] powerpc: Remove powerpc32 bzero optimizations Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 10/12] s390: Remove " Adhemerval Zanella
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call in its place.
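
As a rough illustration of why the dedicated entry points are redundant,
the sketch below checks that bzero (p, n) leaves a buffer in the same
state as memset (p, 0, n).  The helper name bzero_matches_memset and the
256-byte buffer size are assumptions made only for this example; they
are not part of glibc or of this patch.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper, not part of glibc: fill two equal-sized buffers
   with a nonzero pattern, clear the first N bytes of one with bzero and
   of the other with memset, and report whether the two buffers end up
   byte-for-byte identical.  Returns 0 if N exceeds the buffer size.  */
int
bzero_matches_memset (size_t n)
{
  unsigned char a[256], b[256];

  if (n > sizeof a)
    return 0;

  memset (a, 0xAA, sizeof a);
  memset (b, 0xAA, sizeof b);

  bzero (a, n);		/* Legacy interface.  */
  memset (b, 0, n);	/* Standard replacement.  */

  return memcmp (a, b, sizeof a) == 0;
}
```

Calling bzero_matches_memset with any length up to 256 should return
nonzero.  With a default GCC configuration the bzero call above is
itself likely to be compiled as a memset call, which is the behaviour
this series relies on.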
---
 sysdeps/powerpc/powerpc64/bzero.S             | 20 -------
 sysdeps/powerpc/powerpc64/le/power10/memset.S | 12 -----
 sysdeps/powerpc/powerpc64/memset.S            | 13 -----
 sysdeps/powerpc/powerpc64/multiarch/bzero.c   | 54 -------------------
 .../powerpc64/multiarch/ifunc-impl-list.c     | 21 --------
 .../powerpc64/multiarch/memset-power10.S      |  3 --
 .../powerpc64/multiarch/memset-power4.S       |  3 --
 .../powerpc64/multiarch/memset-power6.S       |  3 --
 .../powerpc64/multiarch/memset-power7.S       |  2 -
 .../powerpc64/multiarch/memset-power8.S       |  3 --
 .../powerpc64/multiarch/memset-ppc64.S        | 16 +-----
 sysdeps/powerpc/powerpc64/power4/memset.S     | 12 -----
 sysdeps/powerpc/powerpc64/power6/memset.S     | 12 -----
 sysdeps/powerpc/powerpc64/power7/memset.S     | 12 -----
 sysdeps/powerpc/powerpc64/power8/memset.S     | 12 -----
 15 files changed, 1 insertion(+), 197 deletions(-)
 delete mode 100644 sysdeps/powerpc/powerpc64/bzero.S
 delete mode 100644 sysdeps/powerpc/powerpc64/multiarch/bzero.c

diff --git a/sysdeps/powerpc/powerpc64/bzero.S b/sysdeps/powerpc/powerpc64/bzero.S
deleted file mode 100644
index a7ca73cc39..0000000000
--- a/sysdeps/powerpc/powerpc64/bzero.S
+++ /dev/null
@@ -1,20 +0,0 @@
-/* Optimized bzero `implementation' for PowerPC64.
-   Copyright (C) 1997-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-/* This code was moved into memset.S to solve a double stub call problem.
-   @local would have worked but it is not supported in PowerPC64 asm.  */
diff --git a/sysdeps/powerpc/powerpc64/le/power10/memset.S b/sysdeps/powerpc/powerpc64/le/power10/memset.S
index bee6d8b31b..0f43b002bf 100644
--- a/sysdeps/powerpc/powerpc64/le/power10/memset.S
+++ b/sysdeps/powerpc/powerpc64/le/power10/memset.S
@@ -242,15 +242,3 @@ L(bcdz_tail):
 
 END_GEN_TB (MEMSET,TB_TOCLESS)
 libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
-   between bzero and memset.  */
-ENTRY_TOCLESS (__bzero)
-	CALL_MCOUNT 2
-	mr	r5,r4
-	li	r4,0
-	b	L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/memset.S b/sysdeps/powerpc/powerpc64/memset.S
index 34ee8ffca4..b813cd3c6b 100644
--- a/sysdeps/powerpc/powerpc64/memset.S
+++ b/sysdeps/powerpc/powerpc64/memset.S
@@ -253,16 +253,3 @@ L(medium_28t):
 	blr
 END_GEN_TB (MEMSET,TB_TOCLESS)
 libc_hidden_builtin_def (memset)
-
-#ifndef NO_BZERO_IMPL
-/* Copied from bzero.S to prevent the linker from inserting a stub
-   between bzero and memset.  */
-ENTRY (__bzero)
-	CALL_MCOUNT 3
-	mr	r5,r4
-	li	r4,0
-	b	L(_memset)
-END_GEN_TB (__bzero,TB_TOCLESS)
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/bzero.c b/sysdeps/powerpc/powerpc64/multiarch/bzero.c
deleted file mode 100644
index f83d6da55b..0000000000
--- a/sysdeps/powerpc/powerpc64/multiarch/bzero.c
+++ /dev/null
@@ -1,54 +0,0 @@
-/* Multiple versions of bzero. PowerPC64 version.
-   Copyright (C) 2013-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-/* Define multiple versions only for definition in libc.  */
-#if IS_IN (libc)
-# include <string.h>
-# include <strings.h>
-# include "init-arch.h"
-
-extern __typeof (bzero) __bzero_ppc attribute_hidden;
-extern __typeof (bzero) __bzero_power4 attribute_hidden;
-extern __typeof (bzero) __bzero_power6 attribute_hidden;
-extern __typeof (bzero) __bzero_power7 attribute_hidden;
-extern __typeof (bzero) __bzero_power8 attribute_hidden;
-# ifdef __LITTLE_ENDIAN__
-extern __typeof (bzero) __bzero_power10 attribute_hidden;
-# endif
-
-libc_ifunc (__bzero,
-# ifdef __LITTLE_ENDIAN__
-	    (hwcap2 & PPC_FEATURE2_ARCH_3_1
-	     && hwcap2 & PPC_FEATURE2_HAS_ISEL
-	     && hwcap & PPC_FEATURE_HAS_VSX)
-	    ? __bzero_power10 :
-# endif
-	    (hwcap2 & PPC_FEATURE2_ARCH_2_07
-	     && hwcap & PPC_FEATURE_HAS_ALTIVEC)
-            ? __bzero_power8 :
-	      (hwcap & PPC_FEATURE_HAS_VSX)
-	      ? __bzero_power7 :
-		(hwcap & PPC_FEATURE_ARCH_2_05
-		 && hwcap & PPC_FEATURE_HAS_ALTIVEC)
-		? __bzero_power6 :
-		  (hwcap & PPC_FEATURE_POWER4)
-		  ? __bzero_power4
-            : __bzero_ppc);
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
index 280b8616b2..ac533a9886 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
@@ -223,27 +223,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 			      __memcmp_power4)
 	      IFUNC_IMPL_ADD (array, i, memcmp, 1, __memcmp_ppc))
 
-  /* Support sysdeps/powerpc/powerpc64/multiarch/bzero.c.  */
-  IFUNC_IMPL (i, name, bzero,
-#ifdef __LITTLE_ENDIAN__
-	      IFUNC_IMPL_ADD (array, i, bzero,
-			      hwcap2 & PPC_FEATURE2_ARCH_3_1
-			      && hwcap2 & PPC_FEATURE2_HAS_ISEL
-			      && hwcap & PPC_FEATURE_HAS_VSX,
-			      __bzero_power10)
-#endif
-	      IFUNC_IMPL_ADD (array, i, bzero, hwcap2 & PPC_FEATURE2_ARCH_2_07
-			      && hwcap & PPC_FEATURE_HAS_ALTIVEC,
-			      __bzero_power8)
-	      IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_HAS_VSX,
-			      __bzero_power7)
-	      IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_ARCH_2_05
-			      && hwcap & PPC_FEATURE_HAS_ALTIVEC,
-			      __bzero_power6)
-	      IFUNC_IMPL_ADD (array, i, bzero, hwcap & PPC_FEATURE_POWER4,
-			      __bzero_power4)
-	      IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ppc))
-
   /* Support sysdeps/powerpc/powerpc64/multiarch/mempcpy.c.  */
   IFUNC_IMPL (i, name, mempcpy,
 	      IFUNC_IMPL_ADD (array, i, mempcpy,
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S
index ead0b67926..ba5bee1c7a 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power10.S
@@ -21,7 +21,4 @@
 #undef libc_hidden_builtin_def
 #define libc_hidden_builtin_def(name)
 
-#undef __bzero
-#define __bzero __bzero_power10
-
 #include <sysdeps/powerpc/powerpc64/le/power10/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power4.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power4.S
index 6f5631d03d..4ee567c6f9 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power4.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power4.S
@@ -21,7 +21,4 @@
 #undef libc_hidden_builtin_def
 #define libc_hidden_builtin_def(name)
 
-#undef __bzero
-#define __bzero __bzero_power4
-
 #include <sysdeps/powerpc/powerpc64/power4/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power6.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power6.S
index b81f4f0d64..9f5e7d1b37 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power6.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power6.S
@@ -21,7 +21,4 @@
 #undef libc_hidden_builtin_def
 #define libc_hidden_builtin_def(name)
 
-#undef __bzero
-#define __bzero __bzero_power6
-
 #include <sysdeps/powerpc/powerpc64/power6/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power7.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power7.S
index a8ca12db83..6fd92d5afc 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power7.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power7.S
@@ -21,6 +21,4 @@
 #undef libc_hidden_builtin_def
 #define libc_hidden_builtin_def(name)
 
-#undef __bzero
-#define __bzero __bzero_power7
 #include <sysdeps/powerpc/powerpc64/power7/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S b/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S
index b06587aa2d..43cc5c7339 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-power8.S
@@ -21,7 +21,4 @@
 #undef libc_hidden_builtin_def
 #define libc_hidden_builtin_def(name)
 
-#undef __bzero
-#define __bzero __bzero_power8
-
 #include <sysdeps/powerpc/powerpc64/power8/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S b/sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S
index 876954d36b..30b25ef15f 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S
+++ b/sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S
@@ -1,4 +1,4 @@
-/* Default memset/bzero implementation for PowerPC64.
+/* Default memset implementation for PowerPC64.
    Copyright (C) 2013-2022 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -18,17 +18,6 @@
 
 #include <sysdep.h>
 
-/* Copied from bzero.S to prevent the linker from inserting a stub
-   between bzero and memset.  NOTE: this code should be positioned
-   before ENTRY/END_GEN_TB redefinition.  */
-ENTRY (__bzero_ppc)
-        CALL_MCOUNT 3
-        mr      r5,r4
-        li      r4,0
-        b       L(_memset)
-END_GEN_TB (__bzero_ppc,TB_TOCLESS)
-
-
 #if defined SHARED && IS_IN (libc)
 # define MEMSET __memset_ppc
 
@@ -36,7 +25,4 @@ END_GEN_TB (__bzero_ppc,TB_TOCLESS)
 # define libc_hidden_builtin_def(name)
 #endif
 
-/* Do not implement __bzero at powerpc64/memset.S.  */
-#define NO_BZERO_IMPL
-
 #include <sysdeps/powerpc/powerpc64/memset.S>
diff --git a/sysdeps/powerpc/powerpc64/power4/memset.S b/sysdeps/powerpc/powerpc64/power4/memset.S
index dfc136261b..0f14a5198a 100644
--- a/sysdeps/powerpc/powerpc64/power4/memset.S
+++ b/sysdeps/powerpc/powerpc64/power4/memset.S
@@ -237,15 +237,3 @@ L(medium_28t):
 	blr
 END_GEN_TB (MEMSET,TB_TOCLESS)
 libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
-   between bzero and memset.  */
-ENTRY_TOCLESS (__bzero)
-	CALL_MCOUNT 3
-	mr	r5,r4
-	li	r4,0
-	b	L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/power6/memset.S b/sysdeps/powerpc/powerpc64/power6/memset.S
index 7ad82c38e6..140a756348 100644
--- a/sysdeps/powerpc/powerpc64/power6/memset.S
+++ b/sysdeps/powerpc/powerpc64/power6/memset.S
@@ -381,15 +381,3 @@ L(medium_28t):
 	blr
 END_GEN_TB (MEMSET,TB_TOCLESS)
 libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
-   between bzero and memset.  */
-ENTRY_TOCLESS (__bzero)
-	CALL_MCOUNT 3
-	mr	r5,r4
-	li	r4,0
-	b	L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/power7/memset.S b/sysdeps/powerpc/powerpc64/power7/memset.S
index 31aa0f91cf..358199a805 100644
--- a/sysdeps/powerpc/powerpc64/power7/memset.S
+++ b/sysdeps/powerpc/powerpc64/power7/memset.S
@@ -384,15 +384,3 @@ L(small):
 
 END_GEN_TB (MEMSET,TB_TOCLESS)
 libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
-   between bzero and memset.  */
-ENTRY_TOCLESS (__bzero)
-	CALL_MCOUNT 3
-	mr	r5,r4
-	li	r4,0
-	b	L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/power8/memset.S b/sysdeps/powerpc/powerpc64/power8/memset.S
index 9ecb6f3067..70cace14ef 100644
--- a/sysdeps/powerpc/powerpc64/power8/memset.S
+++ b/sysdeps/powerpc/powerpc64/power8/memset.S
@@ -504,15 +504,3 @@ L(LE7_tail5):
 
 END_GEN_TB (MEMSET,TB_TOCLESS)
 libc_hidden_builtin_def (memset)
-
-/* Copied from bzero.S to prevent the linker from inserting a stub
-   between bzero and memset.  */
-ENTRY_TOCLESS (__bzero)
-	CALL_MCOUNT 3
-	mr	r5,r4
-	li	r4,0
-	b	L(_memset)
-END (__bzero)
-#ifndef __bzero
-weak_alias (__bzero, bzero)
-#endif
-- 
2.32.0



* [PATCH 10/12] s390: Remove bzero optimizations
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
                   ` (8 preceding siblings ...)
  2022-02-10 19:58 ` [PATCH 09/12] powerpc: Remove powerpc64 " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 11/12] i686: " Adhemerval Zanella
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call in its place.
---
 sysdeps/s390/Makefile                    |  2 +-
 sysdeps/s390/bzero.c                     | 47 ------------------------
 sysdeps/s390/ifunc-memset.h              |  9 -----
 sysdeps/s390/memset-z900.S               | 32 +---------------
 sysdeps/s390/multiarch/ifunc-impl-list.c | 15 --------
 5 files changed, 2 insertions(+), 103 deletions(-)
 delete mode 100644 sysdeps/s390/bzero.c

diff --git a/sysdeps/s390/Makefile b/sysdeps/s390/Makefile
index ade8663218..5b6a96579c 100644
--- a/sysdeps/s390/Makefile
+++ b/sysdeps/s390/Makefile
@@ -66,7 +66,7 @@ endif
 endif
 
 ifeq ($(subdir),string)
-sysdep_routines += bzero memset memset-z900 \
+sysdep_routines += memset memset-z900 \
 		   memcmp memcmp-z900 \
 		   mempcpy memcpy memcpy-z900 \
 		   memmove memmove-c \
diff --git a/sysdeps/s390/bzero.c b/sysdeps/s390/bzero.c
deleted file mode 100644
index 1f0a03e2ed..0000000000
--- a/sysdeps/s390/bzero.c
+++ /dev/null
@@ -1,47 +0,0 @@
-/* Multiple versions of bzero.
-   Copyright (C) 2018-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <ifunc-memset.h>
-#if HAVE_MEMSET_IFUNC
-# include <string.h>
-# include <ifunc-resolve.h>
-
-# if HAVE_MEMSET_Z900_G5
-extern __typeof (__bzero) BZERO_Z900_G5 attribute_hidden;
-# endif
-
-# if HAVE_MEMSET_Z10
-extern __typeof (__bzero) BZERO_Z10 attribute_hidden;
-# endif
-
-# if HAVE_MEMSET_Z196
-extern __typeof (__bzero) BZERO_Z196 attribute_hidden;
-# endif
-
-s390_libc_ifunc_expr (__bzero, __bzero,
-		      ({
-			s390_libc_ifunc_expr_stfle_init ();
-			(HAVE_MEMSET_Z196 && S390_IS_Z196 (stfle_bits))
-			  ? BZERO_Z196
-			  : (HAVE_MEMSET_Z10 && S390_IS_Z10 (stfle_bits))
-			  ? BZERO_Z10
-			  : BZERO_DEFAULT;
-		      })
-		      )
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/s390/ifunc-memset.h b/sysdeps/s390/ifunc-memset.h
index db15df9bc1..7098332e92 100644
--- a/sysdeps/s390/ifunc-memset.h
+++ b/sysdeps/s390/ifunc-memset.h
@@ -25,19 +25,16 @@
 
 #if defined HAVE_S390_MIN_Z196_ZARCH_ASM_SUPPORT
 # define MEMSET_DEFAULT		MEMSET_Z196
-# define BZERO_DEFAULT		BZERO_Z196
 # define HAVE_MEMSET_Z900_G5	0
 # define HAVE_MEMSET_Z10	0
 # define HAVE_MEMSET_Z196	1
 #elif defined HAVE_S390_MIN_Z10_ZARCH_ASM_SUPPORT
 # define MEMSET_DEFAULT		MEMSET_Z10
-# define BZERO_DEFAULT		BZERO_Z10
 # define HAVE_MEMSET_Z900_G5	0
 # define HAVE_MEMSET_Z10	1
 # define HAVE_MEMSET_Z196	HAVE_MEMSET_IFUNC
 #else
 # define MEMSET_DEFAULT		MEMSET_Z900_G5
-# define BZERO_DEFAULT		BZERO_Z900_G5
 # define HAVE_MEMSET_Z900_G5	1
 # define HAVE_MEMSET_Z10	HAVE_MEMSET_IFUNC
 # define HAVE_MEMSET_Z196	HAVE_MEMSET_IFUNC
@@ -51,24 +48,18 @@
 
 #if HAVE_MEMSET_Z900_G5
 # define MEMSET_Z900_G5		__memset_default
-# define BZERO_Z900_G5		__bzero_default
 #else
 # define MEMSET_Z900_G5		NULL
-# define BZERO_Z900_G5		NULL
 #endif
 
 #if HAVE_MEMSET_Z10
 # define MEMSET_Z10		__memset_z10
-# define BZERO_Z10		__bzero_z10
 #else
 # define MEMSET_Z10		NULL
-# define BZERO_Z10		NULL
 #endif
 
 #if HAVE_MEMSET_Z196
 # define MEMSET_Z196		__memset_z196
-# define BZERO_Z196		__bzero_z196
 #else
 # define MEMSET_Z196		NULL
-# define BZERO_Z196		NULL
 #endif
diff --git a/sysdeps/s390/memset-z900.S b/sysdeps/s390/memset-z900.S
index d454743f75..7adb466bb1 100644
--- a/sysdeps/s390/memset-z900.S
+++ b/sysdeps/s390/memset-z900.S
@@ -24,11 +24,7 @@
 /* INPUT PARAMETERS - MEMSET
      %r2 = address of memory area
      %r3 = byte to fill memory with
-     %r4 = number of bytes to fill.
-
-   INPUT PARAMETERS - BZERO
-     %r2 = address of memory area
-     %r3 = number of bytes to fill.  */
+     %r4 = number of bytes to fill.  */
 
        .text
 
@@ -47,12 +43,6 @@
 #  define BRCTG	brct
 # endif /* ! defined __s390x__  */
 
-ENTRY(BZERO_Z900_G5)
-	LGR	%r4,%r3
-	xr	%r3,%r3
-	j	.L_Z900_G5_start
-END(BZERO_Z900_G5)
-
 ENTRY(MEMSET_Z900_G5)
 .L_Z900_G5_start:
 #if defined __s390x__
@@ -100,14 +90,6 @@ END(MEMSET_Z900_G5)
 #endif /*  HAVE_MEMSET_Z900_G5  */
 
 #if HAVE_MEMSET_Z10
-ENTRY(BZERO_Z10)
-	.machine "z10"
-	.machinemode "zarch_nohighgprs"
-	lgr	%r4,%r3
-	xr	%r3,%r3
-	j	.L_Z10_start
-END(BZERO_Z10)
-
 ENTRY(MEMSET_Z10)
 .L_Z10_start:
 	.machine "z10"
@@ -141,14 +123,6 @@ END(MEMSET_Z10)
 #endif /* HAVE_MEMSET_Z10  */
 
 #if HAVE_MEMSET_Z196
-ENTRY(BZERO_Z196)
-	.machine "z196"
-	.machinemode "zarch_nohighgprs"
-	lgr	%r4,%r3
-	xr	%r3,%r3
-	j	.L_Z196_start
-END(BZERO_Z196)
-
 ENTRY(MEMSET_Z196)
 .L_Z196_start:
 	.machine "z196"
@@ -204,10 +178,6 @@ END(__memset_mvcle)
 /* If we don't use ifunc, define an alias for memset here.
    Otherwise see sysdeps/s390/memset.c.  */
 strong_alias (MEMSET_DEFAULT, memset)
-/* Same for bzero.  If ifunc is used, see
-   sysdeps/s390/bzero.c.  */
-strong_alias (BZERO_DEFAULT, __bzero)
-weak_alias (__bzero, bzero)
 #endif
 
 #if defined SHARED && IS_IN (libc)
diff --git a/sysdeps/s390/multiarch/ifunc-impl-list.c b/sysdeps/s390/multiarch/ifunc-impl-list.c
index 29598c2a6e..c1902b2c26 100644
--- a/sysdeps/s390/multiarch/ifunc-impl-list.c
+++ b/sysdeps/s390/multiarch/ifunc-impl-list.c
@@ -102,21 +102,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 # endif
 # if HAVE_MEMSET_Z900_G5
 	      IFUNC_IMPL_ADD (array, i, memset, 1, MEMSET_Z900_G5)
-# endif
-	      )
-
-  /* Note: bzero is implemented in memset.  */
-  IFUNC_IMPL (i, name, bzero,
-# if HAVE_MEMSET_Z196
-	      IFUNC_IMPL_ADD (array, i, bzero,
-			      S390_IS_Z196 (stfle_bits), BZERO_Z196)
-# endif
-# if HAVE_MEMSET_Z10
-	      IFUNC_IMPL_ADD (array, i, bzero,
-			      S390_IS_Z10 (stfle_bits), BZERO_Z10)
-# endif
-# if HAVE_MEMSET_Z900_G5
-	      IFUNC_IMPL_ADD (array, i, bzero, 1, BZERO_Z900_G5)
 # endif
 	      )
 #endif /* HAVE_MEMSET_IFUNC */
-- 
2.32.0



* [PATCH 11/12] i686: Remove bzero optimizations
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
                   ` (9 preceding siblings ...)
  2022-02-10 19:58 ` [PATCH 10/12] s390: Remove " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
  2022-02-10 19:58 ` [PATCH 12/12] x86_64: " Adhemerval Zanella
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

The symbol is not present in the current POSIX specification, and the
compiler already generates a memset call in its place.
---
 sysdeps/i386/bzero.S                          |  5 ---
 sysdeps/i386/i586/bzero.S                     |  4 --
 sysdeps/i386/i586/memset.S                    | 16 ++------
 sysdeps/i386/i686/bzero.S                     |  4 --
 sysdeps/i386/i686/memset.S                    | 23 +++---------
 sysdeps/i386/i686/multiarch/Makefile          |  6 +--
 sysdeps/i386/i686/multiarch/bzero-ia32.S      | 37 -------------------
 sysdeps/i386/i686/multiarch/bzero-sse2-rep.S  |  3 --
 sysdeps/i386/i686/multiarch/bzero-sse2.S      |  3 --
 sysdeps/i386/i686/multiarch/bzero.c           | 32 ----------------
 sysdeps/i386/i686/multiarch/ifunc-impl-list.c |  8 ----
 sysdeps/i386/i686/multiarch/memset-sse2-rep.S | 24 +++---------
 sysdeps/i386/i686/multiarch/memset-sse2.S     | 24 +++---------
 sysdeps/i386/memset.S                         | 14 +------
 14 files changed, 22 insertions(+), 181 deletions(-)
 delete mode 100644 sysdeps/i386/bzero.S
 delete mode 100644 sysdeps/i386/i586/bzero.S
 delete mode 100644 sysdeps/i386/i686/bzero.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bzero-ia32.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bzero-sse2-rep.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bzero-sse2.S
 delete mode 100644 sysdeps/i386/i686/multiarch/bzero.c

diff --git a/sysdeps/i386/bzero.S b/sysdeps/i386/bzero.S
deleted file mode 100644
index c8dd47b4da..0000000000
--- a/sysdeps/i386/bzero.S
+++ /dev/null
@@ -1,5 +0,0 @@
-#define USE_AS_BZERO
-#define memset __bzero
-#include "memset.S"
-
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/i386/i586/bzero.S b/sysdeps/i386/i586/bzero.S
deleted file mode 100644
index 2a106719a4..0000000000
--- a/sysdeps/i386/i586/bzero.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_BZERO
-#define memset __bzero
-#include <sysdeps/i386/i586/memset.S>
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/i386/i586/memset.S b/sysdeps/i386/i586/memset.S
index ae09c3b40a..672af41398 100644
--- a/sysdeps/i386/i586/memset.S
+++ b/sysdeps/i386/i586/memset.S
@@ -23,15 +23,11 @@
 #define PARMS	4+4	/* space for 1 saved reg */
 #define RTN	PARMS
 #define DEST	RTN
-#ifdef USE_AS_BZERO
-# define LEN	DEST+4
-#else
-# define CHR	DEST+4
-# define LEN	CHR+4
-#endif
+#define CHR	DEST+4
+#define LEN	CHR+4
 
         .text
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
 ENTRY (__memset_chk)
 	movl	12(%esp), %eax
 	cmpl	%eax, 16(%esp)
@@ -46,15 +42,11 @@ ENTRY (memset)
 	movl	DEST(%esp), %edi
 	cfi_rel_offset (edi, 0)
 	movl	LEN(%esp), %edx
-#ifdef USE_AS_BZERO
-	xorl	%eax, %eax	/* we fill with 0 */
-#else
 	movb	CHR(%esp), %al
 	movb	%al, %ah
 	movl	%eax, %ecx
 	shll	$16, %eax
 	movw	%cx, %ax
-#endif
 	cld
 
 /* If less than 36 bytes to write, skip tricky code (it wouldn't work).  */
@@ -100,10 +92,8 @@ L(2):	shrl	$2, %ecx	/* convert byte count to longword count */
 	rep
 	stosb
 
-#ifndef USE_AS_BZERO
 	/* Load result (only if used as memset).  */
 	movl DEST(%esp), %eax	/* start address of destination is result */
-#endif
 	popl	%edi
 	cfi_adjust_cfa_offset (-4)
 	cfi_restore (edi)
diff --git a/sysdeps/i386/i686/bzero.S b/sysdeps/i386/i686/bzero.S
deleted file mode 100644
index c7898f18e0..0000000000
--- a/sysdeps/i386/i686/bzero.S
+++ /dev/null
@@ -1,4 +0,0 @@
-#define USE_AS_BZERO
-#define memset __bzero
-#include <sysdeps/i386/i686/memset.S>
-weak_alias (__bzero, bzero)
diff --git a/sysdeps/i386/i686/memset.S b/sysdeps/i386/i686/memset.S
index fd5b26aeae..3cb86c016d 100644
--- a/sysdeps/i386/i686/memset.S
+++ b/sysdeps/i386/i686/memset.S
@@ -21,18 +21,13 @@
 #include "asm-syntax.h"
 
 #define PARMS	4+4	/* space for 1 saved reg */
-#ifdef USE_AS_BZERO
-# define DEST	PARMS
-# define LEN	DEST+4
-#else
-# define RTN	PARMS
-# define DEST	RTN
-# define CHR	DEST+4
-# define LEN	CHR+4
-#endif
+#define RTN	PARMS
+#define DEST	RTN
+#define CHR	DEST+4
+#define LEN	CHR+4
 
         .text
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
 ENTRY_CHK (__memset_chk)
 	movl	12(%esp), %eax
 	cmpl	%eax, 16(%esp)
@@ -46,11 +41,7 @@ ENTRY (memset)
 	cfi_adjust_cfa_offset (4)
 	movl	DEST(%esp), %edx
 	movl	LEN(%esp), %ecx
-#ifdef USE_AS_BZERO
-	xorl	%eax, %eax	/* fill with 0 */
-#else
 	movzbl	CHR(%esp), %eax
-#endif
 	jecxz	1f
 	movl	%edx, %edi
 	cfi_rel_offset (edi, 0)
@@ -70,9 +61,7 @@ ENTRY (memset)
 2:	movl	%ecx, %edx
 	shrl	$2, %ecx
 	andl	$3, %edx
-#ifndef USE_AS_BZERO
 	imul	$0x01010101, %eax
-#endif
 	rep
 	stosl
 	movl	%edx, %ecx
@@ -80,9 +69,7 @@ ENTRY (memset)
 	stosb
 
 1:
-#ifndef USE_AS_BZERO
 	movl DEST(%esp), %eax	/* start address of destination is result */
-#endif
 	popl	%edi
 	cfi_adjust_cfa_offset (-4)
 	cfi_restore (edi)
diff --git a/sysdeps/i386/i686/multiarch/Makefile b/sysdeps/i386/i686/multiarch/Makefile
index 02fa02658e..9fe5ea8639 100644
--- a/sysdeps/i386/i686/multiarch/Makefile
+++ b/sysdeps/i386/i686/multiarch/Makefile
@@ -1,9 +1,9 @@
 ifeq ($(subdir),string)
 gen-as-const-headers += locale-defines.sym
-sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
+sysdep_routines += memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
 		   memmove-ssse3 memcpy-ssse3-rep mempcpy-ssse3-rep \
 		   memmove-ssse3-rep \
-		   memset-sse2-rep bzero-sse2-rep strcmp-ssse3 \
+		   memset-sse2-rep strcmp-ssse3 \
 		   strcmp-sse4 strncmp-c strncmp-ssse3 strncmp-sse4 \
 		   memcmp-ssse3 memcmp-sse4 varshift \
 		   strlen-sse2 strlen-sse2-bsf strncpy-c strcpy-ssse3 \
@@ -21,7 +21,7 @@ sysdep_routines += bzero-sse2 memset-sse2 memcpy-ssse3 mempcpy-ssse3 \
 		   memcpy-sse2-unaligned \
 		   mempcpy-sse2-unaligned memmove-sse2-unaligned \
 		   strcspn-c strpbrk-c strspn-c \
-		   bzero-ia32 rawmemchr-ia32 \
+		   rawmemchr-ia32 \
 		   memchr-ia32 memcmp-ia32 memcpy-ia32 memmove-ia32 \
 		   mempcpy-ia32 memset-ia32 strcat-ia32 strchr-ia32 \
 		   strrchr-ia32 strcpy-ia32 strcmp-ia32 strcspn-ia32 \
diff --git a/sysdeps/i386/i686/multiarch/bzero-ia32.S b/sysdeps/i386/i686/multiarch/bzero-ia32.S
deleted file mode 100644
index 96afe9bad1..0000000000
--- a/sysdeps/i386/i686/multiarch/bzero-ia32.S
+++ /dev/null
@@ -1,37 +0,0 @@
-/* bzero optimized for i686.
-   Copyright (C) 2017-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <sysdep.h>
-
-#if IS_IN (libc)
-# define __bzero __bzero_ia32
-
-# ifdef SHARED
-#  undef libc_hidden_builtin_def
-/* IFUNC doesn't work with the hidden functions in shared library since
-   they will be called without setting up EBX needed for PLT which is
-   used by IFUNC.  */
-#  define libc_hidden_builtin_def(name) \
-	.globl __GI___bzero; __GI___bzero = __bzero
-# endif
-
-# undef weak_alias
-# define weak_alias(original, alias)
-
-# include <sysdeps/i386/i686/bzero.S>
-#endif
diff --git a/sysdeps/i386/i686/multiarch/bzero-sse2-rep.S b/sysdeps/i386/i686/multiarch/bzero-sse2-rep.S
deleted file mode 100644
index 507b288bb3..0000000000
--- a/sysdeps/i386/i686/multiarch/bzero-sse2-rep.S
+++ /dev/null
@@ -1,3 +0,0 @@
-#define USE_AS_BZERO
-#define __memset_sse2_rep __bzero_sse2_rep
-#include "memset-sse2-rep.S"
diff --git a/sysdeps/i386/i686/multiarch/bzero-sse2.S b/sysdeps/i386/i686/multiarch/bzero-sse2.S
deleted file mode 100644
index 8d04512e4e..0000000000
--- a/sysdeps/i386/i686/multiarch/bzero-sse2.S
+++ /dev/null
@@ -1,3 +0,0 @@
-#define USE_AS_BZERO
-#define __memset_sse2 __bzero_sse2
-#include "memset-sse2.S"
diff --git a/sysdeps/i386/i686/multiarch/bzero.c b/sysdeps/i386/i686/multiarch/bzero.c
deleted file mode 100644
index 7fd0ddd576..0000000000
--- a/sysdeps/i386/i686/multiarch/bzero.c
+++ /dev/null
@@ -1,32 +0,0 @@
-/* Multiple versions of bzero.
-   All versions must be listed in ifunc-impl-list.c.
-   Copyright (C) 2017-2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-/* Define multiple versions only for the definition in libc.  */
-#if IS_IN (libc)
-# define bzero __redirect_bzero
-# include <string.h>
-# undef bzero
-
-# define SYMBOL_NAME bzero
-# include "ifunc-memset.h"
-
-libc_ifunc_redirected (__redirect_bzero, __bzero, IFUNC_SELECTOR ());
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
index 5c7a42dc97..c014f52bf9 100644
--- a/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
+++ b/sysdeps/i386/i686/multiarch/ifunc-impl-list.c
@@ -36,14 +36,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 
   size_t i = 0;
 
-  /* Support sysdeps/i386/i686/multiarch/bzero.S.  */
-  IFUNC_IMPL (i, name, bzero,
-	      IFUNC_IMPL_ADD (array, i, bzero, CPU_FEATURE_USABLE (SSE2),
-			      __bzero_sse2_rep)
-	      IFUNC_IMPL_ADD (array, i, bzero, CPU_FEATURE_USABLE (SSE2),
-			      __bzero_sse2)
-	      IFUNC_IMPL_ADD (array, i, bzero, 1, __bzero_ia32))
-
   /* Support sysdeps/i386/i686/multiarch/memchr.S.  */
   IFUNC_IMPL (i, name, memchr,
 	      IFUNC_IMPL_ADD (array, i, memchr, CPU_FEATURE_USABLE (SSE2),
diff --git a/sysdeps/i386/i686/multiarch/memset-sse2-rep.S b/sysdeps/i386/i686/multiarch/memset-sse2-rep.S
index 37a10575e7..28df7836e0 100644
--- a/sysdeps/i386/i686/multiarch/memset-sse2-rep.S
+++ b/sysdeps/i386/i686/multiarch/memset-sse2-rep.S
@@ -32,16 +32,10 @@
 #define PUSH(REG)	pushl REG; CFI_PUSH (REG)
 #define POP(REG)	popl REG; CFI_POP (REG)
 
-#ifdef USE_AS_BZERO
-# define DEST		PARMS
-# define LEN		DEST+4
-# define SETRTNVAL
-#else
-# define DEST		PARMS
-# define CHR		DEST+4
-# define LEN		CHR+4
-# define SETRTNVAL	movl DEST(%esp), %eax
-#endif
+#define DEST		PARMS
+#define CHR		DEST+4
+#define LEN		CHR+4
+#define SETRTNVAL	movl DEST(%esp), %eax
 
 #ifdef PIC
 # define ENTRANCE	PUSH (%ebx);
@@ -78,7 +72,7 @@
 #endif
 
 	.section .text.sse2,"ax",@progbits
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
 ENTRY (__memset_chk_sse2_rep)
 	movl	12(%esp), %eax
 	cmpl	%eax, 16(%esp)
@@ -89,16 +83,12 @@ ENTRY (__memset_sse2_rep)
 	ENTRANCE
 
 	movl	LEN(%esp), %ecx
-#ifdef USE_AS_BZERO
-	xor	%eax, %eax
-#else
 	movzbl	CHR(%esp), %eax
 	movb	%al, %ah
 	/* Fill the whole EAX with pattern.  */
 	movl	%eax, %edx
 	shl	$16, %eax
 	or	%edx, %eax
-#endif
 	movl	DEST(%esp), %edx
 	cmp	$32, %ecx
 	jae	L(32bytesormore)
@@ -228,12 +218,8 @@ L(write_3bytes):
 /* ECX > 32 and EDX is 4 byte aligned.  */
 L(32bytesormore):
 	/* Fill xmm0 with the pattern.  */
-#ifdef USE_AS_BZERO
-	pxor	%xmm0, %xmm0
-#else
 	movd	%eax, %xmm0
 	pshufd	$0, %xmm0, %xmm0
-#endif
 	testl	$0xf, %edx
 	jz	L(aligned_16)
 /* ECX > 32 and EDX is not 16 byte aligned.  */
diff --git a/sysdeps/i386/i686/multiarch/memset-sse2.S b/sysdeps/i386/i686/multiarch/memset-sse2.S
index 455519c7ac..4e8414fd51 100644
--- a/sysdeps/i386/i686/multiarch/memset-sse2.S
+++ b/sysdeps/i386/i686/multiarch/memset-sse2.S
@@ -32,16 +32,10 @@
 #define PUSH(REG)	pushl REG; CFI_PUSH (REG)
 #define POP(REG)	popl REG; CFI_POP (REG)
 
-#ifdef USE_AS_BZERO
-# define DEST		PARMS
-# define LEN		DEST+4
-# define SETRTNVAL
-#else
-# define DEST		PARMS
-# define CHR		DEST+4
-# define LEN		CHR+4
-# define SETRTNVAL	movl DEST(%esp), %eax
-#endif
+#define DEST		PARMS
+#define CHR		DEST+4
+#define LEN		CHR+4
+#define SETRTNVAL	movl DEST(%esp), %eax
 
 #ifdef PIC
 # define ENTRANCE	PUSH (%ebx);
@@ -78,7 +72,7 @@
 #endif
 
 	.section .text.sse2,"ax",@progbits
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
 ENTRY (__memset_chk_sse2)
 	movl	12(%esp), %eax
 	cmpl	%eax, 16(%esp)
@@ -89,16 +83,12 @@ ENTRY (__memset_sse2)
 	ENTRANCE
 
 	movl	LEN(%esp), %ecx
-#ifdef USE_AS_BZERO
-	xor	%eax, %eax
-#else
 	movzbl	CHR(%esp), %eax
 	movb	%al, %ah
 	/* Fill the whole EAX with pattern.  */
 	movl	%eax, %edx
 	shl	$16, %eax
 	or	%edx, %eax
-#endif
 	movl	DEST(%esp), %edx
 	cmp	$32, %ecx
 	jae	L(32bytesormore)
@@ -228,12 +218,8 @@ L(write_3bytes):
 /* ECX > 32 and EDX is 4 byte aligned.  */
 L(32bytesormore):
 	/* Fill xmm0 with the pattern.  */
-#ifdef USE_AS_BZERO
-	pxor	%xmm0, %xmm0
-#else
 	movd	%eax, %xmm0
 	pshufd	$0, %xmm0, %xmm0
-#endif
 	testl	$0xf, %edx
 	jz	L(aligned_16)
 /* ECX > 32 and EDX is not 16 byte aligned.  */
diff --git a/sysdeps/i386/memset.S b/sysdeps/i386/memset.S
index f470511b64..db2753eb2f 100644
--- a/sysdeps/i386/memset.S
+++ b/sysdeps/i386/memset.S
@@ -30,15 +30,11 @@
 #define POP(REG)	popl REG; CFI_POP (REG)
 
 #define STR1  8
-#ifdef USE_AS_BZERO
-#define N     STR1+4
-#else
 #define STR2  STR1+4
 #define N     STR2+4
-#endif
 
 	.text
-#if defined SHARED && IS_IN (libc) && !defined USE_AS_BZERO
+#if defined SHARED && IS_IN (libc)
 ENTRY (__memset_chk)
 	movl	12(%esp), %eax
 	cmpl	%eax, 16(%esp)
@@ -49,20 +45,12 @@ ENTRY (memset)
 	PUSH    (%edi)
 	movl	N(%esp), %ecx
 	movl	STR1(%esp), %edi
-#ifdef USE_AS_BZERO
-	xor	%eax, %eax
-#else
 	movzbl	STR2(%esp), %eax
 	mov	%edi, %edx
-#endif
 	rep	stosb
-#ifndef USE_AS_BZERO
 	mov	%edx, %eax
-#endif
 	POP     (%edi)
 	ret
 END (memset)
 
-#ifndef USE_AS_BZERO
 libc_hidden_builtin_def (memset)
-#endif
-- 
2.32.0



* [PATCH 12/12] x86_64: Remove bzero optimizations
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
                   ` (10 preceding siblings ...)
  2022-02-10 19:58 ` [PATCH 11/12] i686: " Adhemerval Zanella
@ 2022-02-10 19:58 ` Adhemerval Zanella
  2022-02-14 14:29 ` [PATCH 00/12] Remove bcopy and " Florian Weimer
  2022-02-21 16:39 ` Szabolcs Nagy
  13 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-10 19:58 UTC (permalink / raw)
  To: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

The symbol is not present in the current POSIX specification, and the
compiler already replaces bzero with a memset call.
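
For illustration only (this is not part of the patch): a caller like the
sketch below, built with GCC at default settings, is expected to end up
referencing memset or inline stores rather than bzero.  The helper name
clear_and_count_nonzero is hypothetical, and the exact lowering depends
on the compiler version and flags, so it is worth checking the resulting
object with objdump -R as in the cover letter.

```c
#define _DEFAULT_SOURCE		/* Expose the legacy bzero prototype.  */
#include <assert.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical helper, not part of glibc: zero LEN bytes at BUF through
   the legacy bzero interface and return how many bytes are still
   non-zero afterwards (expected to be 0).  */
size_t
clear_and_count_nonzero (unsigned char *buf, size_t len)
{
  assert (buf != NULL);

  /* With builtins enabled the compiler may turn this into memset.  */
  bzero (buf, len);

  size_t nonzero = 0;
  for (size_t i = 0; i < len; i++)
    nonzero += buf[i] != 0;
  return nonzero;
}
```

Calling clear_and_count_nonzero on a buffer filled with 0xff should
return 0 regardless of which symbol the compiler ends up referencing.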
---
 sysdeps/x86_64/bzero.S                        |   1 -
 sysdeps/x86_64/memset.S                       |  10 +-
 sysdeps/x86_64/multiarch/Makefile             |   1 -
 sysdeps/x86_64/multiarch/bzero.c              | 106 ------------------
 sysdeps/x86_64/multiarch/ifunc-impl-list.c    |  42 -------
 .../memset-avx2-unaligned-erms-rtm.S          |   1 -
 .../multiarch/memset-avx2-unaligned-erms.S    |   6 -
 .../multiarch/memset-avx512-unaligned-erms.S  |   3 -
 .../multiarch/memset-evex-unaligned-erms.S    |   3 -
 .../multiarch/memset-sse2-unaligned-erms.S    |   5 -
 .../multiarch/memset-vec-unaligned-erms.S     |  56 +--------
 11 files changed, 2 insertions(+), 232 deletions(-)
 delete mode 100644 sysdeps/x86_64/bzero.S
 delete mode 100644 sysdeps/x86_64/multiarch/bzero.c

diff --git a/sysdeps/x86_64/bzero.S b/sysdeps/x86_64/bzero.S
deleted file mode 100644
index f96d567fd8..0000000000
--- a/sysdeps/x86_64/bzero.S
+++ /dev/null
@@ -1 +0,0 @@
-/* Implemented in memset.S.  */
diff --git a/sysdeps/x86_64/memset.S b/sysdeps/x86_64/memset.S
index af26e9cedc..a6eea61a4d 100644
--- a/sysdeps/x86_64/memset.S
+++ b/sysdeps/x86_64/memset.S
@@ -1,4 +1,4 @@
-/* memset/bzero -- set memory area to CH/0
+/* memset -- set memory area to CH/0
    Optimized version for x86-64.
    Copyright (C) 2002-2022 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
@@ -35,9 +35,6 @@
   punpcklwd %xmm0, %xmm0; \
   pshufd $0, %xmm0, %xmm0
 
-# define BZERO_ZERO_VEC0() \
-  pxor %xmm0, %xmm0
-
 # define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
   movd d, %xmm0; \
   pshufd $0, %xmm0, %xmm0; \
@@ -56,10 +53,6 @@
 # define MEMSET_SYMBOL(p,s)	memset
 #endif
 
-#ifndef BZERO_SYMBOL
-# define BZERO_SYMBOL(p,s)	__bzero
-#endif
-
 #ifndef WMEMSET_SYMBOL
 # define WMEMSET_CHK_SYMBOL(p,s) p
 # define WMEMSET_SYMBOL(p,s)	__wmemset
@@ -70,7 +63,6 @@
 libc_hidden_builtin_def (memset)
 
 #if IS_IN (libc)
-weak_alias (__bzero, bzero)
 libc_hidden_def (__wmemset)
 weak_alias (__wmemset, wmemset)
 libc_hidden_weak (wmemset)
diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
index e7b413edad..4274bfdd0d 100644
--- a/sysdeps/x86_64/multiarch/Makefile
+++ b/sysdeps/x86_64/multiarch/Makefile
@@ -1,7 +1,6 @@
 ifeq ($(subdir),string)
 
 sysdep_routines += \
-  bzero \
   memchr-avx2 \
   memchr-avx2-rtm \
   memchr-evex \
diff --git a/sysdeps/x86_64/multiarch/bzero.c b/sysdeps/x86_64/multiarch/bzero.c
deleted file mode 100644
index 58a14b2c33..0000000000
--- a/sysdeps/x86_64/multiarch/bzero.c
+++ /dev/null
@@ -1,106 +0,0 @@
-/* Multiple versions of bzero.
-   All versions must be listed in ifunc-impl-list.c.
-   Copyright (C) 2022 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-/* Define multiple versions only for the definition in libc.  */
-#if IS_IN (libc)
-# define __bzero __redirect___bzero
-# include <string.h>
-# undef __bzero
-
-# define SYMBOL_NAME __bzero
-# include <init-arch.h>
-
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (sse2_unaligned)
-  attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (sse2_unaligned_erms)
-  attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx2_unaligned) attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx2_unaligned_erms)
-  attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx2_unaligned_rtm)
-  attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx2_unaligned_erms_rtm)
-  attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (evex_unaligned)
-  attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (evex_unaligned_erms)
-  attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx512_unaligned)
-  attribute_hidden;
-extern __typeof (REDIRECT_NAME) OPTIMIZE1 (avx512_unaligned_erms)
-  attribute_hidden;
-
-static inline void *
-IFUNC_SELECTOR (void)
-{
-  const struct cpu_features* cpu_features = __get_cpu_features ();
-
-  if (CPU_FEATURE_USABLE_P (cpu_features, AVX512F)
-      && !CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_AVX512))
-    {
-      if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
-          && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
-          && CPU_FEATURE_USABLE_P (cpu_features, BMI2))
-	{
-	  if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
-	    return OPTIMIZE1 (avx512_unaligned_erms);
-
-	  return OPTIMIZE1 (avx512_unaligned);
-	}
-    }
-
-  if (CPU_FEATURE_USABLE_P (cpu_features, AVX2))
-    {
-      if (CPU_FEATURE_USABLE_P (cpu_features, AVX512VL)
-          && CPU_FEATURE_USABLE_P (cpu_features, AVX512BW)
-          && CPU_FEATURE_USABLE_P (cpu_features, BMI2))
-	{
-	  if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
-	    return OPTIMIZE1 (evex_unaligned_erms);
-
-	  return OPTIMIZE1 (evex_unaligned);
-	}
-
-      if (CPU_FEATURE_USABLE_P (cpu_features, RTM))
-	{
-	  if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
-	    return OPTIMIZE1 (avx2_unaligned_erms_rtm);
-
-	  return OPTIMIZE1 (avx2_unaligned_rtm);
-	}
-
-      if (!CPU_FEATURES_ARCH_P (cpu_features, Prefer_No_VZEROUPPER))
-	{
-	  if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
-	    return OPTIMIZE1 (avx2_unaligned_erms);
-
-	  return OPTIMIZE1 (avx2_unaligned);
-	}
-    }
-
-  if (CPU_FEATURE_USABLE_P (cpu_features, ERMS))
-    return OPTIMIZE1 (sse2_unaligned_erms);
-
-  return OPTIMIZE1 (sse2_unaligned);
-}
-
-libc_ifunc_redirected (__redirect___bzero, __bzero, IFUNC_SELECTOR ());
-
-weak_alias (__bzero, bzero)
-#endif
diff --git a/sysdeps/x86_64/multiarch/ifunc-impl-list.c b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
index a594f4176e..68a56797d4 100644
--- a/sysdeps/x86_64/multiarch/ifunc-impl-list.c
+++ b/sysdeps/x86_64/multiarch/ifunc-impl-list.c
@@ -300,48 +300,6 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
 			      __memset_avx512_no_vzeroupper)
 	     )
 
-  /* Support sysdeps/x86_64/multiarch/bzero.c.  */
-  IFUNC_IMPL (i, name, bzero,
-	      IFUNC_IMPL_ADD (array, i, bzero, 1,
-			      __bzero_sse2_unaligned)
-	      IFUNC_IMPL_ADD (array, i, bzero, 1,
-			      __bzero_sse2_unaligned_erms)
-	      IFUNC_IMPL_ADD (array, i, bzero,
-			      CPU_FEATURE_USABLE (AVX2),
-			      __bzero_avx2_unaligned)
-	      IFUNC_IMPL_ADD (array, i, bzero,
-			      CPU_FEATURE_USABLE (AVX2),
-			      __bzero_avx2_unaligned_erms)
-	      IFUNC_IMPL_ADD (array, i, bzero,
-			      (CPU_FEATURE_USABLE (AVX2)
-			       && CPU_FEATURE_USABLE (RTM)),
-			      __bzero_avx2_unaligned_rtm)
-	      IFUNC_IMPL_ADD (array, i, bzero,
-			      (CPU_FEATURE_USABLE (AVX2)
-			       && CPU_FEATURE_USABLE (RTM)),
-			      __bzero_avx2_unaligned_erms_rtm)
-	      IFUNC_IMPL_ADD (array, i, bzero,
-			      (CPU_FEATURE_USABLE (AVX512VL)
-			       && CPU_FEATURE_USABLE (AVX512BW)
-			       && CPU_FEATURE_USABLE (BMI2)),
-			      __bzero_evex_unaligned)
-	      IFUNC_IMPL_ADD (array, i, bzero,
-			      (CPU_FEATURE_USABLE (AVX512VL)
-			       && CPU_FEATURE_USABLE (AVX512BW)
-			       && CPU_FEATURE_USABLE (BMI2)),
-			      __bzero_evex_unaligned_erms)
-	      IFUNC_IMPL_ADD (array, i, bzero,
-			      (CPU_FEATURE_USABLE (AVX512VL)
-			       && CPU_FEATURE_USABLE (AVX512BW)
-			       && CPU_FEATURE_USABLE (BMI2)),
-			      __bzero_avx512_unaligned_erms)
-	      IFUNC_IMPL_ADD (array, i, bzero,
-			      (CPU_FEATURE_USABLE (AVX512VL)
-			       && CPU_FEATURE_USABLE (AVX512BW)
-			       && CPU_FEATURE_USABLE (BMI2)),
-			      __bzero_avx512_unaligned)
-	     )
-
   /* Support sysdeps/x86_64/multiarch/rawmemchr.c.  */
   IFUNC_IMPL (i, name, rawmemchr,
 	      IFUNC_IMPL_ADD (array, i, rawmemchr,
diff --git a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms-rtm.S b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms-rtm.S
index 5a5ee6f672..8ac3e479bb 100644
--- a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms-rtm.S
+++ b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms-rtm.S
@@ -5,7 +5,6 @@
 
 #define SECTION(p) p##.avx.rtm
 #define MEMSET_SYMBOL(p,s)	p##_avx2_##s##_rtm
-#define BZERO_SYMBOL(p,s)	p##_avx2_##s##_rtm
 #define WMEMSET_SYMBOL(p,s)	p##_avx2_##s##_rtm
 
 #include "memset-avx2-unaligned-erms.S"
diff --git a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
index a093a2831f..c0bf2875d0 100644
--- a/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
@@ -14,9 +14,6 @@
   vmovd d, %xmm0; \
   movq r, %rax;
 
-# define BZERO_ZERO_VEC0() \
-  vpxor %xmm0, %xmm0, %xmm0
-
 # define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
   MEMSET_SET_VEC0_AND_SET_RETURN(d, r)
 
@@ -32,9 +29,6 @@
 # ifndef MEMSET_SYMBOL
 #  define MEMSET_SYMBOL(p,s)	p##_avx2_##s
 # endif
-# ifndef BZERO_SYMBOL
-#  define BZERO_SYMBOL(p,s)	p##_avx2_##s
-# endif
 # ifndef WMEMSET_SYMBOL
 #  define WMEMSET_SYMBOL(p,s)	p##_avx2_##s
 # endif
diff --git a/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
index 727c92133a..5241216a77 100644
--- a/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
@@ -19,9 +19,6 @@
   vpbroadcastb d, %VEC0; \
   movq r, %rax
 
-# define BZERO_ZERO_VEC0() \
-  vpxorq %XMM0, %XMM0, %XMM0
-
 # define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
   vpbroadcastd d, %VEC0; \
   movq r, %rax
diff --git a/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
index 5d8fa78f05..6370021506 100644
--- a/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-evex-unaligned-erms.S
@@ -19,9 +19,6 @@
   vpbroadcastb d, %VEC0; \
   movq r, %rax
 
-# define BZERO_ZERO_VEC0() \
-  vpxorq %XMM0, %XMM0, %XMM0
-
 # define WMEMSET_SET_VEC0_AND_SET_RETURN(d, r) \
   vpbroadcastd d, %VEC0; \
   movq r, %rax
diff --git a/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
index 329c58ee46..684cc248d7 100644
--- a/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-sse2-unaligned-erms.S
@@ -22,7 +22,6 @@
 
 #if IS_IN (libc)
 # define MEMSET_SYMBOL(p,s)	p##_sse2_##s
-# define BZERO_SYMBOL(p,s)	MEMSET_SYMBOL (p, s)
 # define WMEMSET_SYMBOL(p,s)	p##_sse2_##s
 
 # ifdef SHARED
@@ -30,10 +29,6 @@
 #  define libc_hidden_builtin_def(name)
 # endif
 
-# undef weak_alias
-# define weak_alias(original, alias) \
-	.weak bzero; bzero = __bzero
-
 # undef strong_alias
 # define strong_alias(ignored1, ignored2)
 #endif
diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
index 7c94fcdae1..a018077df0 100644
--- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
@@ -1,4 +1,4 @@
-/* memset/bzero with unaligned store and rep stosb
+/* memset with unaligned store and rep stosb
    Copyright (C) 2016-2022 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -26,10 +26,6 @@
 
 #include <sysdep.h>
 
-#ifndef BZERO_SYMBOL
-# define BZERO_SYMBOL(p,s)		MEMSET_SYMBOL (p, s)
-#endif
-
 #ifndef MEMSET_CHK_SYMBOL
 # define MEMSET_CHK_SYMBOL(p,s)		MEMSET_SYMBOL(p, s)
 #endif
@@ -133,31 +129,6 @@ ENTRY (WMEMSET_SYMBOL (__wmemset, unaligned))
 END (WMEMSET_SYMBOL (__wmemset, unaligned))
 #endif
 
-ENTRY (BZERO_SYMBOL(__bzero, unaligned))
-#if VEC_SIZE > 16
-	BZERO_ZERO_VEC0 ()
-#endif
-	mov	%RDI_LP, %RAX_LP
-	mov	%RSI_LP, %RDX_LP
-#ifndef USE_LESS_VEC_MASK_STORE
-	xorl	%esi, %esi
-#endif
-	cmp	$VEC_SIZE, %RDX_LP
-	jb	L(less_vec_no_vdup)
-#ifdef USE_LESS_VEC_MASK_STORE
-	xorl	%esi, %esi
-#endif
-#if VEC_SIZE <= 16
-	BZERO_ZERO_VEC0 ()
-#endif
-	cmp	$(VEC_SIZE * 2), %RDX_LP
-	ja	L(more_2x_vec)
-	/* From VEC and to 2 * VEC.  No branch when size == VEC_SIZE.  */
-	VMOVU	%VEC(0), (%rdi)
-	VMOVU	%VEC(0), (VEC_SIZE * -1)(%rdi, %rdx)
-	VZEROUPPER_RETURN
-END (BZERO_SYMBOL(__bzero, unaligned))
-
 #if defined SHARED && IS_IN (libc)
 ENTRY_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned))
 	cmp	%RDX_LP, %RCX_LP
@@ -215,31 +186,6 @@ END (__memset_erms)
 END (MEMSET_SYMBOL (__memset, erms))
 # endif
 
-ENTRY_P2ALIGN (BZERO_SYMBOL(__bzero, unaligned_erms), 6)
-# if VEC_SIZE > 16
-	BZERO_ZERO_VEC0 ()
-# endif
-	mov	%RDI_LP, %RAX_LP
-	mov	%RSI_LP, %RDX_LP
-# ifndef USE_LESS_VEC_MASK_STORE
-	xorl	%esi, %esi
-# endif
-	cmp	$VEC_SIZE, %RDX_LP
-	jb	L(less_vec_no_vdup)
-# ifdef USE_LESS_VEC_MASK_STORE
-	xorl	%esi, %esi
-# endif
-# if VEC_SIZE <= 16
-	BZERO_ZERO_VEC0 ()
-# endif
-	cmp	$(VEC_SIZE * 2), %RDX_LP
-	ja	L(stosb_more_2x_vec)
-	/* From VEC and to 2 * VEC.  No branch when size == VEC_SIZE.  */
-	VMOVU	%VEC(0), (%rdi)
-	VMOVU	%VEC(0), (VEC_SIZE * -1)(%rdi, %rdx)
-	VZEROUPPER_RETURN
-END (BZERO_SYMBOL(__bzero, unaligned_erms))
-
 # if defined SHARED && IS_IN (libc)
 ENTRY_CHK (MEMSET_CHK_SYMBOL (__memset_chk, unaligned_erms))
 	cmp	%RDX_LP, %RCX_LP
-- 
2.32.0



* Re: [PATCH 00/12] Remove bcopy and bzero optimizations
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
                   ` (11 preceding siblings ...)
  2022-02-10 19:58 ` [PATCH 12/12] x86_64: " Adhemerval Zanella
@ 2022-02-14 14:29 ` Florian Weimer
  2022-02-14 14:41   ` Adhemerval Zanella
  2022-02-21 16:39 ` Szabolcs Nagy
  13 siblings, 1 reply; 17+ messages in thread
From: Florian Weimer @ 2022-02-14 14:29 UTC (permalink / raw)
  To: Adhemerval Zanella via Libc-alpha
  Cc: Wilco Dijkstra, H . J . Lu, Noah Goldstein, Adhemerval Zanella

* Adhemerval Zanella via Libc-alpha:

> On a recent Linux distro (Ubuntu 21.04), I see only 1 'bcmp' call
> (which is already aliased to memcmp):

Clang turns memcmp for equality (so basically __memcmpeq) into bcmp.
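
A minimal sketch of the pattern in question, assuming a made-up helper
name (buffers_equal): only the "is it zero" result of memcmp is used, so
Clang may emit a call to bcmp here instead of memcmp, depending on the
target and optimization level.

```c
#include <assert.h>		/* Used by the usage example below.  */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the ordering information returned by memcmp is
   discarded, only equality matters.  */
bool
buffers_equal (const void *a, const void *b, size_t n)
{
  return memcmp (a, b, n) == 0;
}
```

For example, assert (buffers_equal ("abc", "abc", 3)) holds, while
buffers_equal ("abc", "abd", 3) is false.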

Thanks,
Florian



* Re: [PATCH 00/12] Remove bcopy and bzero optimizations
  2022-02-14 14:29 ` [PATCH 00/12] Remove bcopy and " Florian Weimer
@ 2022-02-14 14:41   ` Adhemerval Zanella
  0 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-14 14:41 UTC (permalink / raw)
  To: Florian Weimer, Adhemerval Zanella via Libc-alpha
  Cc: Wilco Dijkstra, H . J . Lu, Noah Goldstein



On 14/02/2022 11:29, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
> 
>> On a recent Linux distro (Ubuntu 21.04), I see only 1 'bcmp' call
>> (which is already aliased to memcmp):
> 
> Clang turns memcmp for equality (so basically __memcmpeq) into bcmp.

bcmp has the advantage that it can be aliased to memcmp, so there is no
need to actually provide all the ifunc machinery.
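
As a rough sketch of what aliasing means here (the sketch_ names are
hypothetical stand-ins, and glibc itself uses its weak_alias macro
rather than spelling the attribute out): a second symbol is bound to the
same code, which works because bcmp only promises zero versus non-zero
and any memcmp result satisfies that.  This assumes GCC or Clang on an
ELF target, where the alias attribute is available.

```c
#include <assert.h>		/* Used by the usage example below.  */
#include <stddef.h>

/* Stand-in for a memcmp implementation: returns <0, 0 or >0.  */
int
sketch_memcmp (const void *s1, const void *s2, size_t n)
{
  const unsigned char *p1 = s1;
  const unsigned char *p2 = s2;

  for (size_t i = 0; i < n; i++)
    if (p1[i] != p2[i])
      return p1[i] < p2[i] ? -1 : 1;
  return 0;
}

/* Second symbol for the same code.  No separate implementation and no
   ifunc selector is needed for it.  */
int sketch_bcmp (const void *s1, const void *s2, size_t n)
  __attribute__ ((weak, alias ("sketch_memcmp")));
```

With this, assert (sketch_bcmp ("abc", "abc", 3) == 0) holds and
sketch_bcmp ("abc", "abd", 3) returns a non-zero value.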


* Re: [PATCH 00/12] Remove bcopy and bzero optimizations
  2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
                   ` (12 preceding siblings ...)
  2022-02-14 14:29 ` [PATCH 00/12] Remove bcopy and " Florian Weimer
@ 2022-02-21 16:39 ` Szabolcs Nagy
  2022-02-22 15:40   ` Adhemerval Zanella
  13 siblings, 1 reply; 17+ messages in thread
From: Szabolcs Nagy @ 2022-02-21 16:39 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein

The 02/10/2022 16:58, Adhemerval Zanella via Libc-alpha wrote:
> Both symbols are marked as legacy in POSIX.1-2001 and removed on
> POSIX.1-2008, although the prototypes are defined for _GNU_SOURCE
> or _DEFAULT_SOURCE.
> 
> Most architectures just route bcopy/bzero to internal memmove/memset
> implementation, however some do implement iFUNC variants when memset
> or memmove are also provided through iFUNC.
> 
> However, gcc already replaces bcopy with a memmove and bzero with memset
> on default configuration (to actually get a bstring libc call the code
> requires to omit string.h inclusion and built with --fno-builtin), so
> it is highly unlikely programs are actually calling libc bcopy or
> bzero symbols.
...
> So there is no point in keeping such optimization.
> 
> Adhemerval Zanella (12):
>   ia64: Remove bcopy
>   powerpc: Remove bcopy optimizations
>   i386: Remove bcopy optimizations
>   x86_64: Remove bcopy optimizations
>   alpha: Remove bzero optimization
>   ia64: Remove bzero optimization
>   Remove bzero optimization
>   powerpc: Remove powerpc32 bzero optimizations
>   powerpc: Remove powerpc64 bzero optimizations
>   s390: Remove bzero optimizations
>   i686: Remove bzero optimizations
>   x86_64: Remove bzero optimizations

i see this does not affect aarch64, but i agree with the principle.

(there was a comment about the x86 bzero code that if __memsetzero
is accepted then it's easier to rename the bzero optimization instead
of removing and readding.)


* Re: [PATCH 00/12] Remove bcopy and bzero optimizations
  2022-02-21 16:39 ` Szabolcs Nagy
@ 2022-02-22 15:40   ` Adhemerval Zanella
  0 siblings, 0 replies; 17+ messages in thread
From: Adhemerval Zanella @ 2022-02-22 15:40 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: libc-alpha, Wilco Dijkstra, H . J . Lu, Noah Goldstein



On 21/02/2022 13:39, Szabolcs Nagy wrote:
> The 02/10/2022 16:58, Adhemerval Zanella via Libc-alpha wrote:
>> Both symbols are marked as legacy in POSIX.1-2001 and removed on
>> POSIX.1-2008, although the prototypes are defined for _GNU_SOURCE
>> or _DEFAULT_SOURCE.
>>
>> Most architectures just route bcopy/bzero to internal memmove/memset
>> implementation, however some do implement iFUNC variants when memset
>> or memmove are also provided through iFUNC.
>>
>> However, gcc already replaces bcopy with a memmove and bzero with memset
>> on default configuration (to actually get a bstring libc call the code
>> requires to omit string.h inclusion and built with --fno-builtin), so
>> it is highly unlikely programs are actually calling libc bcopy or
>> bzero symbols.
> ...
>> So there is no point in keeping such optimization.
>>
>> Adhemerval Zanella (12):
>>   ia64: Remove bcopy
>>   powerpc: Remove bcopy optimizations
>>   i386: Remove bcopy optimizations
>>   x86_64: Remove bcopy optimizations
>>   alpha: Remove bzero optimization
>>   ia64: Remove bzero optimization
>>   Remove bzero optimization
>>   powerpc: Remove powerpc32 bzero optimizations
>>   powerpc: Remove powerpc64 bzero optimizations
>>   s390: Remove bzero optimizations
>>   i686: Remove bzero optimizations
>>   x86_64: Remove bzero optimizations
> 
> i see this does not affect aarch64, but i agree with the principle.
> 
> (there was a comment about the x86 bzero code that if __memsetzero
> is accepted then it's easier to rename the bzero optimization instead
> of removing and readding.)

I will exclude the last patch, which touches x86_64, and commit the rest.
Judging from the latest discussions on both the mailing list and the
weekly call, we still need to reach consensus on the __memsetzero addition.


Thread overview: 17+ messages
2022-02-10 19:58 [PATCH 00/12] Remove bcopy and bzero optimizations Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 01/12] ia64: Remove bcopy Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 02/12] powerpc: Remove bcopy optimizations Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 03/12] i386: " Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 04/12] x86_64: " Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 05/12] alpha: Remove bzero optimization Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 06/12] ia64: " Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 07/12] " Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 08/12] powerpc: Remove powerpc32 bzero optimizations Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 09/12] powerpc: Remove powerpc64 " Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 10/12] s390: Remove " Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 11/12] i686: " Adhemerval Zanella
2022-02-10 19:58 ` [PATCH 12/12] x86_64: " Adhemerval Zanella
2022-02-14 14:29 ` [PATCH 00/12] Remove bcopy and " Florian Weimer
2022-02-14 14:41   ` Adhemerval Zanella
2022-02-21 16:39 ` Szabolcs Nagy
2022-02-22 15:40   ` Adhemerval Zanella
