public inbox for libc-ports@sourceware.org
* [PATCH] powerpc: 405/440/464/476 support and optimizations
@ 2010-09-02 17:34 Luis Machado
  2010-09-03 14:45 ` Ryan Arnold
  0 siblings, 1 reply; 19+ messages in thread
From: Luis Machado @ 2010-09-02 17:34 UTC
  To: libc-ports; +Cc: rsa, Todd Iglehart, Josh Boyer

Hi,

This patch adds the PowerPC 405/440/464/476 platforms to ports, along with
three optimized memory functions (memcpy, memcmp, memset) and four optimized
string functions (strcmp, strncmp, strcpy, strlen). The optimizations were
provided by Todd (copied on this mail) and are placed under 405 so that all
of these platforms can use them.
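
For readers skimming the assembly, the strategy that memcmp.S in this patch
follows (32-byte blocks, then 8-byte pairs, then single bytes) can be
modelled in portable C roughly as below. This is only a sketch under stated
assumptions: memcmp_sketch and cmp_bytes are made-up names, and on a word
mismatch the sketch falls back to a byte comparison so the result does not
depend on host endianness, whereas the assembly compares the big-endian
words directly with cmplw.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: compare n bytes and return -1, 0 or 1.  */
static int
cmp_bytes (const unsigned char *a, const unsigned char *b, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    if (a[i] != b[i])
      return a[i] > b[i] ? 1 : -1;
  return 0;
}

/* Hypothetical helper: compare `words' 32-bit words starting at a and b.
   memcpy is used for the loads so unaligned pointers are safe in C.  */
static int
cmp_words (const unsigned char *a, const unsigned char *b, size_t words)
{
  for (size_t w = 0; w < words; ++w)
    {
      uint32_t x, y;
      memcpy (&x, a + 4 * w, 4);
      memcpy (&y, b + 4 * w, 4);
      if (x != y)
        return cmp_bytes (a + 4 * w, b + 4 * w, 4);
    }
  return 0;
}

/* Model of the three phases in memcmp.S; the name is an assumption.  */
int
memcmp_sketch (const void *s1, const void *s2, size_t n)
{
  const unsigned char *a = s1, *b = s2;
  int r;

  /* Phase 1: 32-byte blocks, as in L(word8_compare_loop).  */
  for (size_t blocks = n >> 5; blocks != 0; --blocks, a += 32, b += 32)
    if ((r = cmp_words (a, b, 8)) != 0)
      return r;
  n &= 31;

  /* Phase 2: 8-byte pairs, as in L(word2_count_loop).  */
  for (size_t pairs = n >> 3; pairs != 0; --pairs, a += 8, b += 8)
    if ((r = cmp_words (a, b, 2)) != 0)
      return r;
  n &= 7;

  /* Phase 3: remaining 0-7 bytes, as in L(byte_count_loop).  */
  return cmp_bytes (a, b, n);
}
```

Like the assembly, the sketch returns only -1, 0 or 1 rather than a byte
difference.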

The patch also adds the required Makefile, sysdeps structure and Implies
files.
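
To illustrate what the Implies files under sysdeps/powerpc/powerpc32 set up,
here is a simplified C model of how the chain 476 -> 464 -> 440 -> 405
expands. It is a sketch only: the table is copied from the Implies files in
this patch, but expand() and struct implies are hypothetical names, and the
real configure logic does more than this depth-first walk.

```c
#include <stddef.h>
#include <string.h>

/* One Implies file: a sysdeps directory and the two entries it lists.  */
struct implies
{
  const char *dir;
  const char *implied[2];
};

/* Contents of the powerpc32 Implies files added by this patch.  */
static const struct implies table[] = {
  { "powerpc/powerpc32/476",
    { "powerpc/powerpc32/464/fpu", "powerpc/powerpc32/464" } },
  { "powerpc/powerpc32/464",
    { "powerpc/powerpc32/440/fpu", "powerpc/powerpc32/440" } },
  { "powerpc/powerpc32/440",
    { "powerpc/powerpc32/405/fpu", "powerpc/powerpc32/405" } },
};

/* Append dir, then everything it implies, to out[] in depth-first order.
   Returns the new element count; stops adding once cap is reached.  */
static size_t
expand (const char *dir, const char **out, size_t n, size_t cap)
{
  if (n >= cap)
    return n;
  out[n++] = dir;
  for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i)
    if (strcmp (table[i].dir, dir) == 0)
      for (size_t j = 0; j < 2; ++j)
        n = expand (table[i].implied[j], out, n, cap);
  return n;
}
```

Under this model, expanding powerpc/powerpc32/476 ends at
powerpc/powerpc32/405, which is why routines placed under 405 are visible to
all four platforms.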

Is this OK?

Regards,
Luis

2010-09-02  Todd Iglehart  <iglehart@us.ibm.com>
	    Ryan Arnold  <rsa@us.ibm.com>
	    Luis Machado  <luisgpm@br.ibm.com>

	* sysdeps/powerpc/dl-procinfo.c: New file.
	* sysdeps/powerpc/dl-procinfo.h: New file.
	* sysdeps/powerpc/powerpc32/405/memcmp.S: New file.
	* sysdeps/powerpc/powerpc32/405/memcpy.S: New file.
	* sysdeps/powerpc/powerpc32/405/memset.S: New file.
	* sysdeps/powerpc/powerpc32/405/strcmp.S: New file.
	* sysdeps/powerpc/powerpc32/405/strcpy.S: New file.
	* sysdeps/powerpc/powerpc32/405/strlen.S: New file.
	* sysdeps/powerpc/powerpc32/405/strncmp.S: New file.
	* sysdeps/powerpc/powerpc32/440/Implies: New file.
	* sysdeps/powerpc/powerpc32/464/Implies: New file.
	* sysdeps/powerpc/powerpc32/476/Implies: New file.
	* sysdeps/powerpc/powerpc32/Makefile: New file.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies: New file.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies: New file.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies: New file.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies: New file.

diff --git a/sysdeps/powerpc/dl-procinfo.c b/sysdeps/powerpc/dl-procinfo.c
new file mode 100644
index 0000000..60fb465
--- /dev/null
+++ b/sysdeps/powerpc/dl-procinfo.c
@@ -0,0 +1,96 @@
+/* Data for processor capability information.  PowerPC version.
+   Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+/* This information must be kept in sync with the _DL_HWCAP_COUNT and
+   _DL_PLATFORM_COUNT definitions in procinfo.h.
+
+   If anything should be added here check whether the size of each string
+   is still ok with the given array size.
+
+   All the #ifdefs in the definitions are quite irritating but
+   necessary if we want to avoid duplicating the information.  There
+   are three different modes:
+
+   - PROCINFO_DECL is defined.  This means we are only interested in
+     declarations.
+
+   - PROCINFO_DECL is not defined:
+
+     + if SHARED is defined the file is included in an array
+       initializer.  The .element = { ... } syntax is needed.
+
+     + if SHARED is not defined a normal array initialization is
+       needed.
+  */
+
+#ifndef PROCINFO_CLASS
+# define PROCINFO_CLASS
+#endif
+
+#if !defined PROCINFO_DECL && defined SHARED
+  ._dl_powerpc_cap_flags
+#else
+PROCINFO_CLASS const char _dl_powerpc_cap_flags[25][10]
+#endif
+#ifndef PROCINFO_DECL
+= {
+    "vsx",
+    "arch_2_06", "power6x", "dfp", "pa6t",
+    "arch_2_05", "ic_snoop", "smt", "booke",
+    "cellbe", "power5+", "power5", "power4",
+    "notb", "efpdouble", "efpsingle", "spe",
+    "ucache", "4xxmac", "mmu", "fpu",
+    "altivec", "ppc601", "ppc64", "ppc32",
+  }
+#endif
+#if !defined SHARED || defined PROCINFO_DECL
+;
+#else
+,
+#endif
+
+#if !defined PROCINFO_DECL && defined SHARED
+  ._dl_powerpc_platforms
+#else
+PROCINFO_CLASS const char _dl_powerpc_platforms[12][12]
+#endif
+#ifndef PROCINFO_DECL
+= {
+    [PPC_PLATFORM_POWER4] = "power4",
+    [PPC_PLATFORM_PPC970] = "ppc970",
+    [PPC_PLATFORM_POWER5] = "power5",
+    [PPC_PLATFORM_POWER5_PLUS] = "power5+",
+    [PPC_PLATFORM_POWER6] = "power6",
+    [PPC_PLATFORM_CELL_BE] = "ppc-cell-be",
+    [PPC_PLATFORM_POWER6X] = "power6x",
+    [PPC_PLATFORM_POWER7] = "power7",
+    [PPC_PLATFORM_PPC405] = "ppc405",
+    [PPC_PLATFORM_PPC440] = "ppc440",
+    [PPC_PLATFORM_PPC464] = "ppc464",
+    [PPC_PLATFORM_PPC476] = "ppc476"
+  }
+#endif
+#if !defined SHARED || defined PROCINFO_DECL
+;
+#else
+,
+#endif
+
+#undef PROCINFO_DECL
+#undef PROCINFO_CLASS
diff --git a/sysdeps/powerpc/dl-procinfo.h b/sysdeps/powerpc/dl-procinfo.h
new file mode 100644
index 0000000..87279de
--- /dev/null
+++ b/sysdeps/powerpc/dl-procinfo.h
@@ -0,0 +1,168 @@
+/* Processor capability information handling macros.  PowerPC version.
+   Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _DL_PROCINFO_H
+#define _DL_PROCINFO_H	1
+
+#include <ldsodefs.h>
+#include <sysdep.h>		/* This defines the PPC_FEATURE_* macros.  */
+
+/* There are 25 bits used, but they are bits 7..31.  */
+#define _DL_HWCAP_FIRST		7
+#define _DL_HWCAP_COUNT		32
+
+/* These bits influence library search.  */
+#define HWCAP_IMPORTANT		(PPC_FEATURE_HAS_ALTIVEC \
+				+ PPC_FEATURE_HAS_DFP)
+
+#define _DL_PLATFORMS_COUNT	12
+
+#define _DL_FIRST_PLATFORM	32
+/* Mask to filter out platforms.  */
+#define _DL_HWCAP_PLATFORM      (((1ULL << _DL_PLATFORMS_COUNT) - 1) \
+				<< _DL_FIRST_PLATFORM)
+
+/* Platform bits (relative to _DL_FIRST_PLATFORM).  */
+#define PPC_PLATFORM_POWER4	      0
+#define PPC_PLATFORM_PPC970	      1
+#define PPC_PLATFORM_POWER5	      2
+#define PPC_PLATFORM_POWER5_PLUS      3
+#define PPC_PLATFORM_POWER6	      4
+#define PPC_PLATFORM_CELL_BE	      5
+#define PPC_PLATFORM_POWER6X	      6
+#define PPC_PLATFORM_POWER7	      7
+#define PPC_PLATFORM_PPC405	      8
+#define PPC_PLATFORM_PPC440	      9
+#define PPC_PLATFORM_PPC464	      10
+#define PPC_PLATFORM_PPC476	      11
+
+static inline const char *
+__attribute__ ((unused))
+_dl_hwcap_string (int idx)
+{
+  return GLRO(dl_powerpc_cap_flags)[idx - _DL_HWCAP_FIRST];
+}
+
+static inline const char *
+__attribute__ ((unused))
+_dl_platform_string (int idx)
+{
+  return GLRO(dl_powerpc_platforms)[idx - _DL_FIRST_PLATFORM];
+}
+
+static inline int
+__attribute__ ((unused))
+_dl_string_hwcap (const char *str)
+{
+  for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
+    if (strcmp (str, _dl_hwcap_string (i)) == 0)
+      return i;
+  return -1;
+}
+
+static inline int
+__attribute__ ((unused, always_inline))
+_dl_string_platform (const char *str)
+{
+  if (str == NULL)
+    return -1;
+
+  if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_POWER4], 5) == 0)
+    {
+      int ret;
+      str += 5;
+      switch (*str)
+	{
+	case '4':
+	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER4;
+	  break;
+	case '5':
+	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5;
+	  if (str[1] == '+')
+	    {
+	      ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5_PLUS;
+	      ++str;
+	    }
+	  break;
+	case '6':
+	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6;
+	  if (str[1] == 'x')
+	    {
+	      ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6X;
+	      ++str;
+	    }
+	  break;
+	case '7':
+	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER7;
+	  break;
+	default:
+	  return -1;
+	}
+      if (str[1] == '\0')
+	return ret;
+    }
+  else if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970],
+		    3) == 0)
+    {
+      if (strcmp (str + 3, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970]
+			   + 3) == 0)
+	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC970;
+      else if (strcmp (str + 3,
+		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_CELL_BE] + 3)
+	       == 0)
+	return _DL_FIRST_PLATFORM + PPC_PLATFORM_CELL_BE;
+      else if (strcmp (str + 3,
+		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC405] + 3)
+	       == 0)
+	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC405;
+      else if (strcmp (str + 3,
+		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC440] + 3)
+	       == 0)
+	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC440;
+      else if (strcmp (str + 3,
+		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC464] + 3)
+	       == 0)
+	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC464;
+      else if (strcmp (str + 3,
+		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC476] + 3)
+	       == 0)
+	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC476;
+    }
+
+  return -1;
+}
+
+#ifdef IS_IN_rtld
+static inline int
+__attribute__ ((unused))
+_dl_procinfo (int word)
+{
+  _dl_printf ("AT_HWCAP:       ");
+
+  for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
+    if (word & (1 << i))
+      _dl_printf (" %s", _dl_hwcap_string (i));
+
+  _dl_printf ("\n");
+
+  return 0;
+}
+#endif
+
+#endif /* dl-procinfo.h */
diff --git a/sysdeps/powerpc/powerpc32/405/memcmp.S b/sysdeps/powerpc/powerpc32/405/memcmp.S
new file mode 100644
index 0000000..c0314e6
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/memcmp.S
@@ -0,0 +1,132 @@
+/* Optimized memcmp implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* memcmp
+
+	r3:source1 address, return equality
+	r4:source2 address
+	r5:byte count
+
+	Check 2 words from src1 and src2. If unequal jump to end and
+	return src1 > src2 or src1 < src2.
+	If count = zero check bytes before zero counter and then jump to end and
+	return src1 > src2, src1 < src2 or src1 = src2.
+	If src1 = src2 and no null, repeat. */
+
+EALIGN (BP_SYM (memcmp), 5, 0)
+	srwi.	r6,r5,5
+	beq	L(preword2_count_loop)
+	mtctr	r6
+	clrlwi	r5,r5,27
+
+L(word8_compare_loop):
+	lwz	r10,0(r3)
+	lwz	r6,4(r3)
+	lwz	r8,0(r4)
+	lwz	r9,4(r4)
+	cmplw	cr5,r8,r10
+	cmplw	cr1,r9,r6
+	bne	cr5,L(st2)
+	bne	cr1,L(st1)
+	lwz	r10,8(r3)
+	lwz	r6,12(r3)
+	lwz	r8,8(r4)
+	lwz	r9,12(r4)
+	cmplw	cr5,r8,r10
+	cmplw	cr1,r9,r6
+	bne	cr5,L(st2)
+	bne	cr1,L(st1)
+	lwz	r10,16(r3)
+	lwz	r6,20(r3)
+	lwz	r8,16(r4)
+	lwz	r9,20(r4)
+	cmplw	cr5,r8,r10
+	cmplw	cr1,r9,r6
+	bne	cr5,L(st2)
+	bne	cr1,L(st1)
+	lwz	r10,24(r3)
+	lwz	r6,28(r3)
+	addi	r3,r3,0x20
+	lwz	r8,24(r4)
+	lwz	r9,28(r4)
+	addi	r4,r4,0x20
+	cmplw	cr5,r8,r10
+	cmplw	cr1,r9,r6
+	bne	cr5,L(st2)
+	bne	cr1,L(st1)
+	bdnz	L(word8_compare_loop)
+
+L(preword2_count_loop):
+	srwi.	r6,r5,3
+	beq	L(prebyte_count_loop)
+	mtctr	r6
+	clrlwi  r5,r5,29
+
+L(word2_count_loop):
+	lwz	r10,0(r3)
+	lwz	r6,4(r3)
+	addi	r3,r3,0x08
+	lwz	r8,0(r4)
+	lwz	r9,4(r4)
+	addi	r4,r4,0x08
+	cmplw	cr5,r8,r10
+	cmplw	cr1,r9,r6
+	bne	cr5,L(st2)
+	bne	cr1,L(st1)
+	bdnz	L(word2_count_loop)
+
+L(prebyte_count_loop):
+	addi	r5,r5,1
+	mtctr	r5
+	bdz	L(end_memcmp)
+
+L(byte_count_loop):
+	lbz	r6,0(r3)
+	addi	r3,r3,0x01
+	lbz	r8,0(r4)
+	addi	r4,r4,0x01
+	cmplw	cr5,r8,r6
+	bne	cr5,L(st2)
+	bdnz	L(byte_count_loop)
+
+L(end_memcmp):
+	addi	r3,r0,0
+	blr
+
+L(l_r):
+	addi	r3,r0,1
+	blr
+
+L(st1):
+	blt	cr1,L(l_r)
+	addi	r3,r0,-1
+	blr
+
+L(st2):
+	blt	cr5,L(l_r)
+	addi	r3,r0,-1
+	blr
+END (BP_SYM (memcmp))
+libc_hidden_builtin_def (memcmp)
+weak_alias (memcmp,bcmp)
diff --git a/sysdeps/powerpc/powerpc32/405/memcpy.S b/sysdeps/powerpc/powerpc32/405/memcpy.S
new file mode 100644
index 0000000..777d3db
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/memcpy.S
@@ -0,0 +1,134 @@
+/* Optimized memcpy implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* memcpy
+
+	r0:return address
+	r3:destination address
+	r4:source address
+	r5:byte count
+
+	Save return address in r0.
+	If destination and source are unaligned and copy count is greater than 256
+	then copy 0-3 bytes to make destination aligned.
+	If 32 or more bytes to copy we use 32 byte copy loop.
+	Finally we copy 0-31 extra bytes. */
+
+EALIGN (BP_SYM (memcpy), 5, 0)
+/* Check if bytes to copy are greater than 256 and if
+	source and destination are unaligned */
+	cmpwi	r5,0x0100
+	addi	r0,r3,0
+	ble	L(string_count_loop)
+	neg	r6,r3
+	clrlwi. r6,r6,30
+	beq	L(string_count_loop)
+	neg	r6,r4
+	clrlwi. r6,r6,30
+	beq	L(string_count_loop)
+	mtctr	r6
+	subf	r5,r6,r5
+
+L(unaligned_bytecopy_loop): /* Align destination by copying 0-3 bytes */
+	lbz	r8,0x0(r4)
+	addi	r4,r4,1
+	stb	r8,0x0(r3)
+	addi	r3,r3,1
+	bdnz	L(unaligned_bytecopy_loop)
+	srwi.	r7,r5,5
+	beq	L(preword2_count_loop)
+	mtctr	r7
+
+L(word8_count_loop_no_dcbt): /* Copy 32 bytes at a time */
+	lwz	r6,0(r4)
+	lwz	r7,4(r4)
+	lwz	r8,8(r4)
+	lwz	r9,12(r4)
+	subi	r5,r5,0x20
+	stw	r6,0(r3)
+	stw	r7,4(r3)
+	stw	r8,8(r3)
+	stw	r9,12(r3)
+	lwz	r6,16(r4)
+	lwz	r7,20(r4)
+	lwz	r8,24(r4)
+	lwz	r9,28(r4)
+	addi	r4,r4,0x20
+	stw	r6,16(r3)
+	stw	r7,20(r3)
+	stw	r8,24(r3)
+	stw	r9,28(r3)
+	addi	r3,r3,0x20
+	bdnz	L(word8_count_loop_no_dcbt)
+
+L(preword2_count_loop): /* Copy remaining 0-31 bytes */
+	clrlwi. r12,r5,27
+	beq	L(end_memcpy)
+	mtxer	r12
+	lswx	r5,0,r4
+	stswx	r5,0,r3
+	mr	 r3,r0
+	blr
+
+L(string_count_loop): /* Copy odd 0-15 bytes */
+	clrlwi. r12,r5,28
+	add	r3,r3,r5
+	add	r4,r4,r5
+	beq	L(pre_string_copy)
+	mtxer	r12
+	subf	r4,r12,r4
+	subf	r3,r12,r3
+	lswx	r6,0,r4
+	stswx	r6,0,r3
+
+L(pre_string_copy): /* Check how many 16 byte chunks to copy */
+	srwi.	r7,r5,4
+	beq	L(end_memcpy)
+	mtctr	r7
+
+L(word4_count_loop_no_dcbt): /* Copy 16 bytes per count, loop unrolled twice */
+	lwz	r6,-4(r4)
+	lwz	r7,-8(r4)
+	lwz	r8,-12(r4)
+	lwzu	r9,-16(r4)
+	stw	r6,-4(r3)
+	stw	r7,-8(r3)
+	stw	r8,-12(r3)
+	stwu	r9,-16(r3)
+	bdz	L(end_memcpy)
+	lwz	r6,-4(r4)
+	lwz	r7,-8(r4)
+	lwz	r8,-12(r4)
+	lwzu	r9,-16(r4)
+	stw	r6,-4(r3)
+	stw	r7,-8(r3)
+	stw	r8,-12(r3)
+	stwu	r9,-16(r3)
+	bdnz	L(word4_count_loop_no_dcbt)
+
+L(end_memcpy):
+	mr	 r3,r0
+	blr
+END (BP_SYM (memcpy))
+libc_hidden_builtin_def (memcpy)
diff --git a/sysdeps/powerpc/powerpc32/405/memset.S b/sysdeps/powerpc/powerpc32/405/memset.S
new file mode 100644
index 0000000..10b0f6e
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/memset.S
@@ -0,0 +1,156 @@
+/* Optimized memset implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* memset
+
+	r3:destination address and return address
+	r4:source integer to copy
+	r5:byte count
+	r11:source integer to copy in all 32 bits of reg
+	r12:temp return address
+
+	Save return address in r12
+	If destination is unaligned and count is greater than 255 bytes
+	set 0-3 bytes to make destination aligned
+	If count is greater than 255 bytes and setting zero to memory
+	use dcbz to set memory when we can
+	otherwise do the following
+	If 16 or more words to set we use 16 word copy loop.
+	Finally we set 0-15 extra bytes with string store. */
+
+EALIGN (BP_SYM (memset), 5, 0)
+	rlwinm	r11,r4,0,24,31
+	rlwimi	r11,r4,8,16,23
+	rlwimi	r11,r11,16,0,15
+	addi	r12,r3,0
+	cmpwi	r5,0x00FF
+	ble	L(preword8_count_loop)
+	cmpwi	r4,0x00
+	beq	L(use_dcbz)
+	neg	r6,r3
+	clrlwi.	r6,r6,30
+	beq	L(preword8_count_loop)
+	addi	r8,0,1
+	mtctr	r6
+	subi	r3,r3,1
+
+L(unaligned_bytecopy_loop):
+	stbu	r11,0x1(r3)
+	subf.	r5,r8,r5
+	beq	L(end_memset)
+	bdnz	L(unaligned_bytecopy_loop)
+	addi	r3,r3,1
+
+L(preword8_count_loop):
+	srwi.	r6,r5,4
+	beq	L(preword2_count_loop)
+	mtctr	r6
+	addi	r3,r3,-4
+	mr	r8,r11
+	mr	r9,r11
+	mr	r10,r11
+
+L(word8_count_loop_no_dcbt):
+	stwu	r8,4(r3)
+	stwu	r9,4(r3)
+	subi	r5,r5,0x10
+	stwu	r10,4(r3)
+	stwu	r11,4(r3)
+	bdnz	L(word8_count_loop_no_dcbt)
+	addi	r3,r3,4
+
+L(preword2_count_loop):
+	clrlwi.	r7,r5,28
+	beq	L(end_memset)
+	mr	r8,r11
+	mr	r9,r11
+	mr	r10,r11
+	mtxer	r7
+	stswx	r8,0,r3
+
+L(end_memset):
+	addi	r3,r12,0
+	blr
+
+L(use_dcbz):
+	neg	r6,r3
+	clrlwi.	r7,r6,28
+	beq	L(skip_string_loop)
+	mr	r8,r11
+	mr	r9,r11
+	mr	r10,r11
+	subf	r5,r7,r5
+	mtxer	r7
+	stswx	r8,0,r3
+	add	r3,r3,r7
+
+L(skip_string_loop):
+	clrlwi	r8,r6,25
+	srwi.	r8,r8,4
+	beq	L(dcbz_pre_loop)
+	mtctr	r8
+
+L(word_loop):
+	stw	r11,0(r3)
+	subi	r5,r5,0x10
+	stw	r11,4(r3)
+	stw	r11,8(r3)
+	stw	r11,12(r3)
+	addi	r3,r3,0x10
+	bdnz	L(word_loop)
+
+L(dcbz_pre_loop):
+	srwi	r6,r5,7
+	mtctr	r6
+	addi	r7,0,0
+
+L(dcbz_loop):
+	dcbz	r3,r7
+	addi	r3,r3,0x80
+	subi	r5,r5,0x80
+	bdnz	L(dcbz_loop)
+	srwi.	r6,r5,4
+	beq	L(postword2_count_loop)
+	mtctr	r6
+
+L(postword8_count_loop):
+	stw	r11,0(r3)
+	subi	r5,r5,0x10
+	stw	r11,4(r3)
+	stw	r11,8(r3)
+	stw	r11,12(r3)
+	addi	r3,r3,0x10
+	bdnz	L(postword8_count_loop)
+
+L(postword2_count_loop):
+	clrlwi.	r7,r5,28
+	beq	L(end_memset)
+	mr	r8,r11
+	mr	r9,r11
+	mr	r10,r11
+	mtxer	r7
+	stswx	r8,0,r3
+	b	L(end_memset)
+END (BP_SYM (memset))
+libc_hidden_builtin_def (memset)
diff --git a/sysdeps/powerpc/powerpc32/405/strcmp.S b/sysdeps/powerpc/powerpc32/405/strcmp.S
new file mode 100644
index 0000000..79a80f1
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strcmp.S
@@ -0,0 +1,138 @@
+/* Optimized strcmp implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strcmp
+
+	Register Use
+	r0:temp return equality
+	r3:source1 address, return equality
+	r4:source2 address
+
+	Implementation description
+	Check 2 words from src1 and src2. If unequal jump to end and
+	return src1 > src2 or src1 < src2.
+	If null check bytes before null and then jump to end and
+	return src1 > src2, src1 < src2 or src1 = src2.
+	If src1 = src2 and no null, repeat. */
+
+EALIGN (BP_SYM(strcmp),5,0)
+	neg	r7,r3
+	clrlwi	r7,r7,20
+	neg	r8,r4
+	clrlwi	r8,r8,20
+	srwi.	r7,r7,5
+	beq	L(byte_loop)
+	srwi.	r8,r8,5
+	beq	L(byte_loop)
+	cmplw	r7,r8
+	mtctr	r7
+	ble	L(big_loop)
+	mtctr	r8
+
+L(big_loop):
+	lwz	r5,0(r3)
+	lwz	r6,4(r3)
+	lwz	r8,0(r4)
+	lwz	r9,4(r4)
+	dlmzb.	r12,r5,r6
+	bne	L(end_check)
+	cmplw	r5,r8
+	bne	L(st1)
+	cmplw	r6,r9
+	bne	L(st1)
+	lwz	r5,8(r3)
+	lwz	r6,12(r3)
+	lwz	r8,8(r4)
+	lwz	r9,12(r4)
+	dlmzb.	r12,r5,r6
+	bne	L(end_check)
+	cmplw	r5,r8
+	bne	L(st1)
+	cmplw	r6,r9
+	bne	L(st1)
+	lwz	r5,16(r3)
+	lwz	r6,20(r3)
+	lwz	r8,16(r4)
+	lwz	r9,20(r4)
+	dlmzb.	r12,r5,r6
+	bne	L(end_check)
+	cmplw	r5,r8
+	bne	L(st1)
+	cmplw	r6,r9
+	bne	L(st1)
+	lwz	r5,24(r3)
+	lwz	r6,28(r3)
+	addi	r3,r3,0x20
+	lwz	r8,24(r4)
+	lwz	r9,28(r4)
+	addi	r4,r4,0x20
+	dlmzb.	r12,r5,r6
+	bne	L(end_check)
+	cmplw	r5,r8
+	bne	L(st1)
+	cmplw	r6,r9
+	bne	L(st1)
+	bdnz	L(big_loop)
+	b	L(byte_loop)
+
+L(end_check):
+	subfic	r12,r12,4
+	blt	L(end_check2)
+	rlwinm	r12,r12,3,0,31
+	srw	r5,r5,r12
+	srw	r8,r8,r12
+	cmplw	r5,r8
+	bne	L(st1)
+	b	L(end_strcmp)
+
+L(end_check2):
+	addi	r12,r12,4
+	cmplw	r5,r8
+	rlwinm	r12,r12,3,0,31
+	bne	L(st1)
+	srw	r6,r6,r12
+	srw	r9,r9,r12
+	cmplw	r6,r9
+	bne	L(st1)
+
+L(end_strcmp):
+	addi	r3,r0,0
+	blr
+
+L(st1):
+	mfcr	r3
+	blr
+
+L(byte_loop):
+	lbz	r5,0(r3)
+	addi	r3,r3,1
+	lbz	r6,0(r4)
+	addi	r4,r4,1
+	cmplw	r5,r6
+	bne	L(st1)
+	cmpwi	r5,0
+	beq	L(end_strcmp)
+	b	L(byte_loop)
+END (BP_SYM (strcmp))
+libc_hidden_builtin_def (strcmp)
diff --git a/sysdeps/powerpc/powerpc32/405/strcpy.S b/sysdeps/powerpc/powerpc32/405/strcpy.S
new file mode 100644
index 0000000..f289118
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strcpy.S
@@ -0,0 +1,111 @@
+/* Optimized strcpy implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strcpy
+
+	Register Use
+	r3:destination and return address
+	r4:source address
+	r10:temp destination address
+
+	Implementation description
+	Loop by checking 2 words at a time, with dlmzb. Check if there is a null
+	in the 2 words. If there is a null jump to end checking to determine
+	where in the last 8 bytes it is. Copy the appropriate bytes of the last
+	8 according to the null position. */
+
+EALIGN (BP_SYM (strcpy), 5, 0)
+	neg	r7,r4
+	subi	r4,r4,1
+	clrlwi.	r8,r7,29
+	subi	r10,r3,1
+	beq	L(pre_word8_loop)
+	mtctr	r8
+
+L(loop):
+	lbzu	r5,0x01(r4)
+	cmpi	cr5,r5,0x0
+	stbu	r5,0x01(r10)
+	beq	cr5,L(end_strcpy)
+	bdnz	L(loop)
+
+L(pre_word8_loop):
+	subi	r4,r4,3
+	subi	r10,r10,3
+
+L(word8_loop):
+	lwzu	r5,0x04(r4)
+	lwzu	r6,0x04(r4)
+	dlmzb.	r11,r5,r6
+	bne	L(byte_copy)
+	stwu	r5,0x04(r10)
+	stwu	r6,0x04(r10)
+	lwzu	r5,0x04(r4)
+	lwzu	r6,0x04(r4)
+	dlmzb.	r11,r5,r6
+	bne	L(byte_copy)
+	stwu	r5,0x04(r10)
+	stwu	r6,0x04(r10)
+	lwzu	r5,0x04(r4)
+	lwzu	r6,0x04(r4)
+	dlmzb.	r11,r5,r6
+	bne	L(byte_copy)
+	stwu	r5,0x04(r10)
+	stwu	r6,0x04(r10)
+	lwzu	r5,0x04(r4)
+	lwzu	r6,0x04(r4)
+	dlmzb.	r11,r5,r6
+	bne	L(byte_copy)
+	stwu	r5,0x04(r10)
+	stwu	r6,0x04(r10)
+	b	L(word8_loop)
+
+L(last_bytes_copy):
+	stwu	r5,0x04(r10)
+	subi	r11,r11,4
+	mtctr	r11
+	addi	r10,r10,3
+	subi	r4,r4,1
+
+L(last_bytes_copy_loop):
+	lbzu	r5,0x01(r4)
+	stbu	r5,0x01(r10)
+	bdnz	L(last_bytes_copy_loop)
+	blr
+
+L(byte_copy):
+	blt	L(last_bytes_copy)
+	mtctr	r11
+	addi	r10,r10,3
+	subi	r4,r4,5
+
+L(last_bytes_copy_loop2):
+	lbzu	r5,0x01(r4)
+	stbu	r5,0x01(r10)
+	bdnz	L(last_bytes_copy_loop2)
+
+L(end_strcpy):
+	blr
+END (BP_SYM (strcpy))
+libc_hidden_builtin_def (strcpy)
diff --git a/sysdeps/powerpc/powerpc32/405/strlen.S b/sysdeps/powerpc/powerpc32/405/strlen.S
new file mode 100644
index 0000000..5da0e0b
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strlen.S
@@ -0,0 +1,79 @@
+/* Optimized strlen implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strlen
+
+	Register Use
+	r3:source address and return length of string
+	r4:byte counter
+
+	Implementation description
+	Load 2 words at a time and count bytes, if we find null we subtract one from
+	the count and return the count value. We need to subtract one because
+	we don't count the null character as a byte. */
+
+EALIGN (BP_SYM (strlen),5,0)
+	neg	r7,r3
+	clrlwi.	r8,r7,29
+	addi	r4,0,0
+	beq	L(byte_count_loop)
+	mtctr	r8
+
+L(loop):
+	lbz	r5,0(r3)
+	cmpi	cr5,r5,0x0
+	addi	r3,r3,0x1
+	addi	r4,r4,0x1
+	beq	cr5,L(end_strlen)
+	bdnz	L(loop)
+
+L(byte_count_loop):
+	lwz	r5,0(r3)
+	lwz	r6,4(r3)
+	dlmzb.	r12,r5,r6
+	add	r4,r4,r12
+	bne	L(end_strlen)
+	lwz	r5,8(r3)
+	lwz	r6,12(r3)
+	dlmzb.	r12,r5,r6
+	add	r4,r4,r12
+	bne	L(end_strlen)
+	lwz	r5,16(r3)
+	lwz	r6,20(r3)
+	dlmzb.	r12,r5,r6
+	add	r4,r4,r12
+	bne	L(end_strlen)
+	lwz	r5,24(r3)
+	lwz	r6,28(r3)
+	addi	r3,r3,0x20
+	dlmzb.	r12,r5,r6
+	add	r4,r4,r12
+	bne	L(end_strlen)
+	b	L(byte_count_loop)
+
+L(end_strlen):
+	addi	r3,r4,-1
+	blr
+END (BP_SYM (strlen))
+libc_hidden_builtin_def (strlen)
diff --git a/sysdeps/powerpc/powerpc32/405/strncmp.S b/sysdeps/powerpc/powerpc32/405/strncmp.S
new file mode 100644
index 0000000..658c1b1
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strncmp.S
@@ -0,0 +1,132 @@
+/* Optimized strncmp implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strncmp
+
+	Register Use
+	r0:temp return equality
+	r3:source1 address, return equality
+	r4:source2 address
+	r5:byte count
+
+	Implementation description
+	Touch in 3 lines of D-cache.
+	If source1 or source2 is unaligned copy 0-3 bytes to make source1 aligned
+	Check 2 words from src1 and src2. If unequal jump to end and
+	return src1 > src2 or src1 < src2.
+	If null check bytes before null and then jump to end and
+	return src1 > src2, src1 < src2 or src1 = src2.
+	If count = zero check bytes before zero counter and then jump to end and
+	return src1 > src2, src1 < src2 or src1 = src2.
+	If src1 = src2 and no null, repeat. */
+
+EALIGN (BP_SYM(strncmp),5,0)
+	neg	r7,r3
+	clrlwi	r7,r7,20
+	neg	r8,r4
+	clrlwi	r8,r8,20
+	srwi.	r7,r7,3
+	beq	L(prebyte_count_loop)
+	srwi.	r8,r8,3
+	beq	L(prebyte_count_loop)
+	cmplw	r7,r8
+	mtctr	r7
+	ble	L(preword2_count_loop)
+	mtctr	r8
+
+L(preword2_count_loop):
+	srwi.	r6,r5,3
+	beq	L(prebyte_count_loop)
+	mfctr	r7
+	cmplw	r6,r7
+	bgt	L(set_count_loop)
+	mtctr	r6
+	clrlwi	r5,r5,29
+
+L(word2_count_loop):
+	lwz	r10,0(r3)
+	lwz	r6,4(r3)
+	addi	r3,r3,0x08
+	lwz	r8,0(r4)
+	lwz	r9,4(r4)
+	addi	r4,r4,0x08
+	dlmzb.	r12,r10,r6
+	bne	L(end_check)
+	cmplw	r10,r8
+	bne	L(st1)
+	cmplw	r6,r9
+	bne	L(st1)
+	bdnz	L(word2_count_loop)
+
+L(prebyte_count_loop):
+	addi	r5,r5,1
+	mtctr	r5
+	bdz	L(end_strncmp)
+
+L(byte_count_loop):
+	lbz	r6,0(r3)
+	addi	r3,r3,1
+	lbz	r7,0(r4)
+	addi	r4,r4,1
+	cmplw	r6,r7
+	bne	L(st1)
+	cmpwi	r6,0
+	beq	L(end_strncmp)
+	bdnz	L(byte_count_loop)
+	b	L(end_strncmp)
+
+L(set_count_loop):
+	slwi	r7,r7,3
+	subf	r5,r7,r5
+	b	L(word2_count_loop)
+
+L(end_check):
+	subfic	r12,r12,4
+	blt	L(end_check2)
+	rlwinm	r12,r12,3,0,31
+	srw	r10,r10,r12
+	srw	r8,r8,r12
+	cmplw	r10,r8
+	bne	L(st1)
+	b	L(end_strncmp)
+
+L(end_check2):
+	addi	r12,r12,4
+	cmplw	r10,r8
+	rlwinm	r12,r12,3,0,31
+	bne	L(st1)
+	srw	r6,r6,r12
+	srw	r9,r9,r12
+	cmplw	r6,r9
+	bne	L(st1)
+
+L(end_strncmp):
+	addi	r3,r0,0
+	blr
+
+L(st1):
+	mfcr	r3
+	blr
+END (BP_SYM (strncmp))
+libc_hidden_builtin_def (strncmp)
diff --git a/sysdeps/powerpc/powerpc32/440/Implies b/sysdeps/powerpc/powerpc32/440/Implies
new file mode 100644
index 0000000..70c0d2e
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/440/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/405/fpu
+powerpc/powerpc32/405
diff --git a/sysdeps/powerpc/powerpc32/464/Implies b/sysdeps/powerpc/powerpc32/464/Implies
new file mode 100644
index 0000000..c3e52c5
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/464/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/440/fpu
+powerpc/powerpc32/440
diff --git a/sysdeps/powerpc/powerpc32/476/Implies b/sysdeps/powerpc/powerpc32/476/Implies
new file mode 100644
index 0000000..2829f9c
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/476/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/464/fpu
+powerpc/powerpc32/464
diff --git a/sysdeps/powerpc/powerpc32/Makefile b/sysdeps/powerpc/powerpc32/Makefile
new file mode 100644
index 0000000..3d235de
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/Makefile
@@ -0,0 +1,8 @@
+# Some Powerpc32 variants assume soft-fp is the default even though there is
+# an fp variant so provide -mhard-float if --with-fp is explicitly passed.
+
+ifeq ($(with-fp),yes)
++cflags += -mhard-float
+ASFLAGS += -mhard-float
+sysdep-LDFLAGS += -mhard-float
+endif
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
new file mode 100644
index 0000000..70c0d2e
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/405/fpu
+powerpc/powerpc32/405
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
new file mode 100644
index 0000000..c3e52c5
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/440/fpu
+powerpc/powerpc32/440
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
new file mode 100644
index 0000000..2829f9c
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/464/fpu
+powerpc/powerpc32/464
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
new file mode 100644
index 0000000..80f9170
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/476/fpu
+powerpc/powerpc32/476



* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2010-09-02 17:34 [PATCH] powerpc: 405/440/464/476 support and optimizations Luis Machado
@ 2010-09-03 14:45 ` Ryan Arnold
  2010-09-03 15:00   ` Luis Machado
  0 siblings, 1 reply; 19+ messages in thread
From: Ryan Arnold @ 2010-09-03 14:45 UTC (permalink / raw)
  To: luisgpm; +Cc: libc-ports, rsa, Todd Iglehart, Josh Boyer

On Thu, Sep 2, 2010 at 12:34 PM, Luis Machado
<luisgpm@linux.vnet.ibm.com> wrote:
> Hi,
>
> This patch adds powerpc 405/440/464/476 platforms to ports and adds 3
> memory (memcpy,memcmp,memset) optimizations and 4 string function
> (strcmp,strncmp,strcpy,strlen) optimizations (provided by Todd, copied),
> placed under 405, so all those platforms can use those optimized
> functions.
>
> The patch also adds the required Makefile, sysdeps structure and Implies
> files.
>
> Is this OK?
>
> Regards,
> Luis
>
> 2010-09-02  Todd Iglehart  <iglehart@us.ibm.com>
>            Ryan Arnold  <rsa@us.ibm.com>
>            Luis Machado  <luisgpm@br.ibm.com>
>

Hi Luis,

Todd doesn't have FSF copyright assignment so he shouldn't be on the ChangeLog.

> diff --git a/sysdeps/powerpc/powerpc32/405/memcmp.S b/sysdeps/powerpc/powerpc32/405/memcmp.S
> new file mode 100644
> index 0000000..c0314e6
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/memcmp.S
> @@ -0,0 +1,132 @@
> +/* Optimized memcmp implementation for PowerPC476.
> +   Copyright (C) 2010 Free Software Foundation, Inc.
> +   Contributed by Todd Iglehart <iglehart@us.ibm.com>.
> +   This file is part of the GNU C Library.

Since Todd doesn't have copyright assignment these changes are
contributed to the FSF by IBM without author/contributor attribution.

You can simply attribute the changes to him in the email leaving his
name out of the sources per FSF policy and submit them on IBM's
behalf.

Ryan


* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2010-09-03 14:45 ` Ryan Arnold
@ 2010-09-03 15:00   ` Luis Machado
  2010-10-04 18:54     ` Luis Machado
  0 siblings, 1 reply; 19+ messages in thread
From: Luis Machado @ 2010-09-03 15:00 UTC (permalink / raw)
  To: Ryan Arnold; +Cc: libc-ports, rsa, Todd Iglehart, Josh Boyer

> Since Todd doesn't have copyright assignment these changes are
> contributed to the FSF by IBM without author/contributor attribution.
> 
> You can simply attribute the changes to him in the email leaving his
> name out of the sources per FSF policy and submit them on IBM's
> behalf.
> 
> Ryan

Thanks.

The updated patch follows, without Todd's name in the sources.

Luis


2010-09-03  Luis Machado  <luisgpm@br.ibm.com>

	* sysdeps/powerpc/dl-procinfo.c: New file.
	* sysdeps/powerpc/dl-procinfo.h: New file.
	* sysdeps/powerpc/powerpc32/405/memcmp.S: New file.
	* sysdeps/powerpc/powerpc32/405/memcpy.S: New file.
	* sysdeps/powerpc/powerpc32/405/memset.S: New file.
	* sysdeps/powerpc/powerpc32/405/strcmp.S: New file.
	* sysdeps/powerpc/powerpc32/405/strcpy.S: New file.
	* sysdeps/powerpc/powerpc32/405/strlen.S: New file.
	* sysdeps/powerpc/powerpc32/405/strncmp.S: New file.
	* sysdeps/powerpc/powerpc32/440/Implies: New file.
	* sysdeps/powerpc/powerpc32/464/Implies: New file.
	* sysdeps/powerpc/powerpc32/476/Implies: New file.
	* sysdeps/powerpc/powerpc32/Makefile: New file.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies: New file.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies: New file.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies: New file.
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies: New file.

diff --git a/sysdeps/powerpc/dl-procinfo.c b/sysdeps/powerpc/dl-procinfo.c
new file mode 100644
index 0000000..60fb465
--- /dev/null
+++ b/sysdeps/powerpc/dl-procinfo.c
@@ -0,0 +1,96 @@
+/* Data for processor capability information.  PowerPC version.
+   Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+/* This information must be kept in sync with the _DL_HWCAP_COUNT and
+   _DL_PLATFORM_COUNT definitions in procinfo.h.
+
+   If anything should be added here check whether the size of each string
+   is still ok with the given array size.
+
+   All the #ifdefs in the definitions are quite irritating but
+   necessary if we want to avoid duplicating the information.  There
+   are three different modes:
+
+   - PROCINFO_DECL is defined.  This means we are only interested in
+     declarations.
+
+   - PROCINFO_DECL is not defined:
+
+     + if SHARED is defined the file is included in an array
+       initializer.  The .element = { ... } syntax is needed.
+
+     + if SHARED is not defined a normal array initialization is
+       needed.
+  */
+
+#ifndef PROCINFO_CLASS
+# define PROCINFO_CLASS
+#endif
+
+#if !defined PROCINFO_DECL && defined SHARED
+  ._dl_powerpc_cap_flags
+#else
+PROCINFO_CLASS const char _dl_powerpc_cap_flags[25][10]
+#endif
+#ifndef PROCINFO_DECL
+= {
+    "vsx",
+    "arch_2_06", "power6x", "dfp", "pa6t",
+    "arch_2_05", "ic_snoop", "smt", "booke",
+    "cellbe", "power5+", "power5", "power4",
+    "notb", "efpdouble", "efpsingle", "spe",
+    "ucache", "4xxmac", "mmu", "fpu",
+    "altivec", "ppc601", "ppc64", "ppc32",
+  }
+#endif
+#if !defined SHARED || defined PROCINFO_DECL
+;
+#else
+,
+#endif
+
+#if !defined PROCINFO_DECL && defined SHARED
+  ._dl_powerpc_platforms
+#else
+PROCINFO_CLASS const char _dl_powerpc_platforms[12][12]
+#endif
+#ifndef PROCINFO_DECL
+= {
+    [PPC_PLATFORM_POWER4] = "power4",
+    [PPC_PLATFORM_PPC970] = "ppc970",
+    [PPC_PLATFORM_POWER5] = "power5",
+    [PPC_PLATFORM_POWER5_PLUS] = "power5+",
+    [PPC_PLATFORM_POWER6] = "power6",
+    [PPC_PLATFORM_CELL_BE] = "ppc-cell-be",
+    [PPC_PLATFORM_POWER6X] = "power6x",
+    [PPC_PLATFORM_POWER7] = "power7",
+    [PPC_PLATFORM_PPC405] = "ppc405",
+    [PPC_PLATFORM_PPC440] = "ppc440",
+    [PPC_PLATFORM_PPC464] = "ppc464",
+    [PPC_PLATFORM_PPC476] = "ppc476"
+  }
+#endif
+#if !defined SHARED || defined PROCINFO_DECL
+;
+#else
+,
+#endif
+
+#undef PROCINFO_DECL
+#undef PROCINFO_CLASS
diff --git a/sysdeps/powerpc/dl-procinfo.h b/sysdeps/powerpc/dl-procinfo.h
new file mode 100644
index 0000000..87279de
--- /dev/null
+++ b/sysdeps/powerpc/dl-procinfo.h
@@ -0,0 +1,168 @@
+/* Processor capability information handling macros.  PowerPC version.
+   Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _DL_PROCINFO_H
+#define _DL_PROCINFO_H	1
+
+#include <ldsodefs.h>
+#include <sysdep.h>		/* This defines the PPC_FEATURE_* macros.  */
+
+/* There are 25 bits used, but they are bits 7..31.  */
+#define _DL_HWCAP_FIRST		7
+#define _DL_HWCAP_COUNT		32
+
+/* These bits influence library search.  */
+#define HWCAP_IMPORTANT		(PPC_FEATURE_HAS_ALTIVEC \
+				+ PPC_FEATURE_HAS_DFP)
+
+#define _DL_PLATFORMS_COUNT	12
+
+#define _DL_FIRST_PLATFORM	32
+/* Mask to filter out platforms.  */
+#define _DL_HWCAP_PLATFORM      (((1ULL << _DL_PLATFORMS_COUNT) - 1) \
+				<< _DL_FIRST_PLATFORM)
+
+/* Platform bits (relative to _DL_FIRST_PLATFORM).  */
+#define PPC_PLATFORM_POWER4	      0
+#define PPC_PLATFORM_PPC970	      1
+#define PPC_PLATFORM_POWER5	      2
+#define PPC_PLATFORM_POWER5_PLUS      3
+#define PPC_PLATFORM_POWER6	      4
+#define PPC_PLATFORM_CELL_BE	      5
+#define PPC_PLATFORM_POWER6X	      6
+#define PPC_PLATFORM_POWER7	      7
+#define PPC_PLATFORM_PPC405	      8
+#define PPC_PLATFORM_PPC440	      9
+#define PPC_PLATFORM_PPC464	      10
+#define PPC_PLATFORM_PPC476	      11
+
+static inline const char *
+__attribute__ ((unused))
+_dl_hwcap_string (int idx)
+{
+  return GLRO(dl_powerpc_cap_flags)[idx - _DL_HWCAP_FIRST];
+}
+
+static inline const char *
+__attribute__ ((unused))
+_dl_platform_string (int idx)
+{
+  return GLRO(dl_powerpc_platforms)[idx - _DL_FIRST_PLATFORM];
+}
+
+static inline int
+__attribute__ ((unused))
+_dl_string_hwcap (const char *str)
+{
+  for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
+    if (strcmp (str, _dl_hwcap_string (i)) == 0)
+      return i;
+  return -1;
+}
+
+static inline int
+__attribute__ ((unused, always_inline))
+_dl_string_platform (const char *str)
+{
+  if (str == NULL)
+    return -1;
+
+  if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_POWER4], 5) == 0)
+    {
+      int ret;
+      str += 5;
+      switch (*str)
+	{
+	case '4':
+	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER4;
+	  break;
+	case '5':
+	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5;
+	  if (str[1] == '+')
+	    {
+	      ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5_PLUS;
+	      ++str;
+	    }
+	  break;
+	case '6':
+	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6;
+	  if (str[1] == 'x')
+	    {
+	      ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6X;
+	      ++str;
+	    }
+	  break;
+	case '7':
+	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER7;
+	  break;
+	default:
+	  return -1;
+	}
+      if (str[1] == '\0')
+	return ret;
+    }
+  else if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970],
+		    3) == 0)
+    {
+      if (strcmp (str + 3, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970]
+			   + 3) == 0)
+	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC970;
+      else if (strcmp (str + 3,
+		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_CELL_BE] + 3)
+	       == 0)
+	return _DL_FIRST_PLATFORM + PPC_PLATFORM_CELL_BE;
+      else if (strcmp (str + 3,
+		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC405] + 3)
+	       == 0)
+	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC405;
+      else if (strcmp (str + 3,
+		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC440] + 3)
+	       == 0)
+	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC440;
+      else if (strcmp (str + 3,
+		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC464] + 3)
+	       == 0)
+	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC464;
+      else if (strcmp (str + 3,
+		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC476] + 3)
+	       == 0)
+	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC476;
+    }
+
+  return -1;
+}
+
+#ifdef IS_IN_rtld
+static inline int
+__attribute__ ((unused))
+_dl_procinfo (int word)
+{
+  _dl_printf ("AT_HWCAP:       ");
+
+  for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
+    if (word & (1 << i))
+      _dl_printf (" %s", _dl_hwcap_string (i));
+
+  _dl_printf ("\n");
+
+  return 0;
+}
+#endif
+
+#endif /* dl-procinfo.h */
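
Note for reviewers, not part of the patch: the platform numbering above can be checked with a small standalone C model. This is only a sketch. The constants and the string table are copied from the two dl-procinfo files in this patch, while string_platform, platform_mask and procinfo_model_check are hypothetical names, and the lookup is a plain linear search rather than the prefix-based parsing used by _dl_string_platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the patch's dl-procinfo.h.  */
#define DL_PLATFORMS_COUNT 12
#define DL_FIRST_PLATFORM  32

/* Same contents and order as _dl_powerpc_platforms in dl-procinfo.c.  */
static const char platforms[DL_PLATFORMS_COUNT][12] = {
  "power4", "ppc970", "power5", "power5+", "power6", "ppc-cell-be",
  "power6x", "power7", "ppc405", "ppc440", "ppc464", "ppc476"
};

/* Hypothetical stand-in for _dl_string_platform: linear search over the
   table, returning the platform bit number or -1 if unknown.  */
int
string_platform (const char *str)
{
  if (str == NULL)
    return -1;
  for (int i = 0; i < DL_PLATFORMS_COUNT; ++i)
    if (strcmp (str, platforms[i]) == 0)
      return DL_FIRST_PLATFORM + i;
  return -1;
}

/* Same expression as the _DL_HWCAP_PLATFORM macro.  */
uint64_t
platform_mask (void)
{
  return ((1ULL << DL_PLATFORMS_COUNT) - 1) << DL_FIRST_PLATFORM;
}

/* Usage example: the four new 4xx entries occupy bits 40..43.  */
void
procinfo_model_check (void)
{
  assert (string_platform ("ppc405") == 40);
  assert (string_platform ("ppc476") == 43);
  assert (platform_mask () == 0x00000FFF00000000ULL);
}
```

With 12 platforms starting at bit 32, the mask covers bits 32..43, so the 64-bit hwcap word still has room for further platform entries.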
diff --git a/sysdeps/powerpc/powerpc32/405/memcmp.S b/sysdeps/powerpc/powerpc32/405/memcmp.S
new file mode 100644
index 0000000..653d3b5
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/memcmp.S
@@ -0,0 +1,131 @@
+/* Optimized memcmp implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* memcmp
+
+	r3:source1 address, return equality
+	r4:source2 address
+	r5:byte count
+
+	Check 2 words from src1 and src2. If unequal jump to end and
+	return src1 > src2 or src1 < src2.
+	If count = zero check bytes before zero counter and then jump to end and
+	return src1 > src2, src1 < src2 or src1 = src2.
+	If src1 = src2, repeat. */
+
+EALIGN (BP_SYM (memcmp), 5, 0)
+	srwi.	r6,r5,5
+	beq	L(preword2_count_loop)
+	mtctr	r6
+	clrlwi	r5,r5,27
+
+L(word8_compare_loop):
+	lwz	r10,0(r3)
+	lwz	r6,4(r3)
+	lwz	r8,0(r4)
+	lwz	r9,4(r4)
+	cmplw	cr5,r8,r10
+	cmplw	cr1,r9,r6
+	bne	cr5,L(st2)
+	bne	cr1,L(st1)
+	lwz	r10,8(r3)
+	lwz	r6,12(r3)
+	lwz	r8,8(r4)
+	lwz	r9,12(r4)
+	cmplw	cr5,r8,r10
+	cmplw	cr1,r9,r6
+	bne	cr5,L(st2)
+	bne	cr1,L(st1)
+	lwz	r10,16(r3)
+	lwz	r6,20(r3)
+	lwz	r8,16(r4)
+	lwz	r9,20(r4)
+	cmplw	cr5,r8,r10
+	cmplw	cr1,r9,r6
+	bne	cr5,L(st2)
+	bne	cr1,L(st1)
+	lwz	r10,24(r3)
+	lwz	r6,28(r3)
+	addi	r3,r3,0x20
+	lwz	r8,24(r4)
+	lwz	r9,28(r4)
+	addi	r4,r4,0x20
+	cmplw	cr5,r8,r10
+	cmplw	cr1,r9,r6
+	bne	cr5,L(st2)
+	bne	cr1,L(st1)
+	bdnz	L(word8_compare_loop)
+
+L(preword2_count_loop):
+	srwi.	r6,r5,3
+	beq	L(prebyte_count_loop)
+	mtctr	r6
+	clrlwi  r5,r5,29
+
+L(word2_count_loop):
+	lwz	r10,0(r3)
+	lwz	r6,4(r3)
+	addi	r3,r3,0x08
+	lwz	r8,0(r4)
+	lwz	r9,4(r4)
+	addi	r4,r4,0x08
+	cmplw	cr5,r8,r10
+	cmplw	cr1,r9,r6
+	bne	cr5,L(st2)
+	bne	cr1,L(st1)
+	bdnz	L(word2_count_loop)
+
+L(prebyte_count_loop):
+	addi	r5,r5,1
+	mtctr	r5
+	bdz	L(end_memcmp)
+
+L(byte_count_loop):
+	lbz	r6,0(r3)
+	addi	r3,r3,0x01
+	lbz	r8,0(r4)
+	addi	r4,r4,0x01
+	cmplw	cr5,r8,r6
+	bne	cr5,L(st2)
+	bdnz	L(byte_count_loop)
+
+L(end_memcmp):
+	addi	r3,r0,0
+	blr
+
+L(l_r):
+	addi	r3,r0,1
+	blr
+
+L(st1):
+	blt	cr1,L(l_r)
+	addi	r3,r0,-1
+	blr
+
+L(st2):
+	blt	cr5,L(l_r)
+	addi	r3,r0,-1
+	blr
+END (BP_SYM (memcmp))
+libc_hidden_builtin_def (memcmp)
+weak_alias (memcmp,bcmp)
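
Note for reviewers, not part of the patch: a rough C model of the three stages in memcmp.S above (32-byte blocks, 8-byte pairs, then single bytes). It is a sketch under assumptions: load_be32, compare_words, memcmp_model and memcmp_model_check are hypothetical names, and words are assembled byte by byte in big-endian order so the model behaves the same on any host. It returns 1, -1 or 0, as the assembly does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a big-endian word.  On a big-endian target an unsigned word
   compare then orders the same way as a byte-by-byte compare.  */
static uint32_t
load_be32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Compare nwords words, advancing both pointers past matching words.  */
static int
compare_words (const unsigned char **a, const unsigned char **b,
               size_t nwords)
{
  while (nwords-- > 0)
    {
      uint32_t x = load_be32 (*a), y = load_be32 (*b);
      if (x != y)
        return x > y ? 1 : -1;
      *a += 4;
      *b += 4;
    }
  return 0;
}

int
memcmp_model (const void *s1, const void *s2, size_t n)
{
  const unsigned char *a = s1, *b = s2;
  int r;

  /* L(word8_compare_loop): n / 32 blocks of eight words.  */
  if ((r = compare_words (&a, &b, (n >> 5) * 8)) != 0)
    return r;
  n &= 31;
  /* L(word2_count_loop): n / 8 pairs of words.  */
  if ((r = compare_words (&a, &b, (n >> 3) * 2)) != 0)
    return r;
  n &= 7;
  /* L(byte_count_loop): the last 0-7 bytes.  */
  for (; n > 0; --n, ++a, ++b)
    if (*a != *b)
      return *a > *b ? 1 : -1;
  return 0;
}

/* Usage example.  */
void
memcmp_model_check (void)
{
  assert (memcmp_model ("abc", "abd", 3) == -1);
  assert (memcmp_model ("abd", "abc", 3) == 1);
  assert (memcmp_model ("abc", "abc", 3) == 0);
}
```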
diff --git a/sysdeps/powerpc/powerpc32/405/memcpy.S b/sysdeps/powerpc/powerpc32/405/memcpy.S
new file mode 100644
index 0000000..a654c73
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/memcpy.S
@@ -0,0 +1,133 @@
+/* Optimized memcpy implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* memcpy
+
+	r0:return address
+	r3:destination address
+	r4:source address
+	r5:byte count
+
+	Save return address in r0.
+	If destination and source are unaligned and copy count is greater than 256
+	then copy 0-3 bytes to make destination aligned.
+	If 32 or more bytes to copy we use 32 byte copy loop.
+	Finally we copy 0-31 extra bytes. */
+
+EALIGN (BP_SYM (memcpy), 5, 0)
+/* Check if bytes to copy are greater than 256 and if
+	source and destination are unaligned */
+	cmpwi	r5,0x0100
+	addi	r0,r3,0
+	ble	L(string_count_loop)
+	neg	r6,r3
+	clrlwi. r6,r6,30
+	beq	L(string_count_loop)
+	neg	r6,r4
+	clrlwi. r6,r6,30
+	beq	L(string_count_loop)
+	mtctr	r6
+	subf	r5,r6,r5
+
+L(unaligned_bytecopy_loop): /* Align destination by copying 0-3 bytes */
+	lbz	r8,0x0(r4)
+	addi	r4,r4,1
+	stb	r8,0x0(r3)
+	addi	r3,r3,1
+	bdnz	L(unaligned_bytecopy_loop)
+	srwi.	r7,r5,5
+	beq	L(preword2_count_loop)
+	mtctr	r7
+
+L(word8_count_loop_no_dcbt): /* Copy 32 bytes at a time */
+	lwz	r6,0(r4)
+	lwz	r7,4(r4)
+	lwz	r8,8(r4)
+	lwz	r9,12(r4)
+	subi	r5,r5,0x20
+	stw	r6,0(r3)
+	stw	r7,4(r3)
+	stw	r8,8(r3)
+	stw	r9,12(r3)
+	lwz	r6,16(r4)
+	lwz	r7,20(r4)
+	lwz	r8,24(r4)
+	lwz	r9,28(r4)
+	addi	r4,r4,0x20
+	stw	r6,16(r3)
+	stw	r7,20(r3)
+	stw	r8,24(r3)
+	stw	r9,28(r3)
+	addi	r3,r3,0x20
+	bdnz	L(word8_count_loop_no_dcbt)
+
+L(preword2_count_loop): /* Copy remaining 0-31 bytes */
+	clrlwi. r12,r5,27
+	beq	L(end_memcpy)
+	mtxer	r12
+	lswx	r5,0,r4
+	stswx	r5,0,r3
+	mr	 r3,r0
+	blr
+
+L(string_count_loop): /* Copy odd 0-15 bytes */
+	clrlwi. r12,r5,28
+	add	r3,r3,r5
+	add	r4,r4,r5
+	beq	L(pre_string_copy)
+	mtxer	r12
+	subf	r4,r12,r4
+	subf	r3,r12,r3
+	lswx	r6,0,r4
+	stswx	r6,0,r3
+
+L(pre_string_copy): /* Check how many 16 byte chunks to copy */
+	srwi.	r7,r5,4
+	beq	L(end_memcpy)
+	mtctr	r7
+
+L(word4_count_loop_no_dcbt): /* Copy 32 bytes at a time */
+	lwz	r6,-4(r4)
+	lwz	r7,-8(r4)
+	lwz	r8,-12(r4)
+	lwzu	r9,-16(r4)
+	stw	r6,-4(r3)
+	stw	r7,-8(r3)
+	stw	r8,-12(r3)
+	stwu	r9,-16(r3)
+	bdz	L(end_memcpy)
+	lwz	r6,-4(r4)
+	lwz	r7,-8(r4)
+	lwz	r8,-12(r4)
+	lwzu	r9,-16(r4)
+	stw	r6,-4(r3)
+	stw	r7,-8(r3)
+	stw	r8,-12(r3)
+	stwu	r9,-16(r3)
+	bdnz	L(word4_count_loop_no_dcbt)
+
+L(end_memcpy):
+	mr	 r3,r0
+	blr
+END (BP_SYM (memcpy))
+libc_hidden_builtin_def (memcpy)
diff --git a/sysdeps/powerpc/powerpc32/405/memset.S b/sysdeps/powerpc/powerpc32/405/memset.S
new file mode 100644
index 0000000..69d5d4c
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/memset.S
@@ -0,0 +1,155 @@
+/* Optimized memset implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* memset
+
+	r3:destination address and return address
+	r4:source integer to copy
+	r5:byte count
+	r11:source byte replicated into all 32 bits of reg
+	r12:temp return address
+
+	Save return address in r12
+	If destination is unaligned and count is greater than 255 bytes
+	set 0-3 bytes to make destination aligned
+	If count is greater than 255 bytes and setting zero to memory
+	use dcbz to set memory when we can
+	otherwise do the following
+	If 16 or more bytes to set we use 16 byte store loop.
+	Finally we set 0-15 extra bytes with string store. */
+
+EALIGN (BP_SYM (memset), 5, 0)
+	rlwinm	r11,r4,0,24,31
+	rlwimi	r11,r4,8,16,23
+	rlwimi	r11,r11,16,0,15
+	addi	r12,r3,0
+	cmpwi	r5,0x00FF
+	ble	L(preword8_count_loop)
+	cmpwi	r4,0x00
+	beq	L(use_dcbz)
+	neg	r6,r3
+	clrlwi.	r6,r6,30
+	beq	L(preword8_count_loop)
+	addi	r8,0,1
+	mtctr	r6
+	subi	r3,r3,1
+
+L(unaligned_bytecopy_loop):
+	stbu	r11,0x1(r3)
+	subf.	r5,r8,r5
+	beq	L(end_memset)
+	bdnz	L(unaligned_bytecopy_loop)
+	addi	r3,r3,1
+
+L(preword8_count_loop):
+	srwi.	r6,r5,4
+	beq	L(preword2_count_loop)
+	mtctr	r6
+	addi	r3,r3,-4
+	mr	r8,r11
+	mr	r9,r11
+	mr	r10,r11
+
+L(word8_count_loop_no_dcbt):
+	stwu	r8,4(r3)
+	stwu	r9,4(r3)
+	subi	r5,r5,0x10
+	stwu	r10,4(r3)
+	stwu	r11,4(r3)
+	bdnz	L(word8_count_loop_no_dcbt)
+	addi	r3,r3,4
+
+L(preword2_count_loop):
+	clrlwi.	r7,r5,28
+	beq	L(end_memset)
+	mr	r8,r11
+	mr	r9,r11
+	mr	r10,r11
+	mtxer	r7
+	stswx	r8,0,r3
+
+L(end_memset):
+	addi	r3,r12,0
+	blr
+
+L(use_dcbz):
+	neg	r6,r3
+	clrlwi.	r7,r6,28
+	beq	L(skip_string_loop)
+	mr	r8,r11
+	mr	r9,r11
+	mr	r10,r11
+	subf	r5,r7,r5
+	mtxer	r7
+	stswx	r8,0,r3
+	add	r3,r3,r7
+
+L(skip_string_loop):
+	clrlwi	r8,r6,25
+	srwi.	r8,r8,4
+	beq	L(dcbz_pre_loop)
+	mtctr	r8
+
+L(word_loop):
+	stw	r11,0(r3)
+	subi	r5,r5,0x10
+	stw	r11,4(r3)
+	stw	r11,8(r3)
+	stw	r11,12(r3)
+	addi	r3,r3,0x10
+	bdnz	L(word_loop)
+
+L(dcbz_pre_loop):
+	srwi	r6,r5,7
+	mtctr	r6
+	addi	r7,0,0
+
+L(dcbz_loop):
+	dcbz	r3,r7
+	addi	r3,r3,0x80
+	subi	r5,r5,0x80
+	bdnz	L(dcbz_loop)
+	srwi.	r6,r5,4
+	beq	L(postword2_count_loop)
+	mtctr	r6
+
+L(postword8_count_loop):
+	stw	r11,0(r3)
+	subi	r5,r5,0x10
+	stw	r11,4(r3)
+	stw	r11,8(r3)
+	stw	r11,12(r3)
+	addi	r3,r3,0x10
+	bdnz	L(postword8_count_loop)
+
+L(postword2_count_loop):
+	clrlwi.	r7,r5,28
+	beq	L(end_memset)
+	mr	r8,r11
+	mr	r9,r11
+	mr	r10,r11
+	mtxer	r7
+	stswx	r8,0,r3
+	b	L(end_memset)
+END (BP_SYM (memset))
+libc_hidden_builtin_def (memset)
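
Note for reviewers, not part of the patch: the first three instructions of memset.S above spread the fill byte across all four bytes of r11. A minimal C equivalent, assuming my reading of the rlwinm/rlwimi masks is right (replicate_byte and replicate_byte_check are hypothetical names):

```c
#include <assert.h>
#include <stdint.h>

/* Replicate the low byte of c into all four bytes of a 32-bit word.  */
uint32_t
replicate_byte (int c)
{
  uint32_t v = (uint32_t) c & 0xFFu;       /* rlwinm r11,r4,0,24,31 */
  v |= ((uint32_t) c << 8) & 0xFF00u;      /* rlwimi r11,r4,8,16,23 */
  v |= v << 16;                            /* rlwimi r11,r11,16,0,15 */
  return v;
}

/* Usage example.  */
void
replicate_byte_check (void)
{
  assert (replicate_byte (0xAB) == 0xABABABABu);
  assert (replicate_byte (0) == 0);
}
```

Only the low 8 bits of the argument matter, which matches the C memset contract of converting the value to unsigned char.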
diff --git a/sysdeps/powerpc/powerpc32/405/strcmp.S b/sysdeps/powerpc/powerpc32/405/strcmp.S
new file mode 100644
index 0000000..6eb5b5a
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strcmp.S
@@ -0,0 +1,137 @@
+/* Optimized strcmp implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strcmp
+
+	Register Use
+	r0:temp return equality
+	r3:source1 address, return equality
+	r4:source2 address
+
+	Implementation description
+	Check 2 words from src1 and src2. If unequal jump to end and
+	return src1 > src2 or src1 < src2.
+	If null check bytes before null and then jump to end and
+	return src1 > src2, src1 < src2 or src1 = src2.
+	If src1 = src2 and no null, repeat. */
+
+EALIGN (BP_SYM(strcmp),5,0)
+	neg	r7,r3
+	clrlwi	r7,r7,20
+	neg	r8,r4
+	clrlwi	r8,r8,20
+	srwi.	r7,r7,5
+	beq	L(byte_loop)
+	srwi.	r8,r8,5
+	beq	L(byte_loop)
+	cmplw	r7,r8
+	mtctr	r7
+	ble	L(big_loop)
+	mtctr	r8
+
+L(big_loop):
+	lwz	r5,0(r3)
+	lwz	r6,4(r3)
+	lwz	r8,0(r4)
+	lwz	r9,4(r4)
+	dlmzb.	r12,r5,r6
+	bne	L(end_check)
+	cmplw	r5,r8
+	bne	L(st1)
+	cmplw	r6,r9
+	bne	L(st1)
+	lwz	r5,8(r3)
+	lwz	r6,12(r3)
+	lwz	r8,8(r4)
+	lwz	r9,12(r4)
+	dlmzb.	r12,r5,r6
+	bne	L(end_check)
+	cmplw	r5,r8
+	bne	L(st1)
+	cmplw	r6,r9
+	bne	L(st1)
+	lwz	r5,16(r3)
+	lwz	r6,20(r3)
+	lwz	r8,16(r4)
+	lwz	r9,20(r4)
+	dlmzb.	r12,r5,r6
+	bne	L(end_check)
+	cmplw	r5,r8
+	bne	L(st1)
+	cmplw	r6,r9
+	bne	L(st1)
+	lwz	r5,24(r3)
+	lwz	r6,28(r3)
+	addi	r3,r3,0x20
+	lwz	r8,24(r4)
+	lwz	r9,28(r4)
+	addi	r4,r4,0x20
+	dlmzb.	r12,r5,r6
+	bne	L(end_check)
+	cmplw	r5,r8
+	bne	L(st1)
+	cmplw	r6,r9
+	bne	L(st1)
+	bdnz	L(big_loop)
+	b	L(byte_loop)
+
+L(end_check):
+	subfic	r12,r12,4
+	blt	L(end_check2)
+	rlwinm	r12,r12,3,0,31
+	srw	r5,r5,r12
+	srw	r8,r8,r12
+	cmplw	r5,r8
+	bne	L(st1)
+	b	L(end_strcmp)
+
+L(end_check2):
+	addi	r12,r12,4
+	cmplw	r5,r8
+	rlwinm	r12,r12,3,0,31
+	bne	L(st1)
+	srw	r6,r6,r12
+	srw	r9,r9,r12
+	cmplw	r6,r9
+	bne	L(st1)
+
+L(end_strcmp):
+	addi	r3,r0,0
+	blr
+
+L(st1):
+	mfcr	r3
+	blr
+
+L(byte_loop):
+	lbz	r5,0(r3)
+	addi	r3,r3,1
+	lbz	r6,0(r4)
+	addi	r4,r4,1
+	cmplw	r5,r6
+	bne	L(st1)
+	cmpwi	r5,0
+	beq	L(end_strcmp)
+	b	L(byte_loop)
+END (BP_SYM (strcmp))
+libc_hidden_builtin_def (strcmp)
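
Note for reviewers, not part of the patch: L(st1) above returns the whole condition register through "mfcr r3" instead of building 1 or -1. The sketch below models why the sign still comes out right, assuming the usual CR0 layout (LT, GT, EQ, SO from the most significant bit down) and that "cmplw" sets CR0 from an unsigned compare. cr_as_result and cr_as_result_check are hypothetical names, and other_fields stands for whatever the remaining CR fields happen to hold.

```c
#include <assert.h>
#include <stdint.h>

/* Build a CR image whose top nibble is CR0 after an unsigned compare of
   a with b, then reinterpret the word as the signed return value.  */
int32_t
cr_as_result (uint32_t a, uint32_t b, uint32_t other_fields)
{
  uint32_t cr0 = a < b ? 0x8u : a > b ? 0x4u : 0x2u;  /* LT, GT, EQ */
  uint32_t cr = (cr0 << 28) | (other_fields & 0x0FFFFFFFu);
  /* Conversion of values above INT32_MAX is implementation-defined
     before C23; common compilers wrap as two's complement.  */
  return (int32_t) cr;
}

/* Usage example: the sign does not depend on the other CR fields.  */
void
cr_as_result_check (void)
{
  assert (cr_as_result (1, 2, 0x0FFFFFFFu) < 0);
  assert (cr_as_result (2, 1, 0) > 0);
}
```

LT lands in the sign bit, so "less than" gives a negative value; GT leaves the sign bit clear and sets the next bit, so "greater than" gives a positive non-zero value. The magnitude carries no meaning, which the C standard permits for strcmp.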
diff --git a/sysdeps/powerpc/powerpc32/405/strcpy.S b/sysdeps/powerpc/powerpc32/405/strcpy.S
new file mode 100644
index 0000000..025ac16
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strcpy.S
@@ -0,0 +1,110 @@
+/* Optimized strcpy implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strcpy
+
+	Register Use
+	r3:destination and return address
+	r4:source address
+	r10:temp destination address
+
+	Implementation description
+	Loop by checking 2 words at a time, with dlmzb. Check if there is a null
+	in the 2 words. If there is a null jump to end checking to determine
+	where in the last 8 bytes it is. Copy the appropriate bytes of the last
+	8 according to the null position. */
+
+EALIGN (BP_SYM (strcpy), 5, 0)
+	neg	r7,r4
+	subi	r4,r4,1
+	clrlwi.	r8,r7,29
+	subi	r10,r3,1
+	beq	L(pre_word8_loop)
+	mtctr	r8
+
+L(loop):
+	lbzu	r5,0x01(r4)
+	cmpi	cr5,r5,0x0
+	stbu	r5,0x01(r10)
+	beq	cr5,L(end_strcpy)
+	bdnz	L(loop)
+
+L(pre_word8_loop):
+	subi	r4,r4,3
+	subi	r10,r10,3
+
+L(word8_loop):
+	lwzu	r5,0x04(r4)
+	lwzu	r6,0x04(r4)
+	dlmzb.	r11,r5,r6
+	bne	L(byte_copy)
+	stwu	r5,0x04(r10)
+	stwu	r6,0x04(r10)
+	lwzu	r5,0x04(r4)
+	lwzu	r6,0x04(r4)
+	dlmzb.	r11,r5,r6
+	bne	L(byte_copy)
+	stwu	r5,0x04(r10)
+	stwu	r6,0x04(r10)
+	lwzu	r5,0x04(r4)
+	lwzu	r6,0x04(r4)
+	dlmzb.	r11,r5,r6
+	bne	L(byte_copy)
+	stwu	r5,0x04(r10)
+	stwu	r6,0x04(r10)
+	lwzu	r5,0x04(r4)
+	lwzu	r6,0x04(r4)
+	dlmzb.	r11,r5,r6
+	bne	L(byte_copy)
+	stwu	r5,0x04(r10)
+	stwu	r6,0x04(r10)
+	b	L(word8_loop)
+
+L(last_bytes_copy):
+	stwu	r5,0x04(r10)
+	subi	r11,r11,4
+	mtctr	r11
+	addi	r10,r10,3
+	subi	r4,r4,1
+
+L(last_bytes_copy_loop):
+	lbzu	r5,0x01(r4)
+	stbu	r5,0x01(r10)
+	bdnz	L(last_bytes_copy_loop)
+	blr
+
+L(byte_copy):
+	blt	L(last_bytes_copy)
+	mtctr	r11
+	addi	r10,r10,3
+	subi	r4,r4,5
+
+L(last_bytes_copy_loop2):
+	lbzu	r5,0x01(r4)
+	stbu	r5,0x01(r10)
+	bdnz	L(last_bytes_copy_loop2)
+
+L(end_strcpy):
+	blr
+END (BP_SYM (strcpy))
+libc_hidden_builtin_def (strcpy)
diff --git a/sysdeps/powerpc/powerpc32/405/strlen.S b/sysdeps/powerpc/powerpc32/405/strlen.S
new file mode 100644
index 0000000..146b582
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strlen.S
@@ -0,0 +1,78 @@
+/* Optimized strlen implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strlen
+
+	Register Use
+	r3:source address and return length of string
+	r4:byte counter
+
+	Implementation description
+	Load 2 words at a time and count bytes. If we find a null, we subtract
+	one from the count and return the count value. We need to subtract one
+	because we don't count the null character as a byte. */
+
+EALIGN (BP_SYM (strlen),5,0)
+	neg	r7,r3
+	clrlwi.	r8,r7,29
+	addi	r4,0,0
+	beq	L(byte_count_loop)
+	mtctr	r8
+
+L(loop):
+	lbz	r5,0(r3)
+	cmpi	cr5,r5,0x0
+	addi	r3,r3,0x1
+	addi	r4,r4,0x1
+	beq	cr5,L(end_strlen)
+	bdnz	L(loop)
+
+L(byte_count_loop):
+	lwz	r5,0(r3)
+	lwz	r6,4(r3)
+	dlmzb.	r12,r5,r6
+	add	r4,r4,r12
+	bne	L(end_strlen)
+	lwz	r5,8(r3)
+	lwz	r6,12(r3)
+	dlmzb.	r12,r5,r6
+	add	r4,r4,r12
+	bne	L(end_strlen)
+	lwz	r5,16(r3)
+	lwz	r6,20(r3)
+	dlmzb.	r12,r5,r6
+	add	r4,r4,r12
+	bne	L(end_strlen)
+	lwz	r5,24(r3)
+	lwz	r6,28(r3)
+	addi	r3,r3,0x20
+	dlmzb.	r12,r5,r6
+	add	r4,r4,r12
+	bne	L(end_strlen)
+	b	L(byte_count_loop)
+
+L(end_strlen):
+	addi	r3,r4,-1
+	blr
+END (BP_SYM (strlen))
+libc_hidden_builtin_def (strlen)
diff --git a/sysdeps/powerpc/powerpc32/405/strncmp.S b/sysdeps/powerpc/powerpc32/405/strncmp.S
new file mode 100644
index 0000000..c1beb23
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strncmp.S
@@ -0,0 +1,131 @@
+/* Optimized strncmp implementation for PowerPC476.
+   Copyright (C) 2010 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+   02110-1301 USA.  */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strncmp
+
+	Register Use
+	r0:temp return equality
+	r3:source1 address, return equality
+	r4:source2 address
+	r5:byte count
+
+	Implementation description
+	Touch in 3 lines of D-cache.
+	If source1 or source2 is unaligned copy 0-3 bytes to make source1 aligned
+	Check 2 words from src1 and src2. If unequal jump to end and
+	return src1 > src2 or src1 < src2.
+	If null check bytes before null and then jump to end and
+	return src1 > src2, src1 < src2 or src1 = src2.
+	If count = zero check bytes before zero counter and then jump to end and
+	return src1 > src2, src1 < src2 or src1 = src2.
+	If src1 = src2 and no null, repeat. */
+
+EALIGN (BP_SYM(strncmp),5,0)
+	neg	r7,r3
+	clrlwi	r7,r7,20
+	neg	r8,r4
+	clrlwi	r8,r8,20
+	srwi.	r7,r7,3
+	beq	L(prebyte_count_loop)
+	srwi.	r8,r8,3
+	beq	L(prebyte_count_loop)
+	cmplw	r7,r8
+	mtctr	r7
+	ble	L(preword2_count_loop)
+	mtctr	r8
+
+L(preword2_count_loop):
+	srwi.	r6,r5,3
+	beq	L(prebyte_count_loop)
+	mfctr	r7
+	cmplw	r6,r7
+	bgt	L(set_count_loop)
+	mtctr	r6
+	clrlwi	r5,r5,29
+
+L(word2_count_loop):
+	lwz	r10,0(r3)
+	lwz	r6,4(r3)
+	addi	r3,r3,0x08
+	lwz	r8,0(r4)
+	lwz	r9,4(r4)
+	addi	r4,r4,0x08
+	dlmzb.	r12,r10,r6
+	bne	L(end_check)
+	cmplw	r10,r8
+	bne	L(st1)
+	cmplw	r6,r9
+	bne	L(st1)
+	bdnz	L(word2_count_loop)
+
+L(prebyte_count_loop):
+	addi	r5,r5,1
+	mtctr	r5
+	bdz	L(end_strncmp)
+
+L(byte_count_loop):
+	lbz	r6,0(r3)
+	addi	r3,r3,1
+	lbz	r7,0(r4)
+	addi	r4,r4,1
+	cmplw	r6,r7
+	bne	L(st1)
+	cmpwi	r6,0
+	beq	L(end_strncmp)
+	bdnz	L(byte_count_loop)
+	b	L(end_strncmp)
+
+L(set_count_loop):
+	slwi	r7,r7,3
+	subf	r5,r7,r5
+	b	L(word2_count_loop)
+
+L(end_check):
+	subfic	r12,r12,4
+	blt	L(end_check2)
+	rlwinm	r12,r12,3,0,31
+	srw	r10,r10,r12
+	srw	r8,r8,r12
+	cmplw	r10,r8
+	bne	L(st1)
+	b	L(end_strncmp)
+
+L(end_check2):
+	addi	r12,r12,4
+	cmplw	r10,r8
+	rlwinm	r12,r12,3,0,31
+	bne	L(st1)
+	srw	r6,r6,r12
+	srw	r9,r9,r12
+	cmplw	r6,r9
+	bne	L(st1)
+
+L(end_strncmp):
+	addi	r3,r0,0
+	blr
+
+L(st1):
+	mfcr	r3
+	blr
+END (BP_SYM (strncmp))
+libc_hidden_builtin_def (strncmp)
diff --git a/sysdeps/powerpc/powerpc32/440/Implies b/sysdeps/powerpc/powerpc32/440/Implies
new file mode 100644
index 0000000..70c0d2e
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/440/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/405/fpu
+powerpc/powerpc32/405
diff --git a/sysdeps/powerpc/powerpc32/464/Implies b/sysdeps/powerpc/powerpc32/464/Implies
new file mode 100644
index 0000000..c3e52c5
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/464/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/440/fpu
+powerpc/powerpc32/440
diff --git a/sysdeps/powerpc/powerpc32/476/Implies b/sysdeps/powerpc/powerpc32/476/Implies
new file mode 100644
index 0000000..2829f9c
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/476/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/464/fpu
+powerpc/powerpc32/464
diff --git a/sysdeps/powerpc/powerpc32/Makefile b/sysdeps/powerpc/powerpc32/Makefile
new file mode 100644
index 0000000..3d235de
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/Makefile
@@ -0,0 +1,8 @@
+# Some Powerpc32 variants assume soft-fp is the default even though there is
+# an fp variant so provide -mhard-float if --with-fp is explicitly passed.
+
+ifeq ($(with-fp),yes)
++cflags += -mhard-float
+ASFLAGS += -mhard-float
+sysdep-LDFLAGS += -mhard-float
+endif
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
new file mode 100644
index 0000000..70c0d2e
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/405/fpu
+powerpc/powerpc32/405
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
new file mode 100644
index 0000000..c3e52c5
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/440/fpu
+powerpc/powerpc32/440
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
new file mode 100644
index 0000000..2829f9c
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/464/fpu
+powerpc/powerpc32/464
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
new file mode 100644
index 0000000..80f9170
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/476/fpu
+powerpc/powerpc32/476
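
For readers without a 4xx manual at hand, the strlen.S logic above can be
modelled in C. This is only a sketch, not part of the patch: model_dlmzb,
load_be32 and model_strlen are hypothetical names, and the dlmzb behaviour
is assumed to be "return the number of bytes up to and including the first
zero byte in the two words, or 8 if there is none". The caller is assumed
to pass a buffer padded to a multiple of 8 bytes, since the word loop reads
whole aligned doublewords just as the assembly does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of dlmzb: scan the 8 bytes held in two big-endian
   words.  Returns the count of bytes up to and including the first zero
   byte, or 8 if no zero byte is present.  *found mirrors the "ne" test
   the assembly makes on CR0 after dlmzb.  */
static unsigned model_dlmzb(uint32_t first, uint32_t second, int *found)
{
    for (unsigned i = 0; i < 8; i++) {
        uint32_t word = (i < 4) ? first : second;
        unsigned shift = 24 - 8 * (i % 4);
        if (((word >> shift) & 0xffu) == 0) {
            *found = 1;
            return i + 1;
        }
    }
    *found = 0;
    return 8;
}

/* Load 4 bytes as a big-endian word, as lwz would on these cores.  */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Model of the strlen flow: count bytes one at a time until the pointer
   is 8-byte aligned, then consume 8 bytes per step.  The running count
   includes the null, so one is subtracted on exit.  */
static size_t model_strlen(const char *s)
{
    assert(s != NULL);
    const unsigned char *p = (const unsigned char *)s;
    size_t count = 0;

    /* Same value as "neg r7,r3; clrlwi. r8,r7,29".  */
    size_t head = (size_t)((0 - (uintptr_t)p) & 7u);
    while (head--) {
        unsigned char c = *p++;
        count++;
        if (c == 0)
            return count - 1;
    }

    for (;;) {
        int found;
        count += model_dlmzb(load_be32(p), load_be32(p + 4), &found);
        if (found)
            return count - 1;
        p += 8;
    }
}
```

The assembly unrolls the 8-byte step four times per loop iteration; the
model keeps a single step per iteration because the arithmetic is the same.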



* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2010-09-03 15:00   ` Luis Machado
@ 2010-10-04 18:54     ` Luis Machado
  2010-12-13 20:26       ` Ryan Arnold
  0 siblings, 1 reply; 19+ messages in thread
From: Luis Machado @ 2010-10-04 18:54 UTC (permalink / raw)
  To: Ryan Arnold; +Cc: libc-ports, rsa, Todd Iglehart, Josh Boyer

Ping?

On Fri, 2010-09-03 at 12:00 -0300, Luis Machado wrote:
> > Since Todd doesn't have copyright assignment these changes are
> > contributed to the FSF by IBM without author/contributor attribution.
> > 
> > You can simply attribute the changes to him in the email leaving his
> > name out of the sources per FSF policy and submit them on IBM's
> > behalf.
> > 
> > Ryan
> 
> Thanks.
> 
> The updated patch follows, without Todd's name in the sources.
> 
> Luis
> 
> 
> 2010-09-03  Luis Machado  <luisgpm@br.ibm.com>
> 
> 	* sysdeps/powerpc/dl-procinfo.c: New file.
> 	* sysdeps/powerpc/dl-procinfo.h: New file.
> 	* sysdeps/powerpc/powerpc32/405/memcmp.S: New file.
> 	* sysdeps/powerpc/powerpc32/405/memcpy.S: New file.
> 	* sysdeps/powerpc/powerpc32/405/memset.S: New file.
> 	* sysdeps/powerpc/powerpc32/405/strcmp.S: New file.
> 	* sysdeps/powerpc/powerpc32/405/strcpy.S: New file.
> 	* sysdeps/powerpc/powerpc32/405/strlen.S: New file.
> 	* sysdeps/powerpc/powerpc32/405/strncmp.S: New file.
> 	* sysdeps/powerpc/powerpc32/440/Implies: New file.
> 	* sysdeps/powerpc/powerpc32/464/Implies: New file.
> 	* sysdeps/powerpc/powerpc32/476/Implies: New file.
> 	* sysdeps/powerpc/powerpc32/Makefile: New file.
> 	* sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies: New file.
> 	* sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies: New file.
> 	* sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies: New file.
> 	* sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies: New file.
> 
> diff --git a/sysdeps/powerpc/dl-procinfo.c b/sysdeps/powerpc/dl-procinfo.c
> new file mode 100644
> index 0000000..60fb465
> --- /dev/null
> +++ b/sysdeps/powerpc/dl-procinfo.c
> @@ -0,0 +1,96 @@
> +/* Data for processor capability information.  PowerPC version.
> +   Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, write to the Free
> +   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
> +   02111-1307 USA.  */
> +
> +/* This information must be kept in sync with the _DL_HWCAP_COUNT and
> +   _DL_PLATFORMS_COUNT definitions in procinfo.h.
> +
> +   If anything should be added here check whether the size of each string
> +   is still ok with the given array size.
> +
> +   All the #ifdefs in the definitions are quite irritating but
> +   necessary if we want to avoid duplicating the information.  There
> +   are three different modes:
> +
> +   - PROCINFO_DECL is defined.  This means we are only interested in
> +     declarations.
> +
> +   - PROCINFO_DECL is not defined:
> +
> +     + if SHARED is defined the file is included in an array
> +       initializer.  The .element = { ... } syntax is needed.
> +
> +     + if SHARED is not defined a normal array initialization is
> +       needed.
> +  */
> +
> +#ifndef PROCINFO_CLASS
> +# define PROCINFO_CLASS
> +#endif
> +
> +#if !defined PROCINFO_DECL && defined SHARED
> +  ._dl_powerpc_cap_flags
> +#else
> +PROCINFO_CLASS const char _dl_powerpc_cap_flags[25][10]
> +#endif
> +#ifndef PROCINFO_DECL
> += {
> +    "vsx",
> +    "arch_2_06", "power6x", "dfp", "pa6t",
> +    "arch_2_05", "ic_snoop", "smt", "booke",
> +    "cellbe", "power5+", "power5", "power4",
> +    "notb", "efpdouble", "efpsingle", "spe",
> +    "ucache", "4xxmac", "mmu", "fpu",
> +    "altivec", "ppc601", "ppc64", "ppc32",
> +  }
> +#endif
> +#if !defined SHARED || defined PROCINFO_DECL
> +;
> +#else
> +,
> +#endif
> +
> +#if !defined PROCINFO_DECL && defined SHARED
> +  ._dl_powerpc_platforms
> +#else
> +PROCINFO_CLASS const char _dl_powerpc_platforms[12][12]
> +#endif
> +#ifndef PROCINFO_DECL
> += {
> +    [PPC_PLATFORM_POWER4] = "power4",
> +    [PPC_PLATFORM_PPC970] = "ppc970",
> +    [PPC_PLATFORM_POWER5] = "power5",
> +    [PPC_PLATFORM_POWER5_PLUS] = "power5+",
> +    [PPC_PLATFORM_POWER6] = "power6",
> +    [PPC_PLATFORM_CELL_BE] = "ppc-cell-be",
> +    [PPC_PLATFORM_POWER6X] = "power6x",
> +    [PPC_PLATFORM_POWER7] = "power7",
> +    [PPC_PLATFORM_PPC405] = "ppc405",
> +    [PPC_PLATFORM_PPC440] = "ppc440",
> +    [PPC_PLATFORM_PPC464] = "ppc464",
> +    [PPC_PLATFORM_PPC476] = "ppc476"
> +  }
> +#endif
> +#if !defined SHARED || defined PROCINFO_DECL
> +;
> +#else
> +,
> +#endif
> +
> +#undef PROCINFO_DECL
> +#undef PROCINFO_CLASS
> diff --git a/sysdeps/powerpc/dl-procinfo.h b/sysdeps/powerpc/dl-procinfo.h
> new file mode 100644
> index 0000000..87279de
> --- /dev/null
> +++ b/sysdeps/powerpc/dl-procinfo.h
> @@ -0,0 +1,168 @@
> +/* Processor capability information handling macros.  PowerPC version.
> +   Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, write to the Free
> +   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
> +   02111-1307 USA.  */
> +
> +#ifndef _DL_PROCINFO_H
> +#define _DL_PROCINFO_H	1
> +
> +#include <ldsodefs.h>
> +#include <sysdep.h>		/* This defines the PPC_FEATURE_* macros.  */
> +
> +/* There are 25 bits used, but they are bits 7..31.  */
> +#define _DL_HWCAP_FIRST		7
> +#define _DL_HWCAP_COUNT		32
> +
> +/* These bits influence library search.  */
> +#define HWCAP_IMPORTANT		(PPC_FEATURE_HAS_ALTIVEC \
> +				+ PPC_FEATURE_HAS_DFP)
> +
> +#define _DL_PLATFORMS_COUNT	12
> +
> +#define _DL_FIRST_PLATFORM	32
> +/* Mask to filter out platforms.  */
> +#define _DL_HWCAP_PLATFORM      (((1ULL << _DL_PLATFORMS_COUNT) - 1) \
> +				<< _DL_FIRST_PLATFORM)
> +
> +/* Platform bits (relative to _DL_FIRST_PLATFORM).  */
> +#define PPC_PLATFORM_POWER4	      0
> +#define PPC_PLATFORM_PPC970	      1
> +#define PPC_PLATFORM_POWER5	      2
> +#define PPC_PLATFORM_POWER5_PLUS      3
> +#define PPC_PLATFORM_POWER6	      4
> +#define PPC_PLATFORM_CELL_BE	      5
> +#define PPC_PLATFORM_POWER6X	      6
> +#define PPC_PLATFORM_POWER7	      7
> +#define PPC_PLATFORM_PPC405	      8
> +#define PPC_PLATFORM_PPC440	      9
> +#define PPC_PLATFORM_PPC464	      10
> +#define PPC_PLATFORM_PPC476	      11
> +
> +static inline const char *
> +__attribute__ ((unused))
> +_dl_hwcap_string (int idx)
> +{
> +  return GLRO(dl_powerpc_cap_flags)[idx - _DL_HWCAP_FIRST];
> +}
> +
> +static inline const char *
> +__attribute__ ((unused))
> +_dl_platform_string (int idx)
> +{
> +  return GLRO(dl_powerpc_platforms)[idx - _DL_FIRST_PLATFORM];
> +}
> +
> +static inline int
> +__attribute__ ((unused))
> +_dl_string_hwcap (const char *str)
> +{
> +  for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
> +    if (strcmp (str, _dl_hwcap_string (i)) == 0)
> +      return i;
> +  return -1;
> +}
> +
> +static inline int
> +__attribute__ ((unused, always_inline))
> +_dl_string_platform (const char *str)
> +{
> +  if (str == NULL)
> +    return -1;
> +
> +  if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_POWER4], 5) == 0)
> +    {
> +      int ret;
> +      str += 5;
> +      switch (*str)
> +	{
> +	case '4':
> +	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER4;
> +	  break;
> +	case '5':
> +	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5;
> +	  if (str[1] == '+')
> +	    {
> +	      ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5_PLUS;
> +	      ++str;
> +	    }
> +	  break;
> +	case '6':
> +	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6;
> +	  if (str[1] == 'x')
> +	    {
> +	      ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6X;
> +	      ++str;
> +	    }
> +	  break;
> +	case '7':
> +	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER7;
> +	  break;
> +	default:
> +	  return -1;
> +	}
> +      if (str[1] == '\0')
> +	return ret;
> +    }
> +  else if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970],
> +		    3) == 0)
> +    {
> +      if (strcmp (str + 3, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970]
> +			   + 3) == 0)
> +	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC970;
> +      else if (strcmp (str + 3,
> +		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_CELL_BE] + 3)
> +	       == 0)
> +	return _DL_FIRST_PLATFORM + PPC_PLATFORM_CELL_BE;
> +      else if (strcmp (str + 3,
> +		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC405] + 3)
> +	       == 0)
> +	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC405;
> +      else if (strcmp (str + 3,
> +		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC440] + 3)
> +	       == 0)
> +	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC440;
> +      else if (strcmp (str + 3,
> +		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC464] + 3)
> +	       == 0)
> +	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC464;
> +      else if (strcmp (str + 3,
> +		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC476] + 3)
> +	       == 0)
> +	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC476;
> +    }
> +
> +  return -1;
> +}
> +
> +#ifdef IS_IN_rtld
> +static inline int
> +__attribute__ ((unused))
> +_dl_procinfo (int word)
> +{
> +  _dl_printf ("AT_HWCAP:       ");
> +
> +  for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
> +    if (word & (1 << i))
> +      _dl_printf (" %s", _dl_hwcap_string (i));
> +
> +  _dl_printf ("\n");
> +
> +  return 0;
> +}
> +#endif
> +
> +#endif /* dl-procinfo.h */
> diff --git a/sysdeps/powerpc/powerpc32/405/memcmp.S b/sysdeps/powerpc/powerpc32/405/memcmp.S
> new file mode 100644
> index 0000000..653d3b5
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/memcmp.S
> @@ -0,0 +1,131 @@
> +/* Optimized memcmp implementation for PowerPC476.
> +   Copyright (C) 2010 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, write to the Free
> +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> +   02110-1301 USA.  */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* memcmp
> +
> +	r3:source1 address, return equality
> +	r4:source2 address
> +	r5:byte count
> +
> +	Check 2 words from src1 and src2. If unequal jump to end and
> +	return src1 > src2 or src1 < src2.
> +	If count = zero check bytes before zero counter and then jump to end and
> +	return src1 > src2, src1 < src2 or src1 = src2.
> +	If src1 = src2 and no null, repeat. */
> +
> +EALIGN (BP_SYM (memcmp), 5, 0)
> +	srwi.	r6,r5,5
> +	beq	L(preword2_count_loop)
> +	mtctr	r6
> +	clrlwi	r5,r5,27
> +
> +L(word8_compare_loop):
> +	lwz	r10,0(r3)
> +	lwz	r6,4(r3)
> +	lwz	r8,0(r4)
> +	lwz	r9,4(r4)
> +	cmplw	cr5,r8,r10
> +	cmplw	cr1,r9,r6
> +	bne	cr5,L(st2)
> +	bne	cr1,L(st1)
> +	lwz	r10,8(r3)
> +	lwz	r6,12(r3)
> +	lwz	r8,8(r4)
> +	lwz	r9,12(r4)
> +	cmplw	cr5,r8,r10
> +	cmplw	cr1,r9,r6
> +	bne	cr5,L(st2)
> +	bne	cr1,L(st1)
> +	lwz	r10,16(r3)
> +	lwz	r6,20(r3)
> +	lwz	r8,16(r4)
> +	lwz	r9,20(r4)
> +	cmplw	cr5,r8,r10
> +	cmplw	cr1,r9,r6
> +	bne	cr5,L(st2)
> +	bne	cr1,L(st1)
> +	lwz	r10,24(r3)
> +	lwz	r6,28(r3)
> +	addi	r3,r3,0x20
> +	lwz	r8,24(r4)
> +	lwz	r9,28(r4)
> +	addi	r4,r4,0x20
> +	cmplw	cr5,r8,r10
> +	cmplw	cr1,r9,r6
> +	bne	cr5,L(st2)
> +	bne	cr1,L(st1)
> +	bdnz	L(word8_compare_loop)
> +
> +L(preword2_count_loop):
> +	srwi.	r6,r5,3
> +	beq	L(prebyte_count_loop)
> +	mtctr	r6
> +	clrlwi  r5,r5,29
> +
> +L(word2_count_loop):
> +	lwz	r10,0(r3)
> +	lwz	r6,4(r3)
> +	addi	r3,r3,0x08
> +	lwz	r8,0(r4)
> +	lwz	r9,4(r4)
> +	addi	r4,r4,0x08
> +	cmplw	cr5,r8,r10
> +	cmplw	cr1,r9,r6
> +	bne	cr5,L(st2)
> +	bne	cr1,L(st1)
> +	bdnz	L(word2_count_loop)
> +
> +L(prebyte_count_loop):
> +	addi	r5,r5,1
> +	mtctr	r5
> +	bdz	L(end_memcmp)
> +
> +L(byte_count_loop):
> +	lbz	r6,0(r3)
> +	addi	r3,r3,0x01
> +	lbz	r8,0(r4)
> +	addi	r4,r4,0x01
> +	cmplw	cr5,r8,r6
> +	bne	cr5,L(st2)
> +	bdnz	L(byte_count_loop)
> +
> +L(end_memcmp):
> +	addi	r3,r0,0
> +	blr
> +
> +L(l_r):
> +	addi	r3,r0,1
> +	blr
> +
> +L(st1):
> +	blt	cr1,L(l_r)
> +	addi	r3,r0,-1
> +	blr
> +
> +L(st2):
> +	blt	cr5,L(l_r)
> +	addi	r3,r0,-1
> +	blr
> +END (BP_SYM (memcmp))
> +libc_hidden_builtin_def (memcmp)
> +weak_alias (memcmp,bcmp)
> diff --git a/sysdeps/powerpc/powerpc32/405/memcpy.S b/sysdeps/powerpc/powerpc32/405/memcpy.S
> new file mode 100644
> index 0000000..a654c73
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/memcpy.S
> @@ -0,0 +1,133 @@
> +/* Optimized memcpy implementation for PowerPC476.
> +   Copyright (C) 2010 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, write to the Free
> +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> +   02110-1301 USA.  */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* memcpy
> +
> +	r0:return address
> +	r3:destination address
> +	r4:source address
> +	r5:byte count
> +
> +	Save return address in r0.
> +	If destination and source are unaligned and copy count is greater than 256
> +	then copy 0-3 bytes to make destination aligned.
> +	If 32 or more bytes to copy we use 32 byte copy loop.
> +	Finally we copy 0-31 extra bytes. */
> +
> +EALIGN (BP_SYM (memcpy), 5, 0)
> +/* Check if bytes to copy are greater than 256 and if
> +	source and destination are unaligned */
> +	cmpwi	r5,0x0100
> +	addi	r0,r3,0
> +	ble	L(string_count_loop)
> +	neg	r6,r3
> +	clrlwi. r6,r6,30
> +	beq	L(string_count_loop)
> +	neg	r6,r4
> +	clrlwi. r6,r6,30
> +	beq	L(string_count_loop)
> +	mtctr	r6
> +	subf	r5,r6,r5
> +
> +L(unaligned_bytecopy_loop): /* Align destination by copying 0-3 bytes */
> +	lbz	r8,0x0(r4)
> +	addi	r4,r4,1
> +	stb	r8,0x0(r3)
> +	addi	r3,r3,1
> +	bdnz	L(unaligned_bytecopy_loop)
> +	srwi.	r7,r5,5
> +	beq	L(preword2_count_loop)
> +	mtctr	r7
> +
> +L(word8_count_loop_no_dcbt): /* Copy 32 bytes at a time */
> +	lwz	r6,0(r4)
> +	lwz	r7,4(r4)
> +	lwz	r8,8(r4)
> +	lwz	r9,12(r4)
> +	subi	r5,r5,0x20
> +	stw	r6,0(r3)
> +	stw	r7,4(r3)
> +	stw	r8,8(r3)
> +	stw	r9,12(r3)
> +	lwz	r6,16(r4)
> +	lwz	r7,20(r4)
> +	lwz	r8,24(r4)
> +	lwz	r9,28(r4)
> +	addi	r4,r4,0x20
> +	stw	r6,16(r3)
> +	stw	r7,20(r3)
> +	stw	r8,24(r3)
> +	stw	r9,28(r3)
> +	addi	r3,r3,0x20
> +	bdnz	L(word8_count_loop_no_dcbt)
> +
> +L(preword2_count_loop): /* Copy remaining 0-31 bytes */
> +	clrlwi. r12,r5,27
> +	beq	L(end_memcpy)
> +	mtxer	r12
> +	lswx	r5,0,r4
> +	stswx	r5,0,r3
> +	mr	 r3,r0
> +	blr
> +
> +L(string_count_loop): /* Copy odd 0-15 bytes */
> +	clrlwi. r12,r5,28
> +	add	r3,r3,r5
> +	add	r4,r4,r5
> +	beq	L(pre_string_copy)
> +	mtxer	r12
> +	subf	r4,r12,r4
> +	subf	r3,r12,r3
> +	lswx	r6,0,r4
> +	stswx	r6,0,r3
> +
> +L(pre_string_copy): /* Check how many 16 byte chunks to copy */
> +	srwi.	r7,r5,4
> +	beq	L(end_memcpy)
> +	mtctr	r7
> +
> +L(word4_count_loop_no_dcbt): /* Copy 32 bytes at a time */
> +	lwz	r6,-4(r4)
> +	lwz	r7,-8(r4)
> +	lwz	r8,-12(r4)
> +	lwzu	r9,-16(r4)
> +	stw	r6,-4(r3)
> +	stw	r7,-8(r3)
> +	stw	r8,-12(r3)
> +	stwu	r9,-16(r3)
> +	bdz	L(end_memcpy)
> +	lwz	r6,-4(r4)
> +	lwz	r7,-8(r4)
> +	lwz	r8,-12(r4)
> +	lwzu	r9,-16(r4)
> +	stw	r6,-4(r3)
> +	stw	r7,-8(r3)
> +	stw	r8,-12(r3)
> +	stwu	r9,-16(r3)
> +	bdnz	L(word4_count_loop_no_dcbt)
> +
> +L(end_memcpy):
> +	mr	 r3,r0
> +	blr
> +END (BP_SYM (memcpy))
> +libc_hidden_builtin_def (memcpy)
> diff --git a/sysdeps/powerpc/powerpc32/405/memset.S b/sysdeps/powerpc/powerpc32/405/memset.S
> new file mode 100644
> index 0000000..69d5d4c
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/memset.S
> @@ -0,0 +1,155 @@
> +/* Optimized memset implementation for PowerPC476.
> +   Copyright (C) 2010 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, write to the Free
> +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> +   02110-1301 USA.  */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* memset
> +
> +	r3:destination address and return address
> +	r4:source integer to copy
> +	r5:byte count
> +	r11:sources integer to copy in all 32 bits of reg
> +	r12:temp return address
> +
> +	Save return address in r12
> +	If destination is unaligned and count is greater than 255 bytes
> +	set 0-3 bytes to make destination aligned
> +	If count is greater than 255 bytes and setting zero to memory
> +	use dcbz to set memory when we can
> +	otherwise do the following
> +	If 16 or more bytes to set we use 16 byte store loop.
> +	Finally we set 0-15 extra bytes with string store. */
> +
> +EALIGN (BP_SYM (memset), 5, 0)
> +	rlwinm	r11,r4,0,24,31
> +	rlwimi	r11,r4,8,16,23
> +	rlwimi	r11,r11,16,0,15
> +	addi	r12,r3,0
> +	cmpwi	r5,0x00FF
> +	ble	L(preword8_count_loop)
> +	cmpwi	r4,0x00
> +	beq	L(use_dcbz)
> +	neg	r6,r3
> +	clrlwi.	r6,r6,30
> +	beq	L(preword8_count_loop)
> +	addi	r8,0,1
> +	mtctr	r6
> +	subi	r3,r3,1
> +
> +L(unaligned_bytecopy_loop):
> +	stbu	r11,0x1(r3)
> +	subf.	r5,r8,r5
> +	beq	L(end_memset)
> +	bdnz	L(unaligned_bytecopy_loop)
> +	addi	r3,r3,1
> +
> +L(preword8_count_loop):
> +	srwi.	r6,r5,4
> +	beq	L(preword2_count_loop)
> +	mtctr	r6
> +	addi	r3,r3,-4
> +	mr	r8,r11
> +	mr	r9,r11
> +	mr	r10,r11
> +
> +L(word8_count_loop_no_dcbt):
> +	stwu	r8,4(r3)
> +	stwu	r9,4(r3)
> +	subi	r5,r5,0x10
> +	stwu	r10,4(r3)
> +	stwu	r11,4(r3)
> +	bdnz	L(word8_count_loop_no_dcbt)
> +	addi	r3,r3,4
> +
> +L(preword2_count_loop):
> +	clrlwi.	r7,r5,28
> +	beq	L(end_memset)
> +	mr	r8,r11
> +	mr	r9,r11
> +	mr	r10,r11
> +	mtxer	r7
> +	stswx	r8,0,r3
> +
> +L(end_memset):
> +	addi	r3,r12,0
> +	blr
> +
> +L(use_dcbz):
> +	neg	r6,r3
> +	clrlwi.	r7,r6,28
> +	beq	L(skip_string_loop)
> +	mr	r8,r11
> +	mr	r9,r11
> +	mr	r10,r11
> +	subf	r5,r7,r5
> +	mtxer	r7
> +	stswx	r8,0,r3
> +	add	r3,r3,r7
> +
> +L(skip_string_loop):
> +	clrlwi	r8,r6,25
> +	srwi.	r8,r8,4
> +	beq	L(dcbz_pre_loop)
> +	mtctr	r8
> +
> +L(word_loop):
> +	stw	r11,0(r3)
> +	subi	r5,r5,0x10
> +	stw	r11,4(r3)
> +	stw	r11,8(r3)
> +	stw	r11,12(r3)
> +	addi	r3,r3,0x10
> +	bdnz	L(word_loop)
> +
> +L(dcbz_pre_loop):
> +	srwi	r6,r5,7
> +	mtctr	r6
> +	addi	r7,0,0
> +
> +L(dcbz_loop):
> +	dcbz	r3,r7
> +	addi	r3,r3,0x80
> +	subi	r5,r5,0x80
> +	bdnz	L(dcbz_loop)
> +	srwi.	r6,r5,4
> +	beq	L(postword2_count_loop)
> +	mtctr	r6
> +
> +L(postword8_count_loop):
> +	stw	r11,0(r3)
> +	subi	r5,r5,0x10
> +	stw	r11,4(r3)
> +	stw	r11,8(r3)
> +	stw	r11,12(r3)
> +	addi	r3,r3,0x10
> +	bdnz	L(postword8_count_loop)
> +
> +L(postword2_count_loop):
> +	clrlwi.	r7,r5,28
> +	beq	L(end_memset)
> +	mr	r8,r11
> +	mr	r9,r11
> +	mr	r10,r11
> +	mtxer	r7
> +	stswx	r8,0,r3
> +	b	L(end_memset)
> +END (BP_SYM (memset))
> +libc_hidden_builtin_def (memset)
> diff --git a/sysdeps/powerpc/powerpc32/405/strcmp.S b/sysdeps/powerpc/powerpc32/405/strcmp.S
> new file mode 100644
> index 0000000..6eb5b5a
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/strcmp.S
> @@ -0,0 +1,137 @@
> +/* Optimized strcmp implementation for PowerPC476.
> +   Copyright (C) 2010 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, write to the Free
> +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> +   02110-1301 USA.  */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* strcmp
> +
> +	Register Use
> +	r0:temp return equality
> +	r3:source1 address, return equality
> +	r4:source2 address
> +
> +	Implementation description
> +	Check 2 words from src1 and src2. If unequal jump to end and
> +	return src1 > src2 or src1 < src2.
> +	If null check bytes before null and then jump to end and
> +	return src1 > src2, src1 < src2 or src1 = src2.
> +	If src1 = src2 and no null, repeat. */
> +
> +EALIGN (BP_SYM(strcmp),5,0)
> +	neg	r7,r3
> +	clrlwi	r7,r7,20
> +	neg	r8,r4
> +	clrlwi	r8,r8,20
> +	srwi.	r7,r7,5
> +	beq	L(byte_loop)
> +	srwi.	r8,r8,5
> +	beq	L(byte_loop)
> +	cmplw	r7,r8
> +	mtctr	r7
> +	ble	L(big_loop)
> +	mtctr	r8
> +
> +L(big_loop):
> +	lwz	r5,0(r3)
> +	lwz	r6,4(r3)
> +	lwz	r8,0(r4)
> +	lwz	r9,4(r4)
> +	dlmzb.	r12,r5,r6
> +	bne	L(end_check)
> +	cmplw	r5,r8
> +	bne	L(st1)
> +	cmplw	r6,r9
> +	bne	L(st1)
> +	lwz	r5,8(r3)
> +	lwz	r6,12(r3)
> +	lwz	r8,8(r4)
> +	lwz	r9,12(r4)
> +	dlmzb.	r12,r5,r6
> +	bne	L(end_check)
> +	cmplw	r5,r8
> +	bne	L(st1)
> +	cmplw	r6,r9
> +	bne	L(st1)
> +	lwz	r5,16(r3)
> +	lwz	r6,20(r3)
> +	lwz	r8,16(r4)
> +	lwz	r9,20(r4)
> +	dlmzb.	r12,r5,r6
> +	bne	L(end_check)
> +	cmplw	r5,r8
> +	bne	L(st1)
> +	cmplw	r6,r9
> +	bne	L(st1)
> +	lwz	r5,24(r3)
> +	lwz	r6,28(r3)
> +	addi	r3,r3,0x20
> +	lwz	r8,24(r4)
> +	lwz	r9,28(r4)
> +	addi	r4,r4,0x20
> +	dlmzb.	r12,r5,r6
> +	bne	L(end_check)
> +	cmplw	r5,r8
> +	bne	L(st1)
> +	cmplw	r6,r9
> +	bne	L(st1)
> +	bdnz	L(big_loop)
> +	b	L(byte_loop)
> +
> +L(end_check):
> +	subfic	r12,r12,4
> +	blt	L(end_check2)
> +	rlwinm	r12,r12,3,0,31
> +	srw	r5,r5,r12
> +	srw	r8,r8,r12
> +	cmplw	r5,r8
> +	bne	L(st1)
> +	b	L(end_strcmp)
> +
> +L(end_check2):
> +	addi	r12,r12,4
> +	cmplw	r5,r8
> +	rlwinm	r12,r12,3,0,31
> +	bne	L(st1)
> +	srw	r6,r6,r12
> +	srw	r9,r9,r12
> +	cmplw	r6,r9
> +	bne	L(st1)
> +
> +L(end_strcmp):
> +	addi	r3,r0,0
> +	blr
> +
> +L(st1):
> +	mfcr	r3
> +	blr
> +
> +L(byte_loop):
> +	lbz	r5,0(r3)
> +	addi	r3,r3,1
> +	lbz	r6,0(r4)
> +	addi	r4,r4,1
> +	cmplw	r5,r6
> +	bne	L(st1)
> +	cmpwi	r5,0
> +	beq	L(end_strcmp)
> +	b	L(byte_loop)
> +END (BP_SYM (strcmp))
> +libc_hidden_builtin_def (strcmp)
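
A note on the strcmp entry sequence quoted above: the neg/clrlwi/srwi. instructions appear to compute how many whole 32-byte blocks remain before each pointer reaches a 4 KiB boundary, and the loop counter takes the smaller of the two counts. A minimal C sketch of that arithmetic, assuming 32-bit addresses; the function names are made up for illustration and are not part of the patch:

```c
#include <stdint.h>

/* Bytes from addr up to the next 4 KiB boundary (0 if addr is already on
   one).  Models "neg rX,rY" followed by "clrlwi rX,rX,20".  */
static uint32_t bytes_to_page_end(uint32_t addr)
{
    return (uint32_t)(0u - addr) & 0xFFFu;
}

/* Number of 32-byte iterations the word loop may run: the smaller of the
   two block counts.  Zero means the byte loop is used from the start.
   Models "srwi. rX,rX,5" on both counts plus the cmplw/mtctr selection.  */
static uint32_t word_loop_iterations(uint32_t s1, uint32_t s2)
{
    uint32_t n1 = bytes_to_page_end(s1) >> 5;
    uint32_t n2 = bytes_to_page_end(s2) >> 5;
    return n1 <= n2 ? n1 : n2;
}
```

As far as the quoted code shows, L(byte_loop) never branches back to L(big_loop), so once this counter runs out the rest of the comparison is done one byte at a time.
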
> diff --git a/sysdeps/powerpc/powerpc32/405/strcpy.S b/sysdeps/powerpc/powerpc32/405/strcpy.S
> new file mode 100644
> index 0000000..025ac16
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/strcpy.S
> @@ -0,0 +1,110 @@
> +/* Optimized strcpy implementation for PowerPC476.
> +   Copyright (C) 2010 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, write to the Free
> +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> +   02110-1301 USA.  */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* strcpy
> +
> +	Register Use
> +	r3:destination and return address
> +	r4:source address
> +	r10:temp destination address
> +
> +	Implementation description
> +	Loop by checking 2 words at a time, with dlmzb. Check if there is a null
> +	in the 2 words. If there is a null jump to end checking to determine
> +	where in the last 8 bytes it is. Copy the appropriate bytes of the last
> +	8 according to the null position. */
> +
> +EALIGN (BP_SYM (strcpy), 5, 0)
> +	neg	r7,r4
> +	subi	r4,r4,1
> +	clrlwi.	r8,r7,29
> +	subi	r10,r3,1
> +	beq	L(pre_word8_loop)
> +	mtctr	r8
> +
> +L(loop):
> +	lbzu	r5,0x01(r4)
> +	cmpi	cr5,r5,0x0
> +	stbu	r5,0x01(r10)
> +	beq	cr5,L(end_strcpy)
> +	bdnz	L(loop)
> +
> +L(pre_word8_loop):
> +	subi	r4,r4,3
> +	subi	r10,r10,3
> +
> +L(word8_loop):
> +	lwzu	r5,0x04(r4)
> +	lwzu	r6,0x04(r4)
> +	dlmzb.	r11,r5,r6
> +	bne	L(byte_copy)
> +	stwu	r5,0x04(r10)
> +	stwu	r6,0x04(r10)
> +	lwzu	r5,0x04(r4)
> +	lwzu	r6,0x04(r4)
> +	dlmzb.	r11,r5,r6
> +	bne	L(byte_copy)
> +	stwu	r5,0x04(r10)
> +	stwu	r6,0x04(r10)
> +	lwzu	r5,0x04(r4)
> +	lwzu	r6,0x04(r4)
> +	dlmzb.	r11,r5,r6
> +	bne	L(byte_copy)
> +	stwu	r5,0x04(r10)
> +	stwu	r6,0x04(r10)
> +	lwzu	r5,0x04(r4)
> +	lwzu	r6,0x04(r4)
> +	dlmzb.	r11,r5,r6
> +	bne	L(byte_copy)
> +	stwu	r5,0x04(r10)
> +	stwu	r6,0x04(r10)
> +	b	L(word8_loop)
> +
> +L(last_bytes_copy):
> +	stwu	r5,0x04(r10)
> +	subi	r11,r11,4
> +	mtctr	r11
> +	addi	r10,r10,3
> +	subi	r4,r4,1
> +
> +L(last_bytes_copy_loop):
> +	lbzu	r5,0x01(r4)
> +	stbu	r5,0x01(r10)
> +	bdnz	L(last_bytes_copy_loop)
> +	blr
> +
> +L(byte_copy):
> +	blt	L(last_bytes_copy)
> +	mtctr	r11
> +	addi	r10,r10,3
> +	subi	r4,r4,5
> +
> +L(last_bytes_copy_loop2):
> +	lbzu	r5,0x01(r4)
> +	stbu	r5,0x01(r10)
> +	bdnz	L(last_bytes_copy_loop2)
> +
> +L(end_strcpy):
> +	blr
> +END (BP_SYM (strcpy))
> +libc_hidden_builtin_def (strcpy)
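
The strcpy loop above depends on two results of dlmzb.: the 1-based position of the first null byte in the register pair, and a CR0 state that distinguishes "null in the second word" (the blt case) from "null in the first word" and "no null". A hedged C model of that behaviour, inferred from how the quoted code uses the instruction and not a substitute for the processor documentation; the type and function names are hypothetical:

```c
#include <stdint.h>

enum dlmzb_cr0 {
    DLMZB_NONE,            /* no null byte: "bne" after dlmzb. falls through */
    DLMZB_IN_FIRST_WORD,   /* null in the first register of the pair */
    DLMZB_IN_SECOND_WORD   /* null in the second register: "blt" is taken */
};

struct dlmzb_result {
    unsigned pos;          /* 1..8, or 8 when no null byte was found */
    enum dlmzb_cr0 where;
};

/* Scan the 8 bytes of first||second from the most significant byte down
   and report the first zero byte.  */
static struct dlmzb_result dlmzb_model(uint32_t first, uint32_t second)
{
    uint64_t d = ((uint64_t)first << 32) | second;
    struct dlmzb_result r = { 8, DLMZB_NONE };

    for (unsigned i = 0; i < 8; i++) {
        unsigned byte = (unsigned)(d >> (56 - 8 * i)) & 0xFFu;
        if (byte == 0) {
            r.pos = i + 1;
            r.where = i < 4 ? DLMZB_IN_FIRST_WORD : DLMZB_IN_SECOND_WORD;
            break;
        }
    }
    return r;
}
```

With this model, copying pos bytes from the start of the pair always includes the terminating null, which matches the counter loaded by mtctr in L(byte_copy).
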
> diff --git a/sysdeps/powerpc/powerpc32/405/strlen.S b/sysdeps/powerpc/powerpc32/405/strlen.S
> new file mode 100644
> index 0000000..146b582
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/strlen.S
> @@ -0,0 +1,78 @@
> +/* Optimized strlen implementation for PowerPC476.
> +   Copyright (C) 2010 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, write to the Free
> +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> +   02110-1301 USA.  */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* strlen
> +
> +	Register Use
> +	r3:source address and return length of string
> +	r4:byte counter
> +
> +	Implementation description
> +	Load 2 words at a time and count bytes. When a null is found, subtract
> +	one from the count and return it. The subtraction is needed because
> +	the null character is not counted as a byte. */
> +
> +EALIGN (BP_SYM (strlen),5,0)
> +	neg	r7,r3
> +	clrlwi.	r8,r7,29
> +	addi	r4,0,0
> +	beq	L(byte_count_loop)
> +	mtctr	r8
> +
> +L(loop):
> +	lbz	r5,0(r3)
> +	cmpi	cr5,r5,0x0
> +	addi	r3,r3,0x1
> +	addi	r4,r4,0x1
> +	beq	cr5,L(end_strlen)
> +	bdnz	L(loop)
> +
> +L(byte_count_loop):
> +	lwz	r5,0(r3)
> +	lwz	r6,4(r3)
> +	dlmzb.	r12,r5,r6
> +	add	r4,r4,r12
> +	bne	L(end_strlen)
> +	lwz	r5,8(r3)
> +	lwz	r6,12(r3)
> +	dlmzb.	r12,r5,r6
> +	add	r4,r4,r12
> +	bne	L(end_strlen)
> +	lwz	r5,16(r3)
> +	lwz	r6,20(r3)
> +	dlmzb.	r12,r5,r6
> +	add	r4,r4,r12
> +	bne	L(end_strlen)
> +	lwz	r5,24(r3)
> +	lwz	r6,28(r3)
> +	addi	r3,r3,0x20
> +	dlmzb.	r12,r5,r6
> +	add	r4,r4,r12
> +	bne	L(end_strlen)
> +	b	L(byte_count_loop)
> +
> +L(end_strlen):
> +	addi	r3,r4,-1
> +	blr
> +END (BP_SYM (strlen))
> +libc_hidden_builtin_def (strlen)
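
The counting scheme in the strlen above (count bytes up to 8-byte alignment, then add 8 per chunk without a null, add the 1-based null position for the final chunk, and subtract one at the end) can be sketched in portable C. This is an illustration under those assumptions only; strlen_sketch and first_zero_pos8 are hypothetical names, and the helper reads byte by byte where the assembly uses two lwz loads and dlmzb.:

```c
#include <stddef.h>
#include <stdint.h>

/* 1-based position of the first zero byte in p[0..7], or 8 if there is
   none.  *found is set to 1 when a zero byte was seen, else 0.  */
static unsigned first_zero_pos8(const unsigned char *p, int *found)
{
    for (unsigned i = 0; i < 8; i++) {
        if (p[i] == 0) {
            *found = 1;
            return i + 1;
        }
    }
    *found = 0;
    return 8;
}

static size_t strlen_sketch(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t count = 0;

    /* Byte loop until p is 8-byte aligned (the L(loop) part).  */
    while (((uintptr_t)p & 7u) != 0) {
        unsigned char c = *p++;
        count++;
        if (c == 0)
            return count - 1;   /* the null itself is not counted */
    }

    /* 8 bytes per step (the L(byte_count_loop) part).  */
    for (;;) {
        int found;
        count += first_zero_pos8(p, &found);
        if (found)
            return count - 1;
        p += 8;
    }
}
```
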
> diff --git a/sysdeps/powerpc/powerpc32/405/strncmp.S b/sysdeps/powerpc/powerpc32/405/strncmp.S
> new file mode 100644
> index 0000000..c1beb23
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/strncmp.S
> @@ -0,0 +1,131 @@
> +/* Optimized strncmp implementation for PowerPC476.
> +   Copyright (C) 2010 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, write to the Free
> +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> +   02110-1301 USA.  */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* strncmp
> +
> +	Register Use
> +	r0:temp return equality
> +	r3:source1 address, return equality
> +	r4:source2 address
> +	r5:byte count
> +
> +	Implementation description
> +	Touch in 3 lines of D-cache.
> +	If source1 or source2 is unaligned copy 0-3 bytes to make source1 aligned
> +	Check 2 words from src1 and src2. If unequal jump to end and
> +	return src1 > src2 or src1 < src2.
> +	If null check bytes before null and then jump to end and
> +	return src1 > src2, src1 < src2 or src1 = src2.
> +	If count = zero check bytes before zero counter and then jump to end and
> +	return src1 > src2, src1 < src2 or src1 = src2.
> +	If src1 = src2 and no null, repeat. */
> +
> +EALIGN (BP_SYM(strncmp),5,0)
> +	neg	r7,r3
> +	clrlwi	r7,r7,20
> +	neg	r8,r4
> +	clrlwi	r8,r8,20
> +	srwi.	r7,r7,3
> +	beq	L(prebyte_count_loop)
> +	srwi.	r8,r8,3
> +	beq	L(prebyte_count_loop)
> +	cmplw	r7,r8
> +	mtctr	r7
> +	ble	L(preword2_count_loop)
> +	mtctr	r8
> +
> +L(preword2_count_loop):
> +	srwi.	r6,r5,3
> +	beq	L(prebyte_count_loop)
> +	mfctr	r7
> +	cmplw	r6,r7
> +	bgt	L(set_count_loop)
> +	mtctr	r6
> +	clrlwi	r5,r5,29
> +
> +L(word2_count_loop):
> +	lwz	r10,0(r3)
> +	lwz	r6,4(r3)
> +	addi	r3,r3,0x08
> +	lwz	r8,0(r4)
> +	lwz	r9,4(r4)
> +	addi	r4,r4,0x08
> +	dlmzb.	r12,r10,r6
> +	bne	L(end_check)
> +	cmplw	r10,r8
> +	bne	L(st1)
> +	cmplw	r6,r9
> +	bne	L(st1)
> +	bdnz	L(word2_count_loop)
> +
> +L(prebyte_count_loop):
> +	addi	r5,r5,1
> +	mtctr	r5
> +	bdz	L(end_strncmp)
> +
> +L(byte_count_loop):
> +	lbz	r6,0(r3)
> +	addi	r3,r3,1
> +	lbz	r7,0(r4)
> +	addi	r4,r4,1
> +	cmplw	r6,r7
> +	bne	L(st1)
> +	cmpwi	r6,0
> +	beq	L(end_strncmp)
> +	bdnz	L(byte_count_loop)
> +	b	L(end_strncmp)
> +
> +L(set_count_loop):
> +	slwi	r7,r7,3
> +	subf	r5,r7,r5
> +	b	L(word2_count_loop)
> +
> +L(end_check):
> +	subfic	r12,r12,4
> +	blt	L(end_check2)
> +	rlwinm	r12,r12,3,0,31
> +	srw	r10,r10,r12
> +	srw	r8,r8,r12
> +	cmplw	r10,r8
> +	bne	L(st1)
> +	b	L(end_strncmp)
> +
> +L(end_check2):
> +	addi	r12,r12,4
> +	cmplw	r10,r8
> +	rlwinm	r12,r12,3,0,31
> +	bne	L(st1)
> +	srw	r6,r6,r12
> +	srw	r9,r9,r12
> +	cmplw	r6,r9
> +	bne	L(st1)
> +
> +L(end_strncmp):
> +	addi	r3,r0,0
> +	blr
> +
> +L(st1):
> +	mfcr	r3
> +	blr
> +END (BP_SYM (strncmp))
> +libc_hidden_builtin_def (strncmp)
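
On L(st1) in strcmp and strncmp: returning the whole condition register via mfcr works because CR0 sits in the top four bits of CR, so after an unsigned compare the LT bit becomes the sign bit of the result. A small C sketch of that reasoning, assuming a 32-bit two's complement int; the names and the stale_bits parameter are illustrative only:

```c
#include <stdint.h>

/* CR0 occupies the top four bits of the 32-bit CR: LT, GT, EQ, SO.  */
#define CR0_LT 0x80000000u
#define CR0_GT 0x40000000u
#define CR0_EQ 0x20000000u

/* CR0 contents after "cmplw a,b" (unsigned compare), ignoring SO.  */
static uint32_t cmplw_cr0(uint32_t a, uint32_t b)
{
    return a < b ? CR0_LT : a > b ? CR0_GT : CR0_EQ;
}

/* Value left in r3 by "mfcr r3" after "cmplw a,b".  stale_bits stands for
   whatever SO and cr1..cr7 happen to hold; only the low 29 bits are used.  */
static int32_t st1_result(uint32_t a, uint32_t b, uint32_t stale_bits)
{
    uint32_t cr = cmplw_cr0(a, b) | (stale_bits & 0x1FFFFFFFu);

    /* Reinterpret the 32-bit pattern as signed without relying on
       implementation-defined conversion.  */
    if (cr & 0x80000000u)
        return (int32_t)(cr & 0x7FFFFFFFu) - 0x7FFFFFFF - 1;
    return (int32_t)cr;
}
```

The magnitude of the result depends on the stale bits, but its sign does not, and the sign is all that callers of strcmp and strncmp are entitled to rely on.
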
> diff --git a/sysdeps/powerpc/powerpc32/440/Implies b/sysdeps/powerpc/powerpc32/440/Implies
> new file mode 100644
> index 0000000..70c0d2e
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/440/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/405/fpu
> +powerpc/powerpc32/405
> diff --git a/sysdeps/powerpc/powerpc32/464/Implies b/sysdeps/powerpc/powerpc32/464/Implies
> new file mode 100644
> index 0000000..c3e52c5
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/464/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/440/fpu
> +powerpc/powerpc32/440
> diff --git a/sysdeps/powerpc/powerpc32/476/Implies b/sysdeps/powerpc/powerpc32/476/Implies
> new file mode 100644
> index 0000000..2829f9c
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/476/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/464/fpu
> +powerpc/powerpc32/464
> diff --git a/sysdeps/powerpc/powerpc32/Makefile b/sysdeps/powerpc/powerpc32/Makefile
> new file mode 100644
> index 0000000..3d235de
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/Makefile
> @@ -0,0 +1,8 @@
> +# Some Powerpc32 variants assume soft-fp is the default even though there is
> +# an fp variant so provide -mhard-float if --with-fp is explicitly passed.
> +
> +ifeq ($(with-fp),yes)
> ++cflags += -mhard-float
> +ASFLAGS += -mhard-float
> +sysdep-LDFLAGS += -mhard-float
> +endif
> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
> new file mode 100644
> index 0000000..70c0d2e
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/405/fpu
> +powerpc/powerpc32/405
> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
> new file mode 100644
> index 0000000..c3e52c5
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/440/fpu
> +powerpc/powerpc32/440
> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
> new file mode 100644
> index 0000000..2829f9c
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/464/fpu
> +powerpc/powerpc32/464
> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
> new file mode 100644
> index 0000000..80f9170
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/476/fpu
> +powerpc/powerpc32/476
> 
> 
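
The Implies files quoted above chain the new directories together: 476 implies 464, 464 implies 440, and 440 implies 405, so the routines placed under 405 are reachable from all four platforms. A simplified C sketch of following that chain; it ignores the /fpu entries and is not how configure actually processes Implies files, and the table and function names are hypothetical:

```c
#include <stddef.h>
#include <string.h>

struct implies_entry {
    const char *dir;       /* sysdeps directory containing an Implies file */
    const char *implied;   /* non-fpu directory it names */
};

/* Taken from the sysdeps/powerpc/powerpc32/{440,464,476}/Implies hunks.  */
static const struct implies_entry implies_table[] = {
    { "powerpc/powerpc32/476", "powerpc/powerpc32/464" },
    { "powerpc/powerpc32/464", "powerpc/powerpc32/440" },
    { "powerpc/powerpc32/440", "powerpc/powerpc32/405" },
};

static const char *implied_by(const char *dir)
{
    size_t n = sizeof implies_table / sizeof implies_table[0];
    for (size_t i = 0; i < n; i++)
        if (strcmp(implies_table[i].dir, dir) == 0)
            return implies_table[i].implied;
    return NULL;
}

/* Write the search order starting at start into out[0..max-1] and return
   the number of directories stored.  */
static size_t resolve_chain(const char *start, const char **out, size_t max)
{
    size_t n = 0;
    for (const char *d = start; d != NULL && n < max; d = implied_by(d))
        out[n++] = d;
    return n;
}
```
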



* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2010-10-04 18:54     ` Luis Machado
@ 2010-12-13 20:26       ` Ryan Arnold
  2011-01-18 13:16         ` Ryan Arnold
  0 siblings, 1 reply; 19+ messages in thread
From: Ryan Arnold @ 2010-12-13 20:26 UTC (permalink / raw)
  To: luisgpm; +Cc: Ryan Arnold, libc-ports, Todd Iglehart, Josh Boyer

On Mon, 2010-10-04 at 15:53 -0300, Luis Machado wrote:
> Ping?
> 
> On Fri, 2010-09-03 at 12:00 -0300, Luis Machado wrote:
> > > Since Todd doesn't have copyright assignment these changes are
> > > contributed to the FSF by IBM without author/contributor attribution.
> > > 
> > > You can simply attribute the changes to him in the email leaving his
> > > name out of the sources per FSF policy and submit them on IBM's
> > > behalf.
> > > 
> > > Ryan
> > 
> > Thanks.
> > 
> > Follows the updated patch without Todd's name on the sources.
> > 
> > Luis
> > 
> > 
> > 2010-09-03  Luis Machado  <luisgpm@br.ibm.com>
> > 
> > 	* sysdeps/powerpc/dl-procinfo.c: New file.
> > 	* sysdeps/powerpc/dl-procinfo.h: New file.
> > 	* sysdeps/powerpc/powerpc32/405/memcmp.S: New file.
> > 	* sysdeps/powerpc/powerpc32/405/memcpy.S: New file.
> > 	* sysdeps/powerpc/powerpc32/405/memset.S: New file.
> > 	* sysdeps/powerpc/powerpc32/405/strcmp.S: New file.
> > 	* sysdeps/powerpc/powerpc32/405/strcpy.S: New file.
> > 	* sysdeps/powerpc/powerpc32/405/strlen.S: New file.
> > 	* sysdeps/powerpc/powerpc32/405/strncmp.S: New file.
> > 	* sysdeps/powerpc/powerpc32/440/Implies: New file.
> > 	* sysdeps/powerpc/powerpc32/464/Implies: New file.
> > 	* sysdeps/powerpc/powerpc32/476/Implies: New file.
> > 	* sysdeps/powerpc/powerpc32/Makefile: New file.
> > 	* sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies: New file.
> > 	* sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies: New file.
> > 	* sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies: New file.
> > 	* sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies: New file.
> > 
> > diff --git a/sysdeps/powerpc/dl-procinfo.c b/sysdeps/powerpc/dl-procinfo.c
> > new file mode 100644
> > index 0000000..60fb465
> > --- /dev/null
> > +++ b/sysdeps/powerpc/dl-procinfo.c
> > @@ -0,0 +1,96 @@
> > +/* Data for processor capability information.  PowerPC version.
> > +   Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, write to the Free
> > +   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
> > +   02111-1307 USA.  */
> > +
> > +/* This information must be kept in sync with the _DL_HWCAP_COUNT and
> > +   _DL_PLATFORM_COUNT definitions in procinfo.h.
> > +
> > +   If anything should be added here check whether the size of each string
> > +   is still ok with the given array size.
> > +
> > +   All the #ifdefs in the definitions are quite irritating but
> > +   necessary if we want to avoid duplicating the information.  There
> > +   are three different modes:
> > +
> > +   - PROCINFO_DECL is defined.  This means we are only interested in
> > +     declarations.
> > +
> > +   - PROCINFO_DECL is not defined:
> > +
> > +     + if SHARED is defined the file is included in an array
> > +       initializer.  The .element = { ... } syntax is needed.
> > +
> > +     + if SHARED is not defined a normal array initialization is
> > +       needed.
> > +  */
> > +
> > +#ifndef PROCINFO_CLASS
> > +# define PROCINFO_CLASS
> > +#endif
> > +
> > +#if !defined PROCINFO_DECL && defined SHARED
> > +  ._dl_powerpc_cap_flags
> > +#else
> > +PROCINFO_CLASS const char _dl_powerpc_cap_flags[25][10]
> > +#endif
> > +#ifndef PROCINFO_DECL
> > += {
> > +    "vsx",
> > +    "arch_2_06", "power6x", "dfp", "pa6t",
> > +    "arch_2_05", "ic_snoop", "smt", "booke",
> > +    "cellbe", "power5+", "power5", "power4",
> > +    "notb", "efpdouble", "efpsingle", "spe",
> > +    "ucache", "4xxmac", "mmu", "fpu",
> > +    "altivec", "ppc601", "ppc64", "ppc32",
> > +  }
> > +#endif
> > +#if !defined SHARED || defined PROCINFO_DECL
> > +;
> > +#else
> > +,
> > +#endif
> > +
> > +#if !defined PROCINFO_DECL && defined SHARED
> > +  ._dl_powerpc_platforms
> > +#else
> > +PROCINFO_CLASS const char _dl_powerpc_platforms[12][12]
> > +#endif
> > +#ifndef PROCINFO_DECL
> > += {
> > +    [PPC_PLATFORM_POWER4] = "power4",
> > +    [PPC_PLATFORM_PPC970] = "ppc970",
> > +    [PPC_PLATFORM_POWER5] = "power5",
> > +    [PPC_PLATFORM_POWER5_PLUS] = "power5+",
> > +    [PPC_PLATFORM_POWER6] = "power6",
> > +    [PPC_PLATFORM_CELL_BE] = "ppc-cell-be",
> > +    [PPC_PLATFORM_POWER6X] = "power6x",
> > +    [PPC_PLATFORM_POWER7] = "power7",
> > +    [PPC_PLATFORM_PPC405] = "ppc405",
> > +    [PPC_PLATFORM_PPC440] = "ppc440",
> > +    [PPC_PLATFORM_PPC464] = "ppc464",
> > +    [PPC_PLATFORM_PPC476] = "ppc476"
> > +  }
> > +#endif
> > +#if !defined SHARED || defined PROCINFO_DECL
> > +;
> > +#else
> > +,
> > +#endif
> > +
> > +#undef PROCINFO_DECL
> > +#undef PROCINFO_CLASS
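
The comment in dl-procinfo.c asks that string sizes be rechecked whenever an entry is added. C accepts a string literal that exactly fills a char array and silently drops the terminator, so an entry that is one character too long would not necessarily be diagnosed. A minimal sketch of such a check, with the names copied from the quoted initializer; all_names_fit and the macros are hypothetical helpers, not glibc code:

```c
#include <stddef.h>
#include <string.h>

#define PLATFORM_COUNT     12   /* first dimension of _dl_powerpc_platforms */
#define PLATFORM_NAME_SIZE 12   /* second dimension */

static const char *const platform_names[PLATFORM_COUNT] = {
    "power4", "ppc970", "power5", "power5+", "power6", "ppc-cell-be",
    "power6x", "power7", "ppc405", "ppc440", "ppc464", "ppc476",
};

/* Return 1 if every name, including its terminating null, fits in size
   bytes, else 0.  */
static int all_names_fit(const char *const *names, size_t count, size_t size)
{
    for (size_t i = 0; i < count; i++)
        if (strlen(names[i]) + 1 > size)
            return 0;
    return 1;
}
```

"ppc-cell-be" is 11 characters, so with its terminator it exactly fills a 12-byte row; the four new names are well within the limit.
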
> > diff --git a/sysdeps/powerpc/dl-procinfo.h b/sysdeps/powerpc/dl-procinfo.h
> > new file mode 100644
> > index 0000000..87279de
> > --- /dev/null
> > +++ b/sysdeps/powerpc/dl-procinfo.h
> > @@ -0,0 +1,168 @@
> > +/* Processor capability information handling macros.  PowerPC version.
> > +   Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, write to the Free
> > +   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
> > +   02111-1307 USA.  */
> > +
> > +#ifndef _DL_PROCINFO_H
> > +#define _DL_PROCINFO_H	1
> > +
> > +#include <ldsodefs.h>
> > +#include <sysdep.h>		/* This defines the PPC_FEATURE_* macros.  */
> > +
> > +/* There are 25 bits used, but they are bits 7..31.  */
> > +#define _DL_HWCAP_FIRST		7
> > +#define _DL_HWCAP_COUNT		32
> > +
> > +/* These bits influence library search.  */
> > +#define HWCAP_IMPORTANT		(PPC_FEATURE_HAS_ALTIVEC \
> > +				+ PPC_FEATURE_HAS_DFP)
> > +
> > +#define _DL_PLATFORMS_COUNT	12
> > +
> > +#define _DL_FIRST_PLATFORM	32
> > +/* Mask to filter out platforms.  */
> > +#define _DL_HWCAP_PLATFORM      (((1ULL << _DL_PLATFORMS_COUNT) - 1) \
> > +				<< _DL_FIRST_PLATFORM)
> > +
> > +/* Platform bits (relative to _DL_FIRST_PLATFORM).  */
> > +#define PPC_PLATFORM_POWER4	      0
> > +#define PPC_PLATFORM_PPC970	      1
> > +#define PPC_PLATFORM_POWER5	      2
> > +#define PPC_PLATFORM_POWER5_PLUS      3
> > +#define PPC_PLATFORM_POWER6	      4
> > +#define PPC_PLATFORM_CELL_BE	      5
> > +#define PPC_PLATFORM_POWER6X	      6
> > +#define PPC_PLATFORM_POWER7	      7
> > +#define PPC_PLATFORM_PPC405	      8
> > +#define PPC_PLATFORM_PPC440	      9
> > +#define PPC_PLATFORM_PPC464	      10
> > +#define PPC_PLATFORM_PPC476	      11
> > +
> > +static inline const char *
> > +__attribute__ ((unused))
> > +_dl_hwcap_string (int idx)
> > +{
> > +  return GLRO(dl_powerpc_cap_flags)[idx - _DL_HWCAP_FIRST];
> > +}
> > +
> > +static inline const char *
> > +__attribute__ ((unused))
> > +_dl_platform_string (int idx)
> > +{
> > +  return GLRO(dl_powerpc_platforms)[idx - _DL_FIRST_PLATFORM];
> > +}
> > +
> > +static inline int
> > +__attribute__ ((unused))
> > +_dl_string_hwcap (const char *str)
> > +{
> > +  for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
> > +    if (strcmp (str, _dl_hwcap_string (i)) == 0)
> > +      return i;
> > +  return -1;
> > +}
> > +
> > +static inline int
> > +__attribute__ ((unused, always_inline))
> > +_dl_string_platform (const char *str)
> > +{
> > +  if (str == NULL)
> > +    return -1;
> > +
> > +  if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_POWER4], 5) == 0)
> > +    {
> > +      int ret;
> > +      str += 5;
> > +      switch (*str)
> > +	{
> > +	case '4':
> > +	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER4;
> > +	  break;
> > +	case '5':
> > +	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5;
> > +	  if (str[1] == '+')
> > +	    {
> > +	      ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5_PLUS;
> > +	      ++str;
> > +	    }
> > +	  break;
> > +	case '6':
> > +	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6;
> > +	  if (str[1] == 'x')
> > +	    {
> > +	      ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6X;
> > +	      ++str;
> > +	    }
> > +	  break;
> > +	case '7':
> > +	  ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER7;
> > +	  break;
> > +	default:
> > +	  return -1;
> > +	}
> > +      if (str[1] == '\0')
> > +	return ret;
> > +    }
> > +  else if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970],
> > +		    3) == 0)
> > +    {
> > +      if (strcmp (str + 3, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970]
> > +			   + 3) == 0)
> > +	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC970;
> > +      else if (strcmp (str + 3,
> > +		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_CELL_BE] + 3)
> > +	       == 0)
> > +	return _DL_FIRST_PLATFORM + PPC_PLATFORM_CELL_BE;
> > +      else if (strcmp (str + 3,
> > +		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC405] + 3)
> > +	       == 0)
> > +	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC405;
> > +      else if (strcmp (str + 3,
> > +		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC440] + 3)
> > +	       == 0)
> > +	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC440;
> > +      else if (strcmp (str + 3,
> > +		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC464] + 3)
> > +	       == 0)
> > +	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC464;
> > +      else if (strcmp (str + 3,
> > +		       GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC476] + 3)
> > +	       == 0)
> > +	return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC476;
> > +    }
> > +
> > +  return -1;
> > +}
> > +
> > +#ifdef IS_IN_rtld
> > +static inline int
> > +__attribute__ ((unused))
> > +_dl_procinfo (int word)
> > +{
> > +  _dl_printf ("AT_HWCAP:       ");
> > +
> > +  for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
> > +    if (word & (1 << i))
> > +      _dl_printf (" %s", _dl_hwcap_string (i));
> > +
> > +  _dl_printf ("\n");
> > +
> > +  return 0;
> > +}
> > +#endif
> > +
> > +#endif /* dl-procinfo.h */
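
For the header above, raising _DL_PLATFORMS_COUNT to 12 widens the _DL_HWCAP_PLATFORM mask so that it covers the four new platform bits. A short C sketch of the arithmetic, using local copies of the constants; the function names are illustrative and not part of the header:

```c
#include <stdint.h>

#define PLATFORMS_COUNT 12   /* mirrors _DL_PLATFORMS_COUNT */
#define FIRST_PLATFORM  32   /* mirrors _DL_FIRST_PLATFORM */

/* Same expression as _DL_HWCAP_PLATFORM.  */
static uint64_t platform_mask(void)
{
    return ((1ULL << PLATFORMS_COUNT) - 1) << FIRST_PLATFORM;
}

/* Bit for a platform number given relative to FIRST_PLATFORM, for example
   11 for PPC_PLATFORM_PPC476.  */
static uint64_t platform_bit(unsigned platform)
{
    return 1ULL << (FIRST_PLATFORM + platform);
}
```

Platforms 8 through 11 therefore occupy bits 40 through 43 of the 64-bit mask.
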
> > diff --git a/sysdeps/powerpc/powerpc32/405/memcmp.S b/sysdeps/powerpc/powerpc32/405/memcmp.S
> > new file mode 100644
> > index 0000000..653d3b5
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/memcmp.S
> > @@ -0,0 +1,131 @@
> > +/* Optimized memcmp implementation for PowerPC476.
> > +   Copyright (C) 2010 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, write to the Free
> > +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > +   02110-1301 USA.  */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* memcmp
> > +
> > +	r3:source1 address, return equality
> > +	r4:source2 address
> > +	r5:byte count
> > +
> > +	Check 2 words from src1 and src2. If unequal jump to end and
> > +	return src1 > src2 or src1 < src2.
> > +	If count = zero check bytes before zero counter and then jump to end and
> > +	return src1 > src2, src1 < src2 or src1 = src2.
> > +	If src1 = src2 and bytes remain, repeat. */
> > +
> > +EALIGN (BP_SYM (memcmp), 5, 0)
> > +	srwi.	r6,r5,5
> > +	beq	L(preword2_count_loop)
> > +	mtctr	r6
> > +	clrlwi	r5,r5,27
> > +
> > +L(word8_compare_loop):
> > +	lwz	r10,0(r3)
> > +	lwz	r6,4(r3)
> > +	lwz	r8,0(r4)
> > +	lwz	r9,4(r4)
> > +	cmplw	cr5,r8,r10
> > +	cmplw	cr1,r9,r6
> > +	bne	cr5,L(st2)
> > +	bne	cr1,L(st1)
> > +	lwz	r10,8(r3)
> > +	lwz	r6,12(r3)
> > +	lwz	r8,8(r4)
> > +	lwz	r9,12(r4)
> > +	cmplw	cr5,r8,r10
> > +	cmplw	cr1,r9,r6
> > +	bne	cr5,L(st2)
> > +	bne	cr1,L(st1)
> > +	lwz	r10,16(r3)
> > +	lwz	r6,20(r3)
> > +	lwz	r8,16(r4)
> > +	lwz	r9,20(r4)
> > +	cmplw	cr5,r8,r10
> > +	cmplw	cr1,r9,r6
> > +	bne	cr5,L(st2)
> > +	bne	cr1,L(st1)
> > +	lwz	r10,24(r3)
> > +	lwz	r6,28(r3)
> > +	addi	r3,r3,0x20
> > +	lwz	r8,24(r4)
> > +	lwz	r9,28(r4)
> > +	addi	r4,r4,0x20
> > +	cmplw	cr5,r8,r10
> > +	cmplw	cr1,r9,r6
> > +	bne	cr5,L(st2)
> > +	bne	cr1,L(st1)
> > +	bdnz	L(word8_compare_loop)
> > +
> > +L(preword2_count_loop):
> > +	srwi.	r6,r5,3
> > +	beq	L(prebyte_count_loop)
> > +	mtctr	r6
> > +	clrlwi  r5,r5,29
> > +
> > +L(word2_count_loop):
> > +	lwz	r10,0(r3)
> > +	lwz	r6,4(r3)
> > +	addi	r3,r3,0x08
> > +	lwz	r8,0(r4)
> > +	lwz	r9,4(r4)
> > +	addi	r4,r4,0x08
> > +	cmplw	cr5,r8,r10
> > +	cmplw	cr1,r9,r6
> > +	bne	cr5,L(st2)
> > +	bne	cr1,L(st1)
> > +	bdnz	L(word2_count_loop)
> > +
> > +L(prebyte_count_loop):
> > +	addi	r5,r5,1
> > +	mtctr	r5
> > +	bdz	L(end_memcmp)
> > +
> > +L(byte_count_loop):
> > +	lbz	r6,0(r3)
> > +	addi	r3,r3,0x01
> > +	lbz	r8,0(r4)
> > +	addi	r4,r4,0x01
> > +	cmplw	cr5,r8,r6
> > +	bne	cr5,L(st2)
> > +	bdnz	L(byte_count_loop)
> > +
> > +L(end_memcmp):
> > +	addi	r3,r0,0
> > +	blr
> > +
> > +L(l_r):
> > +	addi	r3,r0,1
> > +	blr
> > +
> > +L(st1):
> > +	blt	cr1,L(l_r)
> > +	addi	r3,r0,-1
> > +	blr
> > +
> > +L(st2):
> > +	blt	cr5,L(l_r)
> > +	addi	r3,r0,-1
> > +	blr
> > +END (BP_SYM (memcmp))
> > +libc_hidden_builtin_def (memcmp)
> > +weak_alias (memcmp,bcmp)
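
The memcmp above splits the byte count into three stages using srwi. and clrlwi: 32-byte blocks, then 8-byte blocks, then single bytes. A minimal C sketch of that split; the struct and function names are hypothetical:

```c
#include <stddef.h>

struct memcmp_plan {
    size_t blocks32;     /* iterations of L(word8_compare_loop) */
    size_t blocks8;      /* iterations of L(word2_count_loop) */
    size_t tail_bytes;   /* iterations of L(byte_count_loop) */
};

static struct memcmp_plan plan_memcmp(size_t n)
{
    struct memcmp_plan p;

    p.blocks32 = n >> 5;    /* srwi. r6,r5,5 */
    n &= 31u;               /* clrlwi r5,r5,27 */
    p.blocks8 = n >> 3;     /* srwi. r6,r5,3 */
    p.tail_bytes = n & 7u;  /* clrlwi r5,r5,29 */
    return p;
}
```

For a count of 47 this gives one 32-byte block, one 8-byte block, and 7 trailing bytes.
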
> > diff --git a/sysdeps/powerpc/powerpc32/405/memcpy.S b/sysdeps/powerpc/powerpc32/405/memcpy.S
> > new file mode 100644
> > index 0000000..a654c73
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/memcpy.S
> > @@ -0,0 +1,133 @@
> > +/* Optimized memcpy implementation for PowerPC476.
> > +   Copyright (C) 2010 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, write to the Free
> > +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > +   02110-1301 USA.  */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* memcpy
> > +
> > +	r0:return address
> > +	r3:destination address
> > +	r4:source address
> > +	r5:byte count
> > +
> > +	Save return address in r0.
> > +	If destination and source are unaligned and copy count is greater than 256
> > +	then copy 0-3 bytes to make destination aligned.
> > +	If 32 or more bytes to copy we use 32 byte copy loop.
> > +	Finally we copy 0-31 extra bytes. */
> > +
> > +EALIGN (BP_SYM (memcpy), 5, 0)
> > +/* Check if bytes to copy are greater than 256 and if
> > +	source and destination are unaligned */
> > +	cmpwi	r5,0x0100
> > +	addi	r0,r3,0
> > +	ble	L(string_count_loop)
> > +	neg	r6,r3
> > +	clrlwi. r6,r6,30
> > +	beq	L(string_count_loop)
> > +	neg	r6,r4
> > +	clrlwi. r6,r6,30
> > +	beq	L(string_count_loop)
> > +	mtctr	r6
> > +	subf	r5,r6,r5
> > +
> > +L(unaligned_bytecopy_loop): /* Align destination by copying 0-3 bytes */
> > +	lbz	r8,0x0(r4)
> > +	addi	r4,r4,1
> > +	stb	r8,0x0(r3)
> > +	addi	r3,r3,1
> > +	bdnz	L(unaligned_bytecopy_loop)
> > +	srwi.	r7,r5,5
> > +	beq	L(preword2_count_loop)
> > +	mtctr	r7
> > +
> > +L(word8_count_loop_no_dcbt): /* Copy 32 bytes at a time */
> > +	lwz	r6,0(r4)
> > +	lwz	r7,4(r4)
> > +	lwz	r8,8(r4)
> > +	lwz	r9,12(r4)
> > +	subi	r5,r5,0x20
> > +	stw	r6,0(r3)
> > +	stw	r7,4(r3)
> > +	stw	r8,8(r3)
> > +	stw	r9,12(r3)
> > +	lwz	r6,16(r4)
> > +	lwz	r7,20(r4)
> > +	lwz	r8,24(r4)
> > +	lwz	r9,28(r4)
> > +	addi	r4,r4,0x20
> > +	stw	r6,16(r3)
> > +	stw	r7,20(r3)
> > +	stw	r8,24(r3)
> > +	stw	r9,28(r3)
> > +	addi	r3,r3,0x20
> > +	bdnz	L(word8_count_loop_no_dcbt)
> > +
> > +L(preword2_count_loop): /* Copy remaining 0-31 bytes */
> > +	clrlwi. r12,r5,27
> > +	beq	L(end_memcpy)
> > +	mtxer	r12
> > +	lswx	r5,0,r4
> > +	stswx	r5,0,r3
> > +	mr	 r3,r0
> > +	blr
> > +
> > +L(string_count_loop): /* Copy odd 0-15 bytes */
> > +	clrlwi. r12,r5,28
> > +	add	r3,r3,r5
> > +	add	r4,r4,r5
> > +	beq	L(pre_string_copy)
> > +	mtxer	r12
> > +	subf	r4,r12,r4
> > +	subf	r3,r12,r3
> > +	lswx	r6,0,r4
> > +	stswx	r6,0,r3
> > +
> > +L(pre_string_copy): /* Check how many 16 byte chunks to copy */
> > +	srwi.	r7,r5,4
> > +	beq	L(end_memcpy)
> > +	mtctr	r7
> > +
> > +L(word4_count_loop_no_dcbt): /* Copy 16 bytes per count, unrolled twice */
> > +	lwz	r6,-4(r4)
> > +	lwz	r7,-8(r4)
> > +	lwz	r8,-12(r4)
> > +	lwzu	r9,-16(r4)
> > +	stw	r6,-4(r3)
> > +	stw	r7,-8(r3)
> > +	stw	r8,-12(r3)
> > +	stwu	r9,-16(r3)
> > +	bdz	L(end_memcpy)
> > +	lwz	r6,-4(r4)
> > +	lwz	r7,-8(r4)
> > +	lwz	r8,-12(r4)
> > +	lwzu	r9,-16(r4)
> > +	stw	r6,-4(r3)
> > +	stw	r7,-8(r3)
> > +	stw	r8,-12(r3)
> > +	stwu	r9,-16(r3)
> > +	bdnz	L(word4_count_loop_no_dcbt)
> > +
> > +L(end_memcpy):
> > +	mr	 r3,r0
> > +	blr
> > +END (BP_SYM (memcpy))
> > +libc_hidden_builtin_def (memcpy)
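
The L(string_count_loop) path of the memcpy above works from the end of the buffers: it first copies the odd count & 15 bytes that sit at the very end with a string load/store, then walks backwards in 16-byte chunks. A portable C sketch of that ordering, using byte loops where the assembly uses lswx/stswx and lwz/stw; copy_tail_first is a hypothetical name and, like memcpy, it assumes the buffers do not overlap:

```c
#include <stddef.h>

static void *copy_tail_first(void *dst, const void *src, size_t n)
{
    /* Start with both pointers one past the end, as the assembly does
       after "add r3,r3,r5" and "add r4,r4,r5".  */
    unsigned char *d = (unsigned char *)dst + n;
    const unsigned char *s = (const unsigned char *)src + n;
    size_t tail = n & 15u;    /* clrlwi. r12,r5,28 */
    size_t chunks = n >> 4;   /* srwi. r7,r5,4 */

    /* Odd bytes at the end of the buffer.  */
    d -= tail;
    s -= tail;
    for (size_t i = 0; i < tail; i++)
        d[i] = s[i];

    /* Remaining 16-byte chunks, from high addresses to low.  */
    while (chunks-- > 0) {
        d -= 16;
        s -= 16;
        for (size_t i = 0; i < 16; i++)
            d[i] = s[i];
    }
    return dst;   /* the original destination, kept in r0 by the assembly */
}
```
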
> > diff --git a/sysdeps/powerpc/powerpc32/405/memset.S b/sysdeps/powerpc/powerpc32/405/memset.S
> > new file mode 100644
> > index 0000000..69d5d4c
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/memset.S
> > @@ -0,0 +1,155 @@
> > +/* Optimized memset implementation for PowerPC476.
> > +   Copyright (C) 2010 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, write to the Free
> > +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > +   02110-1301 USA.  */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* memset
> > +
> > +	r3:destination address and return address
> > +	r4:source integer to copy
> > +	r5:byte count
> > +	r11:sources integer to copy in all 32 bits of reg
> > +	r12:temp return address
> > +
> > +	Save return address in r12
> > +	If destination is unaligned and count is greater than 255 bytes
> > +	set 0-3 bytes to make destination aligned
> > +	If count is greater than 255 bytes and setting zero to memory
> > +	use dcbz to set memory when we can
> > +	otherwise do the following
> > +	If 16 or more words to set we use 16 word copy loop.
> > +	Finally we set 0-15 extra bytes with string store. */
> > +
> > +EALIGN (BP_SYM (memset), 5, 0)
> > +	rlwinm	r11,r4,0,24,31
> > +	rlwimi	r11,r4,8,16,23
> > +	rlwimi	r11,r11,16,0,15
> > +	addi	r12,r3,0
> > +	cmpwi	r5,0x00FF
> > +	ble	L(preword8_count_loop)
> > +	cmpwi	r4,0x00
> > +	beq	L(use_dcbz)
> > +	neg	r6,r3
> > +	clrlwi.	r6,r6,30
> > +	beq	L(preword8_count_loop)
> > +	addi	r8,0,1
> > +	mtctr	r6
> > +	subi	r3,r3,1
> > +
> > +L(unaligned_bytecopy_loop):
> > +	stbu	r11,0x1(r3)
> > +	subf.	r5,r8,r5
> > +	beq	L(end_memset)
> > +	bdnz	L(unaligned_bytecopy_loop)
> > +	addi	r3,r3,1
> > +
> > +L(preword8_count_loop):
> > +	srwi.	r6,r5,4
> > +	beq	L(preword2_count_loop)
> > +	mtctr	r6
> > +	addi	r3,r3,-4
> > +	mr	r8,r11
> > +	mr	r9,r11
> > +	mr	r10,r11
> > +
> > +L(word8_count_loop_no_dcbt):
> > +	stwu	r8,4(r3)
> > +	stwu	r9,4(r3)
> > +	subi	r5,r5,0x10
> > +	stwu	r10,4(r3)
> > +	stwu	r11,4(r3)
> > +	bdnz	L(word8_count_loop_no_dcbt)
> > +	addi	r3,r3,4
> > +
> > +L(preword2_count_loop):
> > +	clrlwi.	r7,r5,28
> > +	beq	L(end_memset)
> > +	mr	r8,r11
> > +	mr	r9,r11
> > +	mr	r10,r11
> > +	mtxer	r7
> > +	stswx	r8,0,r3
> > +
> > +L(end_memset):
> > +	addi	r3,r12,0
> > +	blr
> > +
> > +L(use_dcbz):
> > +	neg	r6,r3
> > +	clrlwi.	r7,r6,28
> > +	beq	L(skip_string_loop)
> > +	mr	r8,r11
> > +	mr	r9,r11
> > +	mr	r10,r11
> > +	subf	r5,r7,r5
> > +	mtxer	r7
> > +	stswx	r8,0,r3
> > +	add	r3,r3,r7
> > +
> > +L(skip_string_loop):
> > +	clrlwi	r8,r6,25
> > +	srwi.	r8,r8,4
> > +	beq	L(dcbz_pre_loop)
> > +	mtctr	r8
> > +
> > +L(word_loop):
> > +	stw	r11,0(r3)
> > +	subi	r5,r5,0x10
> > +	stw	r11,4(r3)
> > +	stw	r11,8(r3)
> > +	stw	r11,12(r3)
> > +	addi	r3,r3,0x10
> > +	bdnz	L(word_loop)
> > +
> > +L(dcbz_pre_loop):
> > +	srwi	r6,r5,7
> > +	mtctr	r6
> > +	addi	r7,0,0
> > +
> > +L(dcbz_loop):
> > +	dcbz	r3,r7
> > +	addi	r3,r3,0x80
> > +	subi	r5,r5,0x80
> > +	bdnz	L(dcbz_loop)
> > +	srwi.	r6,r5,4
> > +	beq	L(postword2_count_loop)
> > +	mtctr	r6
> > +
> > +L(postword8_count_loop):
> > +	stw	r11,0(r3)
> > +	subi	r5,r5,0x10
> > +	stw	r11,4(r3)
> > +	stw	r11,8(r3)
> > +	stw	r11,12(r3)
> > +	addi	r3,r3,0x10
> > +	bdnz	L(postword8_count_loop)
> > +
> > +L(postword2_count_loop):
> > +	clrlwi.	r7,r5,28
> > +	beq	L(end_memset)
> > +	mr	r8,r11
> > +	mr	r9,r11
> > +	mr	r10,r11
> > +	mtxer	r7
> > +	stswx	r8,0,r3
> > +	b	L(end_memset)
> > +END (BP_SYM (memset))
> > +libc_hidden_builtin_def (memset)
> > diff --git a/sysdeps/powerpc/powerpc32/405/strcmp.S b/sysdeps/powerpc/powerpc32/405/strcmp.S
> > new file mode 100644
> > index 0000000..6eb5b5a
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/strcmp.S
> > @@ -0,0 +1,137 @@
> > +/* Optimized strcmp implementation for PowerPC476.
> > +   Copyright (C) 2010 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, write to the Free
> > +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > +   02110-1301 USA.  */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* strcmp
> > +
> > +	Register Use
> > +	r0:temp return equality
> > +	r3:source1 address, return equality
> > +	r4:source2 address
> > +
> > +	Implementation description
> > +	Check 2 words from src1 and src2. If unequal jump to end and
> > +	return src1 > src2 or src1 < src2.
> > +	If null check bytes before null and then jump to end and
> > +	return src1 > src2, src1 < src2 or src1 = src2.
> > +	If src1 = src2 and no null, repeat. */
> > +
> > +EALIGN (BP_SYM(strcmp),5,0)
> > +	neg	r7,r3
> > +	clrlwi	r7,r7,20
> > +	neg	r8,r4
> > +	clrlwi	r8,r8,20
> > +	srwi.	r7,r7,5
> > +	beq	L(byte_loop)
> > +	srwi.	r8,r8,5
> > +	beq	L(byte_loop)
> > +	cmplw	r7,r8
> > +	mtctr	r7
> > +	ble	L(big_loop)
> > +	mtctr	r8
> > +
> > +L(big_loop):
> > +	lwz	r5,0(r3)
> > +	lwz	r6,4(r3)
> > +	lwz	r8,0(r4)
> > +	lwz	r9,4(r4)
> > +	dlmzb.	r12,r5,r6
> > +	bne	L(end_check)
> > +	cmplw	r5,r8
> > +	bne	L(st1)
> > +	cmplw	r6,r9
> > +	bne	L(st1)
> > +	lwz	r5,8(r3)
> > +	lwz	r6,12(r3)
> > +	lwz	r8,8(r4)
> > +	lwz	r9,12(r4)
> > +	dlmzb.	r12,r5,r6
> > +	bne	L(end_check)
> > +	cmplw	r5,r8
> > +	bne	L(st1)
> > +	cmplw	r6,r9
> > +	bne	L(st1)
> > +	lwz	r5,16(r3)
> > +	lwz	r6,20(r3)
> > +	lwz	r8,16(r4)
> > +	lwz	r9,20(r4)
> > +	dlmzb.	r12,r5,r6
> > +	bne	L(end_check)
> > +	cmplw	r5,r8
> > +	bne	L(st1)
> > +	cmplw	r6,r9
> > +	bne	L(st1)
> > +	lwz	r5,24(r3)
> > +	lwz	r6,28(r3)
> > +	addi	r3,r3,0x20
> > +	lwz	r8,24(r4)
> > +	lwz	r9,28(r4)
> > +	addi	r4,r4,0x20
> > +	dlmzb.	r12,r5,r6
> > +	bne	L(end_check)
> > +	cmplw	r5,r8
> > +	bne	L(st1)
> > +	cmplw	r6,r9
> > +	bne	L(st1)
> > +	bdnz	L(big_loop)
> > +	b	L(byte_loop)
> > +
> > +L(end_check):
> > +	subfic	r12,r12,4
> > +	blt	L(end_check2)
> > +	rlwinm	r12,r12,3,0,31
> > +	srw	r5,r5,r12
> > +	srw	r8,r8,r12
> > +	cmplw	r5,r8
> > +	bne	L(st1)
> > +	b	L(end_strcmp)
> > +
> > +L(end_check2):
> > +	addi	r12,r12,4
> > +	cmplw	r5,r8
> > +	rlwinm	r12,r12,3,0,31
> > +	bne	L(st1)
> > +	srw	r6,r6,r12
> > +	srw	r9,r9,r12
> > +	cmplw	r6,r9
> > +	bne	L(st1)
> > +
> > +L(end_strcmp):
> > +	addi	r3,r0,0
> > +	blr
> > +
> > +L(st1):
> > +	mfcr	r3
> > +	blr
> > +
> > +L(byte_loop):
> > +	lbz	r5,0(r3)
> > +	addi	r3,r3,1
> > +	lbz	r6,0(r4)
> > +	addi	r4,r4,1
> > +	cmplw	r5,r6
> > +	bne	L(st1)
> > +	cmpwi	r5,0
> > +	beq	L(end_strcmp)
> > +	b	L(byte_loop)
> > +END (BP_SYM (strcmp))
> > +libc_hidden_builtin_def (strcmp)
> > diff --git a/sysdeps/powerpc/powerpc32/405/strcpy.S b/sysdeps/powerpc/powerpc32/405/strcpy.S
> > new file mode 100644
> > index 0000000..025ac16
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/strcpy.S
> > @@ -0,0 +1,110 @@
> > +/* Optimized strcpy implementation for PowerPC476.
> > +   Copyright (C) 2010 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, write to the Free
> > +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > +   02110-1301 USA.  */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* strcpy
> > +
> > +	Register Use
> > +	r3:destination and return address
> > +	r4:source address
> > +	r10:temp destination address
> > +
> > +	Implementation description
> > +	Loop by checking 2 words at a time, with dlmzb. Check if there is a null
> > +	in the 2 words. If there is a null jump to end checking to determine
> > +	where in the last 8 bytes it is. Copy the appropriate bytes of the last
> > +	8 according to the null position. */
> > +
> > +EALIGN (BP_SYM (strcpy), 5, 0)
> > +	neg	r7,r4
> > +	subi	r4,r4,1
> > +	clrlwi.	r8,r7,29
> > +	subi	r10,r3,1
> > +	beq	L(pre_word8_loop)
> > +	mtctr	r8
> > +
> > +L(loop):
> > +	lbzu	r5,0x01(r4)
> > +	cmpi	cr5,r5,0x0
> > +	stbu	r5,0x01(r10)
> > +	beq	cr5,L(end_strcpy)
> > +	bdnz	L(loop)
> > +
> > +L(pre_word8_loop):
> > +	subi	r4,r4,3
> > +	subi	r10,r10,3
> > +
> > +L(word8_loop):
> > +	lwzu	r5,0x04(r4)
> > +	lwzu	r6,0x04(r4)
> > +	dlmzb.	r11,r5,r6
> > +	bne	L(byte_copy)
> > +	stwu	r5,0x04(r10)
> > +	stwu	r6,0x04(r10)
> > +	lwzu	r5,0x04(r4)
> > +	lwzu	r6,0x04(r4)
> > +	dlmzb.	r11,r5,r6
> > +	bne	L(byte_copy)
> > +	stwu	r5,0x04(r10)
> > +	stwu	r6,0x04(r10)
> > +	lwzu	r5,0x04(r4)
> > +	lwzu	r6,0x04(r4)
> > +	dlmzb.	r11,r5,r6
> > +	bne	L(byte_copy)
> > +	stwu	r5,0x04(r10)
> > +	stwu	r6,0x04(r10)
> > +	lwzu	r5,0x04(r4)
> > +	lwzu	r6,0x04(r4)
> > +	dlmzb.	r11,r5,r6
> > +	bne	L(byte_copy)
> > +	stwu	r5,0x04(r10)
> > +	stwu	r6,0x04(r10)
> > +	b	L(word8_loop)
> > +
> > +L(last_bytes_copy):
> > +	stwu	r5,0x04(r10)
> > +	subi	r11,r11,4
> > +	mtctr	r11
> > +	addi	r10,r10,3
> > +	subi	r4,r4,1
> > +
> > +L(last_bytes_copy_loop):
> > +	lbzu	r5,0x01(r4)
> > +	stbu	r5,0x01(r10)
> > +	bdnz	L(last_bytes_copy_loop)
> > +	blr
> > +
> > +L(byte_copy):
> > +	blt	L(last_bytes_copy)
> > +	mtctr	r11
> > +	addi	r10,r10,3
> > +	subi	r4,r4,5
> > +
> > +L(last_bytes_copy_loop2):
> > +	lbzu	r5,0x01(r4)
> > +	stbu	r5,0x01(r10)
> > +	bdnz	L(last_bytes_copy_loop2)
> > +
> > +L(end_strcpy):
> > +	blr
> > +END (BP_SYM (strcpy))
> > +libc_hidden_builtin_def (strcpy)
> > diff --git a/sysdeps/powerpc/powerpc32/405/strlen.S b/sysdeps/powerpc/powerpc32/405/strlen.S
> > new file mode 100644
> > index 0000000..146b582
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/strlen.S
> > @@ -0,0 +1,78 @@
> > +/* Optimized strlen implementation for PowerPC476.
> > +   Copyright (C) 2010 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, write to the Free
> > +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > +   02110-1301 USA.  */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* strlen
> > +
> > +	Register Use
> > +	r3:source address and return length of string
> > +	r4:byte counter
> > +
> > +	Implementation description
> > +	Load 2 words at a time and count bytes, if we find null we subtract one from
> > +	the count and return the count value. We need to subtract one because
> > +	we don't count the null character as a byte. */
> > +
> > +EALIGN (BP_SYM (strlen),5,0)
> > +	neg	r7,r3
> > +	clrlwi.	r8,r7,29
> > +	addi	r4,0,0
> > +	beq	L(byte_count_loop)
> > +	mtctr	r8
> > +
> > +L(loop):
> > +	lbz	r5,0(r3)
> > +	cmpi	cr5,r5,0x0
> > +	addi	r3,r3,0x1
> > +	addi	r4,r4,0x1
> > +	beq	cr5,L(end_strlen)
> > +	bdnz	L(loop)
> > +
> > +L(byte_count_loop):
> > +	lwz	r5,0(r3)
> > +	lwz	r6,4(r3)
> > +	dlmzb.	r12,r5,r6
> > +	add	r4,r4,r12
> > +	bne	L(end_strlen)
> > +	lwz	r5,8(r3)
> > +	lwz	r6,12(r3)
> > +	dlmzb.	r12,r5,r6
> > +	add	r4,r4,r12
> > +	bne	L(end_strlen)
> > +	lwz	r5,16(r3)
> > +	lwz	r6,20(r3)
> > +	dlmzb.	r12,r5,r6
> > +	add	r4,r4,r12
> > +	bne	L(end_strlen)
> > +	lwz	r5,24(r3)
> > +	lwz	r6,28(r3)
> > +	addi	r3,r3,0x20
> > +	dlmzb.	r12,r5,r6
> > +	add	r4,r4,r12
> > +	bne	L(end_strlen)
> > +	b	L(byte_count_loop)
> > +
> > +L(end_strlen):
> > +	addi	r3,r4,-1
> > +	blr
> > +END (BP_SYM (strlen))
> > +libc_hidden_builtin_def (strlen)
> > diff --git a/sysdeps/powerpc/powerpc32/405/strncmp.S b/sysdeps/powerpc/powerpc32/405/strncmp.S
> > new file mode 100644
> > index 0000000..c1beb23
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/strncmp.S
> > @@ -0,0 +1,131 @@
> > +/* Optimized strncmp implementation for PowerPC476.
> > +   Copyright (C) 2010 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library; if not, write to the Free
> > +   Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > +   02110-1301 USA.  */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* strncmp
> > +
> > +	Register Use
> > +	r0:temp return equality
> > +	r3:source1 address, return equality
> > +	r4:source2 address
> > +	r5:byte count
> > +
> > +	Implementation description
> > +	Touch in 3 lines of D-cache.
> > +	If source1 or source2 is unaligned copy 0-3 bytes to make source1 aligned
> > +	Check 2 words from src1 and src2. If unequal jump to end and
> > +	return src1 > src2 or src1 < src2.
> > +	If null check bytes before null and then jump to end and
> > +	return src1 > src2, src1 < src2 or src1 = src2.
> > +	If count = zero check bytes before zero counter and then jump to end and
> > +	return src1 > src2, src1 < src2 or src1 = src2.
> > +	If src1 = src2 and no null, repeat. */
> > +
> > +EALIGN (BP_SYM(strncmp),5,0)
> > +	neg	r7,r3
> > +	clrlwi	r7,r7,20
> > +	neg	r8,r4
> > +	clrlwi	r8,r8,20
> > +	srwi.	r7,r7,3
> > +	beq	L(prebyte_count_loop)
> > +	srwi.	r8,r8,3
> > +	beq	L(prebyte_count_loop)
> > +	cmplw	r7,r8
> > +	mtctr	r7
> > +	ble	L(preword2_count_loop)
> > +	mtctr	r8
> > +
> > +L(preword2_count_loop):
> > +	srwi.	r6,r5,3
> > +	beq	L(prebyte_count_loop)
> > +	mfctr	r7
> > +	cmplw	r6,r7
> > +	bgt	L(set_count_loop)
> > +	mtctr	r6
> > +	clrlwi	r5,r5,29
> > +
> > +L(word2_count_loop):
> > +	lwz	r10,0(r3)
> > +	lwz	r6,4(r3)
> > +	addi	r3,r3,0x08
> > +	lwz	r8,0(r4)
> > +	lwz	r9,4(r4)
> > +	addi	r4,r4,0x08
> > +	dlmzb.	r12,r10,r6
> > +	bne	L(end_check)
> > +	cmplw	r10,r8
> > +	bne	L(st1)
> > +	cmplw	r6,r9
> > +	bne	L(st1)
> > +	bdnz	L(word2_count_loop)
> > +
> > +L(prebyte_count_loop):
> > +	addi	r5,r5,1
> > +	mtctr	r5
> > +	bdz	L(end_strncmp)
> > +
> > +L(byte_count_loop):
> > +	lbz	r6,0(r3)
> > +	addi	r3,r3,1
> > +	lbz	r7,0(r4)
> > +	addi	r4,r4,1
> > +	cmplw	r6,r7
> > +	bne	L(st1)
> > +	cmpwi	r6,0
> > +	beq	L(end_strncmp)
> > +	bdnz	L(byte_count_loop)
> > +	b	L(end_strncmp)
> > +
> > +L(set_count_loop):
> > +	slwi	r7,r7,3
> > +	subf	r5,r7,r5
> > +	b	L(word2_count_loop)
> > +
> > +L(end_check):
> > +	subfic	r12,r12,4
> > +	blt	L(end_check2)
> > +	rlwinm	r12,r12,3,0,31
> > +	srw	r10,r10,r12
> > +	srw	r8,r8,r12
> > +	cmplw	r10,r8
> > +	bne	L(st1)
> > +	b	L(end_strncmp)
> > +
> > +L(end_check2):
> > +	addi	r12,r12,4
> > +	cmplw	r10,r8
> > +	rlwinm	r12,r12,3,0,31
> > +	bne	L(st1)
> > +	srw	r6,r6,r12
> > +	srw	r9,r9,r12
> > +	cmplw	r6,r9
> > +	bne	L(st1)
> > +
> > +L(end_strncmp):
> > +	addi	r3,r0,0
> > +	blr
> > +
> > +L(st1):
> > +	mfcr	r3
> > +	blr
> > +END (BP_SYM (strncmp))
> > +libc_hidden_builtin_def (strncmp)
> > diff --git a/sysdeps/powerpc/powerpc32/440/Implies b/sysdeps/powerpc/powerpc32/440/Implies
> > new file mode 100644
> > index 0000000..70c0d2e
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/440/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/405/fpu
> > +powerpc/powerpc32/405
> > diff --git a/sysdeps/powerpc/powerpc32/464/Implies b/sysdeps/powerpc/powerpc32/464/Implies
> > new file mode 100644
> > index 0000000..c3e52c5
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/464/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/440/fpu
> > +powerpc/powerpc32/440
> > diff --git a/sysdeps/powerpc/powerpc32/476/Implies b/sysdeps/powerpc/powerpc32/476/Implies
> > new file mode 100644
> > index 0000000..2829f9c
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/476/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/464/fpu
> > +powerpc/powerpc32/464
> > diff --git a/sysdeps/powerpc/powerpc32/Makefile b/sysdeps/powerpc/powerpc32/Makefile
> > new file mode 100644
> > index 0000000..3d235de
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/Makefile
> > @@ -0,0 +1,8 @@
> > +# Some Powerpc32 variants assume soft-fp is the default even though there is
> > +# an fp variant so provide -mhard-float if --with-fp is explicitly passed.
> > +
> > +ifeq ($(with-fp),yes)
> > ++cflags += -mhard-float
> > +ASFLAGS += -mhard-float
> > +sysdep-LDFLAGS += -mhard-float
> > +endif
> > diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
> > new file mode 100644
> > index 0000000..70c0d2e
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/405/fpu
> > +powerpc/powerpc32/405
> > diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
> > new file mode 100644
> > index 0000000..c3e52c5
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/440/fpu
> > +powerpc/powerpc32/440
> > diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
> > new file mode 100644
> > index 0000000..2829f9c
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/464/fpu
> > +powerpc/powerpc32/464
> > diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
> > new file mode 100644
> > index 0000000..80f9170
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/476/fpu
> > +powerpc/powerpc32/476
> > 
> > 

Sorry for the delinquent response.  This looks good to me and I think it
should be checked in.

I'd like for someone with a 405, 440, or 464 to test it further.  As far
as we know the code only uses instructions available on all of these
platforms.
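
One way to do that testing, sketched in C: a differential check that runs
memset, memcpy, strlen and strcmp from the libc under test over a range of
alignments and lengths and compares each result against a byte-at-a-time
reference loop. This is not part of the patch; `run_string_checks`,
`STRING_CHECK_MAIN` and the buffer sizes are made up for illustration, and
it assumes a build with `-fno-builtin` so that GCC calls the libc routines
instead of expanding them inline.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ALIGN 16
#define MAX_LEN   600   /* crosses the 255-byte threshold in the memset comment */
#define BUF       (2 * MAX_ALIGN + MAX_LEN + 8)

static unsigned char dst[BUF], src[BUF], want[BUF];

/* Byte-at-a-time helpers.  The volatile accesses keep the compiler from
   turning these loops back into calls to the routines under test.  */
static void fill (unsigned char *p, size_t n, unsigned char c)
{
  volatile unsigned char *v = p;
  for (size_t i = 0; i < n; i++)
    v[i] = c;
}

static void copy (unsigned char *d, const unsigned char *s, size_t n)
{
  volatile unsigned char *v = d;
  for (size_t i = 0; i < n; i++)
    v[i] = s[i];
}

static int same (const unsigned char *p, const unsigned char *q, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (p[i] != q[i])
      return 0;
  return 1;
}

static int sign (int x) { return (x > 0) - (x < 0); }

/* Returns the number of mismatches against the reference loops.  */
int run_string_checks (void)
{
  int failures = 0;

  for (size_t align = 0; align < MAX_ALIGN; align++)
    for (size_t len = 0; len <= MAX_LEN; len++)
      {
        size_t soff = (align * 5 + 3) % MAX_ALIGN;  /* vary relative alignment */
        unsigned char *d = dst + align, *s = src + soff;

        /* memset with a zero and a non-zero fill byte.  */
        for (int c = 0; c <= 0x5a; c += 0x5a)
          {
            fill (dst, BUF, 0xa5);
            fill (want, BUF, 0xa5);
            fill (want + align, len, (unsigned char) c);
            if (memset (d, c, len) != (void *) d || !same (dst, want, BUF))
              failures++;
          }

        /* memcpy from a differently aligned source.  */
        for (size_t i = 0; i < BUF; i++)
          src[i] = (unsigned char) (i * 7 + 1);
        fill (dst, BUF, 0xa5);
        fill (want, BUF, 0xa5);
        copy (want + align, s, len);
        if (memcpy (d, s, len) != (void *) d || !same (dst, want, BUF))
          failures++;

        /* strlen and strcmp on strings of exactly LEN bytes.  */
        fill (dst, BUF, 'x');
        fill (src, BUF, 'x');
        d[len] = '\0';
        s[len] = '\0';
        if (strlen ((char *) d) != len)
          failures++;
        if (strcmp ((char *) d, (char *) s) != 0)
          failures++;
        if (len > 0)
          {
            d[len / 2] = 'y';   /* first string now compares greater */
            if (sign (strcmp ((char *) d, (char *) s)) != 1
                || sign (strcmp ((char *) s, (char *) d)) != -1)
              failures++;
          }
      }
  return failures;
}

#ifdef STRING_CHECK_MAIN
int main (void)
{
  int failures = run_string_checks ();
  printf ("%d mismatches\n", failures);
  return failures != 0;
}
#endif
```

Built with `-DSTRING_CHECK_MAIN -fno-builtin` against the newly built libc
and run on a 405, 440 or 464, a non-zero exit status would point at one of
the optimized routines. Only the sign of the strcmp result is compared,
since the routines in the patch return the condition register contents
rather than a byte difference.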

I'd like to stress that it was authored by Todd Iglehart
<iglehart@us.ibm.com> and contributed by IBM.  Luis did the fixup and
authored the implies structure.

Ryan S. Arnold

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2010-12-13 20:26       ` Ryan Arnold
@ 2011-01-18 13:16         ` Ryan Arnold
  2011-01-25 21:32           ` Joseph S. Myers
  2012-01-18 20:31           ` acrux@cruxppc.org
  0 siblings, 2 replies; 19+ messages in thread
From: Ryan Arnold @ 2011-01-18 13:16 UTC (permalink / raw)
  To: libc-ports; +Cc: luisgpm, Todd Iglehart, Josh Boyer, rsa

On Mon, Dec 13, 2010 at 2:25 PM, Ryan Arnold <rsa@us.ibm.com> wrote:
> Sorry for the delinquent response.  This looks good to me and I think it
> should be checked in.
>
> I'd like for someone with a 405, 440, or 464 to test it further.  As far
> as we know the code only uses instructions available on all of these
> platforms.
>
> I'd like to stress that it was authored by Todd Iglehart
> <iglehart@us.ibm.com> and contributed by IBM.  Luis did the fixup and
> authored the implies structure.
>
> Ryan S. Arnold

I've checked this patch into glibc-ports under:

commit # a72cc2b29d00207fd8e2ee4612502339a14816b6

Just a general note on configuration; some of these processors have a
floating point unit but I believe all of them default to soft-fp.

GLIBC configure won't recognize --with-cpu=476fp even though the
compiler might recognize -mcpu=476fp.

If you want to configure a hard-fp build just pass --with-cpu=476
--with-fp instead and a new Makefile fragment will make sure that
-mhard-float is added to CFLAGS and ASFLAGS.

Ryan S. Arnold

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2011-01-18 13:16         ` Ryan Arnold
@ 2011-01-25 21:32           ` Joseph S. Myers
  2012-01-18 20:31           ` acrux@cruxppc.org
  1 sibling, 0 replies; 19+ messages in thread
From: Joseph S. Myers @ 2011-01-25 21:32 UTC (permalink / raw)
  To: Ryan Arnold; +Cc: libc-ports, luisgpm, Todd Iglehart, Josh Boyer, rsa

On Wed, 12 Jan 2011, Ryan Arnold wrote:

> I've checked this patch into glibc-ports under:
> 
> commit # a72cc2b29d00207fd8e2ee4612502339a14816b6

I've now moved the ChangeLog entry from the target-independent ChangeLog
to ChangeLog.powerpc, which is where changes to powerpc sysdeps files in
ports go.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2011-01-18 13:16         ` Ryan Arnold
  2011-01-25 21:32           ` Joseph S. Myers
@ 2012-01-18 20:31           ` acrux@cruxppc.org
  2012-01-19 19:35             ` Carlos O'Donell
  1 sibling, 1 reply; 19+ messages in thread
From: acrux@cruxppc.org @ 2012-01-18 20:31 UTC (permalink / raw)
  To: libc-ports


I just tried to build glibc-2.13 with "--with-cpu=440 --with-fp" on a Sam440ep[1]
(PPC440EP SoC [2]), but the build gets stuck at this point:

[...]
gcc -m32 -nostdlib -nostartfiles -o
/home/999/new/work/src/build32/sunrpc/rpcgen -mhard-float
-Wl,-dynamic-linker=/lib/ld.so.1   -Wl,-z,combreloc -Wl,-z,relro
-Wl,--hash-style=both /home/999/new/work/src/build32/csu/crt1.o
/home/999/new/work/src/build32/csu/crti.o `gcc -m32 -mhard-float
--print-file-name=crtbegin.o`
/home/999/new/work/src/build32/sunrpc/rpc_main.o
/home/999/new/work/src/build32/sunrpc/rpc_hout.o
/home/999/new/work/src/build32/sunrpc/rpc_cout.o
/home/999/new/work/src/build32/sunrpc/rpc_parse.o
/home/999/new/work/src/build32/sunrpc/rpc_scan.o
/home/999/new/work/src/build32/sunrpc/rpc_util.o
/home/999/new/work/src/build32/sunrpc/rpc_svcout.o
/home/999/new/work/src/build32/sunrpc/rpc_clntout.o
/home/999/new/work/src/build32/sunrpc/rpc_tblout.o
/home/999/new/work/src/build32/sunrpc/rpc_sample.o 
-Wl,-rpath-link=/home/999/new/work/src/build32:/home/999/new/work/src/build32/math:/home/999/new/work/src/build32/elf:/home/999/new/work/src/build32/dlfcn:/home/999/new/work/src/build32/nss:/home/999/new/work/src/build32/nis:/home/999/new/work/src/build32/rt:/home/999/new/work/src/build32/resolv:/home/999/new/work/src/build32/crypt:/home/999/new/work/src/build32/nptl
/home/999/new/work/src/build32/libc.so.6
/home/999/new/work/src/build32/libc_nonshared.a -Wl,--as-needed
/home/999/new/work/src/build32/elf/ld.so -Wl,--no-as-needed -lgcc
-Wl,--as-needed -lgcc_s  -Wl,--no-as-needed `gcc -m32 -mhard-float
--print-file-name=crtend.o` /home/999/new/work/src/build32/csu/crtn.o
gcc -m32 rpcinfo.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Winline
-Wwrite-strings -fmerge-all-constants -mcpu=440 -mcpu=powerpc -pipe
-mhard-float -mnew-mnemonics -Wstrict-prototypes -mlong-double-128    
-I../include -I/home/999/new/work/src/build32/sunrpc
-I/home/999/new/work/src/build32 -I../sysdeps/powerpc/powerpc32/elf
-I../sysdeps/powerpc/elf
-I../ports/sysdeps/unix/sysv/linux/powerpc/powerpc32/440
-I../ports/sysdeps/powerpc/powerpc32/440
-I../ports/sysdeps/powerpc/powerpc32/405
-I../sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu
-I../sysdeps/powerpc/powerpc32/fpu
-I../nptl/sysdeps/unix/sysv/linux/powerpc/powerpc32
-I../ports/sysdeps/unix/sysv/linux/powerpc/powerpc32
-I../sysdeps/unix/sysv/linux/powerpc/powerpc32
-I../nptl/sysdeps/unix/sysv/linux/powerpc
-I../ports/sysdeps/unix/sysv/linux/powerpc
-I../sysdeps/unix/sysv/linux/powerpc -I../sysdeps/ieee754/ldbl-128ibm
-I../sysdeps/ieee754/ldbl-opt -I../nptl/sysdeps/unix/sysv/linux
-I../nptl/sysdeps/pthread -I../sysdeps/pthread
-I../ports/sysdeps/unix/sysv/linux -I../sysdeps/unix/sysv/linux
-I../sysdeps/gnu -I../sysdeps/unix/common -I../sysdeps/unix/mman
-I../sysdeps/unix/inet -I../nptl/sysdeps/unix/sysv
-I../ports/sysdeps/unix/sysv -I../sysdeps/unix/sysv
-I../sysdeps/unix/powerpc -I../nptl/sysdeps/unix -I../ports/sysdeps/unix
-I../sysdeps/unix -I../sysdeps/posix -I../ports/sysdeps/powerpc/powerpc32
-I../sysdeps/powerpc/powerpc32 -I../sysdeps/wordsize-32
-I../sysdeps/powerpc/fpu -I../nptl/sysdeps/powerpc
-I../ports/sysdeps/powerpc -I../sysdeps/powerpc -I../sysdeps/ieee754/dbl-64
-I../sysdeps/ieee754/flt-32 -I../sysdeps/ieee754 -I../sysdeps/generic/elf
-I../sysdeps/generic -I../nptl -I../ports  -I.. -I../libio -I. -nostdinc
-isystem /usr/lib/gcc/powerpc-unknown-linux-gnu/4.5.3/include -isystem
/home/999/new/work/pkg/usr/include -D_LIBC_REENTRANT -include
../include/libc-symbols.h   -DNOT_IN_libc=1    -D_RPC_THREAD_SAFE_ -o
/home/999/new/work/src/build32/sunrpc/rpcinfo.o -MD -MP -MF
/home/999/new/work/src/build32/sunrpc/rpcinfo.o.dt -MT
/home/999/new/work/src/build32/sunrpc/rpcinfo.o
gcc -m32 -nostdlib -nostartfiles -o
/home/999/new/work/src/build32/sunrpc/rpcinfo -mhard-float
-Wl,-dynamic-linker=/lib/ld.so.1   -Wl,-z,combreloc -Wl,-z,relro
-Wl,--hash-style=both /home/999/new/work/src/build32/csu/crt1.o
/home/999/new/work/src/build32/csu/crti.o `gcc -m32 -mhard-float
--print-file-name=crtbegin.o`
/home/999/new/work/src/build32/sunrpc/rpcinfo.o 
-Wl,-rpath-link=/home/999/new/work/src/build32:/home/999/new/work/src/build32/math:/home/999/new/work/src/build32/elf:/home/999/new/work/src/build32/dlfcn:/home/999/new/work/src/build32/nss:/home/999/new/work/src/build32/nis:/home/999/new/work/src/build32/rt:/home/999/new/work/src/build32/resolv:/home/999/new/work/src/build32/crypt:/home/999/new/work/src/build32/nptl
/home/999/new/work/src/build32/libc.so.6
/home/999/new/work/src/build32/libc_nonshared.a -Wl,--as-needed
/home/999/new/work/src/build32/elf/ld.so -Wl,--no-as-needed -lgcc
-Wl,--as-needed -lgcc_s  -Wl,--no-as-needed `gcc -m32 -mhard-float
--print-file-name=crtend.o` /home/999/new/work/src/build32/csu/crtn.o
CPP='gcc -m32 -E -x c-header'  /home/999/new/work/src/build32/elf/ld.so.1
--library-path
/home/999/new/work/src/build32:/home/999/new/work/src/build32/math:/home/999/new/work/src/build32/elf:/home/999/new/work/src/build32/dlfcn:/home/999/new/work/src/build32/nss:/home/999/new/work/src/build32/nis:/home/999/new/work/src/build32/rt:/home/999/new/work/src/build32/resolv:/home/999/new/work/src/build32/crypt:/home/999/new/work/src/build32/nptl
/home/999/new/work/src/build32/sunrpc/rpcgen -Y ../scripts -c
rpcsvc/bootparam_prot.x -o
/home/999/new/work/src/build32/sunrpc/xbootparam_prot.T

Here is my config.log: http://cruxppc.org/~acrux/config.log
By contrast, I successfully built glibc-2.13 without "--with-cpu=440 --with-fp".
I'm using an updated CRUX PPC 2.7 (32-bit): gcc-4.5.3, binutils-2.21.1,
glibc-2.12.2



cheers,
--nico

[1] http://www.acube-systems.biz/index.php?page=hardware&pid=2
[2]
http://myapm.apm.com/MyAMCC/jsp/public/productDetail/product_detail.jsp?productID=PPC440EP


Ryan S. Arnold wrote:
> 
> On Mon, Dec 13, 2010 at 2:25 PM, Ryan Arnold <rsa@us.ibm.com> wrote:
>> Sorry for the delinquent response.  This looks good to me and I think it
>> should be checked in.
>>
>> I'd like for someone with a 405, 440, or 464 to test it further.  As far
>> as we know the code only uses instructions available on all of these
>> platforms.
>>
>> I'd like to stress that it was authored by Todd Iglehart
>> <iglehart@us.ibm.com> and contributed by IBM.  Luis did the fixup and
>> authored the implies structure.
>>
>> Ryan S. Arnold
> 
> I've checked this patch into glibc-ports under:
> 
> commit # a72cc2b29d00207fd8e2ee4612502339a14816b6
> 
> Just a general note on configuration; some of these processors have a
> floating point unit but I believe all of them default to soft-fp.
> 
> GLIBC configure won't recognize --with-cpu=476fp even though the
> compiler might recognize -mcpu=476fp.
> 
> If you want to configure a hard-fp build just pass --with-cpu=476
> --with-fp instead and a new Makefile fragment will make sure that
> -mhard-float is added to CFLAGS and ASFLAGS.
> 
> Ryan S. Arnold
> 
> 

-- 
View this message in context: http://old.nabble.com/-PATCH--powerpc%3A-405-440-464-476-support-and-optimizations-tp29607194p33163939.html
Sent from the Sourceware - libc-ports mailing list archive at Nabble.com.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2012-01-18 20:31           ` acrux@cruxppc.org
@ 2012-01-19 19:35             ` Carlos O'Donell
  2012-01-20 14:24               ` acrux
  0 siblings, 1 reply; 19+ messages in thread
From: Carlos O'Donell @ 2012-01-19 19:35 UTC (permalink / raw)
  To: acrux@cruxppc.org; +Cc: libc-ports

On Wed, Jan 18, 2012 at 3:30 PM, acrux@cruxppc.org <acrux@linuxmail.org> wrote:
>
> just tried to build glibc-2.13 "--with-cpu=440 --with-fp" on a Sam440ep[1]
> (PPC440EP SoC [2]) but it remains stuck in this point:
> CPP='gcc -m32 -E -x c-header'  /home/999/new/work/src/build32/elf/ld.so.1
> --library-path
> /home/999/new/work/src/build32:/home/999/new/work/src/build32/math:/home/999/new/work/src/build32/elf:/home/999/new/work/src/build32/dlfcn:/home/999/new/work/src/build32/nss:/home/999/new/work/src/build32/nis:/home/999/new/work/src/build32/rt:/home/999/new/work/src/build32/resolv:/home/999/new/work/src/build32/crypt:/home/999/new/work/src/build32/nptl
> /home/999/new/work/src/build32/sunrpc/rpcgen -Y ../scripts -c
> rpcsvc/bootparam_prot.x -o
> /home/999/new/work/src/build32/sunrpc/xbootparam_prot.T

This is the first use of the newly built dynamic loader.

A failure here means that the dynamic loader has not been correctly compiled.

You should debug this to figure out what is going wrong in the loader.

> Here my config.log: http://cruxppc.org/~acrux/config.log
> Instead i successfully built glibc-2.13 without "--with-cpu=440 --with-fp" .
> I'm using an updated CRUX PPC 2.7 (32bit): gcc-4.5.3, binutils-2.21.1,
> glibc-2.12.2

Have you tested your compiler? What were the test results?

Did you test binutils? What were the test results?

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2012-01-19 19:35             ` Carlos O'Donell
@ 2012-01-20 14:24               ` acrux
  2012-01-20 15:52                 ` Ryan S. Arnold
  0 siblings, 1 reply; 19+ messages in thread
From: acrux @ 2012-01-20 14:24 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: libc-ports

On Thu, 19 Jan 2012 14:34:50 -0500
"Carlos O'Donell" <carlos@systemhalted.org> wrote:

> On Wed, Jan 18, 2012 at 3:30 PM, acrux@cruxppc.org
> <acrux@linuxmail.org> wrote:
> >
> > just tried to build glibc-2.13 "--with-cpu=440 --with-fp" on a
> > Sam440ep[1] (PPC440EP SoC [2]) but it remains stuck at this point:
> > CPP='gcc -m32 -E -x c-header'
> >  /home/999/new/work/src/build32/elf/ld.so.1
> > --library-path
> > /home/999/new/work/src/build32:/home/999/new/work/src/build32/math:/home/999/new/work/src/build32/elf:/home/999/new/work/src/build32/dlfcn:/home/999/new/work/src/build32/nss:/home/999/new/work/src/build32/nis:/home/999/new/work/src/build32/rt:/home/999/new/work/src/build32/resolv:/home/999/new/work/src/build32/crypt:/home/999/new/work/src/build32/nptl
> > /home/999/new/work/src/build32/sunrpc/rpcgen -Y ../scripts -c
> > rpcsvc/bootparam_prot.x -o
> > /home/999/new/work/src/build32/sunrpc/xbootparam_prot.T
> 
> This is the first use of the newly built dynamic loader.
> 
> A failure here means that the dynamic loader has not been correctly
> compiled.
> 
> You should debug this to figure out what is going wrong in the loader.
> 

I know, but I have no resources and, I guess, not enough skill to debug
and fix it.

As I received a board with a 440EP SoC, I tested a build
"--with-cpu=440 --with-fp" just for fun.
I'd like to know whether somebody has really tested these features.

> > Here my config.log: http://cruxppc.org/~acrux/config.log
> > Instead i successfully built glibc-2.13 without "--with-cpu=440
> > --with-fp" . I'm using an updated CRUX PPC 2.7 (32bit): gcc-4.5.3,
> > binutils-2.21.1, glibc-2.12.2
> 
> Have you tested your compiler? What were the test results?
> 
> Did you test binutils? What were the test results?
> 

They finished their testsuites with only the well-known failures for
their respective releases, so I can say they are good.

best,
--nico
-- 
GNU/Linux on Power Architecture
CRUX PPC - http://cruxppc.org/


* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2012-01-20 14:24               ` acrux
@ 2012-01-20 15:52                 ` Ryan S. Arnold
  2012-01-20 18:03                   ` Carlos O'Donell
  2012-01-23  0:41                   ` acrux
  0 siblings, 2 replies; 19+ messages in thread
From: Ryan S. Arnold @ 2012-01-20 15:52 UTC (permalink / raw)
  To: acrux; +Cc: Carlos O'Donell, libc-ports

On Fri, Jan 20, 2012 at 8:24 AM, acrux <acrux_it@libero.it> wrote:
> On Thu, 19 Jan 2012 14:34:50 -0500
> "Carlos O'Donell" <carlos@systemhalted.org> wrote:
>
>> On Wed, Jan 18, 2012 at 3:30 PM, acrux@cruxppc.org
>> <acrux@linuxmail.org> wrote:
>> >
>> > just tried to build glibc-2.13 "--with-cpu=440 --with-fp" on a
>> > Sam440ep[1] (PPC440EP SoC [2]) but it remains stuck at this point:
>> > CPP='gcc -m32 -E -x c-header'
>> >  /home/999/new/work/src/build32/elf/ld.so.1
>> > --library-path
>> > /home/999/new/work/src/build32:/home/999/new/work/src/build32/math:/home/999/new/work/src/build32/elf:/home/999/new/work/src/build32/dlfcn:/home/999/new/work/src/build32/nss:/home/999/new/work/src/build32/nis:/home/999/new/work/src/build32/rt:/home/999/new/work/src/build32/resolv:/home/999/new/work/src/build32/crypt:/home/999/new/work/src/build32/nptl
>> > /home/999/new/work/src/build32/sunrpc/rpcgen -Y ../scripts -c
>> > rpcsvc/bootparam_prot.x -o
>> > /home/999/new/work/src/build32/sunrpc/xbootparam_prot.T
>>
>> This is the first use of the newly built dynamic loader.
>>
>> A failure here means that the dynamic loader has not been correctly
>> compiled.
>>
>> You should debug this to figure out what is going wrong in the loader.
>>
>
> I know, but I have no resources and, I guess, not enough skill to debug
> and fix it.
>
> As I received a board with a 440EP SoC, I tested a build
> "--with-cpu=440 --with-fp" just for fun.
> I'd like to know whether somebody has really tested these features.
>
>> > Here my config.log: http://cruxppc.org/~acrux/config.log
>> > Instead i successfully built glibc-2.13 without "--with-cpu=440
>> > --with-fp" . I'm using an updated CRUX PPC 2.7 (32bit): gcc-4.5.3,
>> > binutils-2.21.1, glibc-2.12.2
>>
>> Have you tested your compiler? What were the test results?
>>
>> Did you test binutils? What were the test results?
>>
>
> They finished their testsuites with only the well-known failures for
> their respective releases, so I can say they are good.

Integration can often expose problems, but I suspect something else is
going on here...

Does the machine you're building this toolchain on understand the
440fp instruction set?  If not then the loader is likely encountering
a sigill.

The solution to this situation is to enable cross compiling prior to
running configure.  This will prevent the build from attempting to run
any of the code.

echo "cross-compiling=yes" >> configparms
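A minimal sketch of that step, assuming configparms lives in the build
directory (build32 in the log above) and is written before configure is
run there. The helper name enable_cross_compiling is hypothetical, not
part of glibc:

```shell
#!/bin/bash
# Hypothetical helper: record cross-compiling=yes in the build directory's
# configparms file, exactly once, before configure is run in that directory.
enable_cross_compiling() {
    local builddir="$1"
    mkdir -p "$builddir" || return 1
    # Append the setting only if it is not already present.
    if ! grep -qx 'cross-compiling=yes' "$builddir/configparms" 2>/dev/null; then
        echo 'cross-compiling=yes' >> "$builddir/configparms"
    fi
}
```

It would be called as enable_cross_compiling /home/999/new/work/src/build32
and followed by the usual configure invocation from inside that directory.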

If that's not the problem I would suggest trying to build using 440 without fp.

If that works then I suspect that the problem is related to the string
routine optimizations that one of my guys put in for the 476
processor.  It was requested that we provide it to the entire 4xx
series since the instructions used (allegedly) weren't unique to the
476.

It'd be interesting to run a debugger against the loader at this point
and identify whether you're encountering a sigill or a sigsegv.
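
Short of a full debugger session, the exit status alone can separate a
hang from a fatal signal. The sketch below is only an illustration: it
assumes the coreutils timeout command is installed (it exits with 124
when the time limit expires), and classify_run is a hypothetical helper
name.

```shell
#!/bin/bash
# Hypothetical helper: run a command under a time limit and report whether
# it hung, died from a signal, or exited normally.
classify_run() {
    local limit="$1"; shift
    local status=0
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    if [ "$status" -eq 124 ]; then
        # coreutils timeout reports an expired time limit as 124.
        echo "hang"
    elif [ "$status" -gt 128 ]; then
        # Statuses above 128 encode 128 + signal number.
        echo "signal $(kill -l $((status - 128)))"
    else
        echo "exit $status"
    fi
}
```

For the failing step this would be something like
classify_run 30 env CPP='gcc -m32 -E -x c-header' .../elf/ld.so.1 ...
with the arguments from the build log; "signal ILL" or "signal SEGV"
would point at a crash, "hang" at a process that never finishes.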

Ryan S. Arnold


* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2012-01-20 15:52                 ` Ryan S. Arnold
@ 2012-01-20 18:03                   ` Carlos O'Donell
  2012-01-23  0:41                   ` acrux
  1 sibling, 0 replies; 19+ messages in thread
From: Carlos O'Donell @ 2012-01-20 18:03 UTC (permalink / raw)
  To: Ryan S. Arnold; +Cc: acrux, libc-ports

On Fri, Jan 20, 2012 at 10:52 AM, Ryan S. Arnold <ryan.arnold@gmail.com> wrote:
> Does the machine you're building this toolchain on understand the
> 440fp instruction set?  If not then the loader is likely encountering
> a sigill.

There are some known problems in this area which Mentor Graphics ESD
has been fixing and submitting upstream.

They mainly have to do with the graphics and string instructions, which
aren't uniformly supported; gcc doesn't know when not to use some of
them.

If your goal is to succeed at compiling things for your target then I
would start here:
http://www.mentor.com/embedded-software/sourcery-tools/sourcery-codebench/platforms/power-gnulinux

or just command-line tools for free, click under "Power Architecture
Processors"->"Download the GNU/Linux release"
http://www.mentor.com/embedded-software/sourcery-tools/sourcery-codebench/editions/lite-edition/

They are all based on the open-source tools but with enhancements to
fix problems like the one you ran into.

I'm part of the team that produces these toolchains so I'm biased :-)

Cheers,
Carlos.


* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2012-01-20 15:52                 ` Ryan S. Arnold
  2012-01-20 18:03                   ` Carlos O'Donell
@ 2012-01-23  0:41                   ` acrux
  2012-01-23 15:48                     ` Ryan S. Arnold
  1 sibling, 1 reply; 19+ messages in thread
From: acrux @ 2012-01-23  0:41 UTC (permalink / raw)
  To: Ryan S. Arnold; +Cc: Carlos O'Donell, libc-ports

On Fri, 20 Jan 2012 09:52:22 -0600
"Ryan S. Arnold" <ryan.arnold@gmail.com> wrote:

[snip]
> 
> Integration can often expose problems, but I suspect something else is
> going on here...
> 
> Does the machine you're building this toolchain on understand the
> 440fp instruction set?  If not then the loader is likely encountering
> a sigill.
> 
> The solution to this situation is to enable cross compiling prior to
> running configure.  This will prevent the build from attempting to run
> any of the code.
> 
> echo "cross-compiling=yes" >> configparms
> 
> If that's not the problem I would suggest trying to build using 440 without fp.
> 

hi,
building without fp makes no difference; I still have the same problem.
Just to be sure about the compiler[1], I simply removed those seven
files (the optimized routines) and then built it successfully with
"--with-cpu=440 --with-fp".  Indeed, I have always been able to build
everything, even with very aggressive CFLAGS[2], without any kind of
issue.
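
Removing all seven files at once shows that the optimized routines are
involved, but not which one. A minimal sketch of setting them aside one
at a time follows. It assumes the files are the seven .S files that the
patch's ChangeLog lists under sysdeps/powerpc/powerpc32/405 (the path in
a given source tree may differ), and the helper names set_aside and
put_back are hypothetical:

```shell
#!/bin/bash
# The seven routines the patch's ChangeLog lists under powerpc32/405.
ROUTINES="memcmp memcpy memset strcmp strcpy strlen strncmp"

# Hypothetical helper: move one optimized routine out of the sysdeps
# directory so the next build should fall back to another implementation.
set_aside() {
    local srcdir="$1" backup="$2" name="$3"
    mkdir -p "$backup" || return 1
    [ -f "$srcdir/$name.S" ] && mv "$srcdir/$name.S" "$backup/$name.S"
}

# Hypothetical helper: put a routine back once it has been cleared.
put_back() {
    local srcdir="$1" backup="$2" name="$3"
    [ -f "$backup/$name.S" ] && mv "$backup/$name.S" "$srcdir/$name.S"
}
```

Looping over $ROUTINES, setting one file aside, rebuilding and then
putting it back should narrow the hang down to a single routine.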


> If that works then I suspect that the problem is related to the string
> routine optimizations that one of my guys put in for the 476
> processor.  It was requested that we provide it to the entire 4xx
> series since the instructions used (allegedly) weren't unique to the
> 476.
> 
> It'd be interesting to run a debugger against the loader at this point
> and identify whether you're encountering a sigill or a sigsegv.
> 

btw, I guess it is a SIGSEGV, because it simply gets stuck there and the CPU goes idle.
If you need it, I can give you ssh access to this little board [3].


best,
--nico


[1]
binutils-2.21.1, gcc-4.5.3, glibc-2.12.2
gmp-5.0.2-2, mpfr-3.1.0-p3, mpc-0.9
ppl-0.11.2, cloog-ppl-0.15.11

[2]
"-O3 -mcpu=440fp -mmulhw -mdlmzb -pipe -fsigned-char -mpowerpc-gfxopt -fpeel-loops -ftracer -fgraphite-identity -floop-parallelize-all -ftree-loop-linear -ftree-loop-distribution -funroll-loops -floop-interchange -floop-strip-mine -floop-block"

[3]
root@sam4x0:~# uname -a
Linux sam4x0 3.1.10 #1 PREEMPT Fri Jan 20 21:30:46 CET 2012 ppc 440EP Rev. C GNU/Linux
root@sam4x0:~# lscpu
Architecture:          ppc
Byte Order:            Big Endian
CPU(s):                1
On-line CPU(s) list:   0
Model:                 acube,sam440ep
BogoMIPS:              1333.33
L1d cache:             32K
L1i cache:             32K
root@sam4x0:~# cat /proc/cpuinfo
processor       : 0
cpu             : 440EP Rev. C
clock           : 666.666660MHz
revision        : 24.212 (pvr 4222 18d4)
bogomips        : 1333.33
timebase        : 666666660
platform        : Sam440ep
model           : acube,sam440ep
Memory          : 1023 MB
root@sam4x0:~# lspci
00:00.0 Bridge: IBM Device 027f
00:0a.0 PCI bridge: Pericom Semiconductor PCI to PCI Bridge (rev 02)
00:0c.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI RV280 [Radeon 9200 PRO] (rev 01)
00:0c.1 Display controller: Advanced Micro Devices [AMD] nee ATI RV280 [Radeon 9200 PRO] (Secondary) (rev 01)
00:0e.0 Mass storage controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
01:04.0 Multimedia audio controller: Cirrus Logic Crystal CS4281 PCI Audio (rev 01)
01:05.0 USB controller: NEC Corporation USB (rev 43)
01:05.1 USB controller: NEC Corporation USB (rev 43)
01:05.2 USB controller: NEC Corporation USB 2.0 (rev 04)


-- 
acrux <acrux@cruxppc.org>


* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2012-01-23  0:41                   ` acrux
@ 2012-01-23 15:48                     ` Ryan S. Arnold
  2012-01-24 16:47                       ` acrux
  0 siblings, 1 reply; 19+ messages in thread
From: Ryan S. Arnold @ 2012-01-23 15:48 UTC (permalink / raw)
  To: acrux; +Cc: Carlos O'Donell, libc-ports

On Sun, Jan 22, 2012 at 6:42 PM, acrux <acrux_it@libero.it> wrote:
>> If that works then I suspect that the problem is related to the string
>> routine optimizations that one of my guys put in for the 476
>> processor.  It was requested that we provide it to the entire 4xx
>> series since the instructions used (allegedly) weren't unique to the
>> 476.
>>
>> It'd be interesting to run a debugger against the loader at this point
>> and identify whether you're encountering a sigill or a sigsegv.
>>
>
> btw, I guess it is a SIGSEGV, because it simply gets stuck there and the CPU goes idle.
> If you need it, I can give you ssh access to this little board [3].

Hi Nico, I think you should debug the loader and get a backtrace.
You'll use the instructions here:

http://sourceware.org/glibc/wiki/Debugging/Loader_Debugging#Debugging_With_An_Alternate_Loader

I've made it easy for you.  Here are two scripts with your build
directory already embedded.  What you need to do is invoke GDB using
the new loader if you can (so that you don't get library mismatching)
and then tell GDB (with the .gdb script) to debug the loader.

Here's the .gdb script you'll use:

rpcgen.gdb:
------------------------------------
set environment CPP gcc -m32 -E -x c-header
break _dl_main_dispatch
run --library-path \
/home/999/new/work/src/build32/:\
/home/999/new/work/src/build32/nptl:\
/home/999/new/work/src/build32/math:\
/home/999/new/work/src/build32/elf:\
/home/999/new/work/src/build32/dlfcn:\
/home/999/new/work/src/build32/nss:\
/home/999/new/work/src/build32/nis:\
/home/999/new/work/src/build32/rt:\
/home/999/new/work/src/build32/resolv:\
/home/999/new/work/src/build32/crypt:\
/home/999/new/work/src/build32/nptl:\
/home/999/new/work/src/build32/nptl_db \
/home/999/new/work/src/build32/sunrpc/rpcgen -Y \
/home/999/new/work/src/scripts -c \
/home/999/new/work/src/build32/sunrpc/rpcsvc/bootparam_prot.x -o \
/home/999/new/work/src/build32/sunrpc/xbootparam_prot.T


Here's the shell script which will invoke GDB:

debug_rpcgen.sh:
-------------------------------------
#!/bin/bash

ulimit -c unlimited
GLIBC="/home/999/new/work/src/build32/"

CPP='gcc -m32 -E -x c-header' \
${GLIBC}/elf/ld.so.1 --library-path \
${GLIBC}:\
${GLIBC}/math:\
${GLIBC}/elf:\
${GLIBC}/dlfcn:\
${GLIBC}/nss:\
${GLIBC}/nis:\
${GLIBC}/rt:\
${GLIBC}/resolv:\
${GLIBC}/crypt:\
${GLIBC}/nptl:\
${GLIBC}/nptl_db \
/usr/bin/gdb -x rpcgen.gdb /home/999/new/work/src/build32/elf/ld.so.1

So try running debug_rpcgen.sh first.

If it works, GDB should stop in the loader at _dl_main_dispatch.  You
can simply (gdb) continue at this point and the loader should crash,
whereupon gdb will trap and show you where it crashed and why (segfault
or sigill).

If debug_rpcgen.sh crashes immediately without GDB coming up it means
that the loader itself is crashing in the string routines (which is
the most likely scenario).  If that is the case you should try running
the debugger with the system loader instead.  You may get some library
mismatch warnings but do the following:

CPP='gcc -m32 -E -x c-header' /usr/bin/gdb -x rpcgen.gdb \
/home/999/new/work/src/build32/elf/ld.so.1

If it successfully traps in _dl_main_dispatch do what I mentioned
above to see where the loader is crashing.

Let me know where/why it is crashing.

Ryan S. Arnold


* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2012-01-23 15:48                     ` Ryan S. Arnold
@ 2012-01-24 16:47                       ` acrux
  2012-01-24 17:20                         ` Ryan S. Arnold
  0 siblings, 1 reply; 19+ messages in thread
From: acrux @ 2012-01-24 16:47 UTC (permalink / raw)
  To: Ryan S. Arnold; +Cc: libc-ports

On Mon, 23 Jan 2012 09:48:19 -0600
"Ryan S. Arnold" <ryan.arnold@gmail.com> wrote:

> On Sun, Jan 22, 2012 at 6:42 PM, acrux <acrux_it@libero.it> wrote:
> >> If that works then I suspect that the problem is related to the string
> >> routine optimizations that one of my guys put in for the 476
> >> processor.  It was requested that we provide it to the entire 4xx
> >> series since the instructions used (allegedly) weren't unique to the
> >> 476.
> >>
> >> It'd be interesting to run a debugger against the loader at this point
> >> and identify whether you're encountering a sigill or a sigsegv.
> >>
> >
> > btw, I guess it is a SIGSEGV, because it simply gets stuck there and the CPU goes idle.
> > If you need it, I can give you ssh access to this little board [3].
> 
> Hi Nico, I think you should debug the loader and get a backtrace.
> You'll use the instructions here:
> 
> http://sourceware.org/glibc/wiki/Debugging/Loader_Debugging#Debugging_With_An_Alternate_Loader
> 
> I've made it easy for you.  Here's two scripts with your build
> directory already embedded.  What you need to do is invoke GDB using
> the new loader if you can (so that you don't get library mismatching)
> and then tell GDB (with the .gdb script) to debug the loader.
> 
> Here's the .gdb script you'll use:
> 
> rpcgen.gdb:
> ------------------------------------
> set environment CPP gcc -m32 -E -x c-header
> break _dl_main_dispatch
> run --library-path \
> /home/999/new/work/src/build32/:\
> /home/999/new/work/src/build32/nptl:\
> /home/999/new/work/src/build32/math:\
> /home/999/new/work/src/build32/elf:\
> /home/999/new/work/src/build32/dlfcn:\
> /home/999/new/work/src/build32/nss:\
> /home/999/new/work/src/build32/nis:\
> /home/999/new/work/src/build32/rt:\
> /home/999/new/work/src/build32/resolv:\
> /home/999/new/work/src/build32/crypt:\
> /home/999/new/work/src/build32/nptl:\
> /home/999/new/work/src/build32/nptl_db \
> /home/999/new/work/src/build32/sunrpc/rpcgen -Y \
> /home/999/new/work/src/scripts -c \
> /home/999/new/work/src/build32/sunrpc/rpcsvc/bootparam_prot.x -o \
> /home/999/new/work/src/build32/sunrpc/xbootparam_prot.T
> 
> 
> Here's the shell script which will invoke GDB:
> 
> debug_rpcgen.sh:
> -------------------------------------
> #!/bin/bash
> 
> ulimit -c unlimited
> GLIBC="/home/999/new/work/src/build32/"
> 
> CPP='gcc -m32 -E -x c-header' \
> ${GLIBC}/elf/ld.so.1 --library-path \
> ${GLIBC}:\
> ${GLIBC}/math:\
> ${GLIBC}/elf:\
> ${GLIBC}/dlfcn:\
> ${GLIBC}/nss:\
> ${GLIBC}/nis:\
> ${GLIBC}/rt:\
> ${GLIBC}/resolv:\
> ${GLIBC}/crypt:\
> ${GLIBC}/nptl:\
> ${GLIBC}/nptl_db \
> /usr/bin/gdb -x rpcgen.gdb /home/999/new/work/src/build32/elf/ld.so.1
> 
> So try running debug_rpcgen.sh first.
> 

hi Ryan,
thanks. I just made some minor fixes to your scripts so that they have the correct paths for my build:

# cat > rpcgen.gdb << "EOF"
set environment CPP gcc -m32 -E -x c-header
break _dl_main_dispatch
run --library-path \
/home/999/new/work/src/build32/:\
/home/999/new/work/src/build32/nptl:\
/home/999/new/work/src/build32/math:\
/home/999/new/work/src/build32/elf:\
/home/999/new/work/src/build32/dlfcn:\
/home/999/new/work/src/build32/nss:\
/home/999/new/work/src/build32/nis:\
/home/999/new/work/src/build32/rt:\
/home/999/new/work/src/build32/resolv:\
/home/999/new/work/src/build32/crypt:\
/home/999/new/work/src/build32/nptl:\
/home/999/new/work/src/build32/nptl_db \
/home/999/new/work/src/build32/sunrpc/rpcgen -Y \
/home/999/new/work/src/glibc-2.13/scripts -c \
/home/999/new/work/src/glibc-2.13/sunrpc/rpcsvc/bootparam_prot.x -o \
/home/999/new/work/src/build32/sunrpc/xbootparam_prot.T
EOF

# cat > debug_rpcgen.sh << "EOF"
#!/bin/bash

ulimit -c unlimited
GLIBC="/home/999/new/work/src/build32/"

CPP='gcc -m32 -E -x c-header' \
${GLIBC}/elf/ld.so.1 --library-path \
${GLIBC}:\
${GLIBC}/math:\
${GLIBC}/elf:\
${GLIBC}/dlfcn:\
${GLIBC}/nss:\
${GLIBC}/nis:\
${GLIBC}/rt:\
${GLIBC}/resolv:\
${GLIBC}/crypt:\
${GLIBC}/nptl:\
${GLIBC}/nptl_db \
/usr/bin/gdb -x rpcgen.gdb /home/999/new/work/src/build32/elf/ld.so.1
EOF
 
> If it works, GDB should stop in the loader at _dl_main_dispatch.  You
> can simply (gdb) continue at this point and the loader should crash,
> whereupon gdb will trap and show you where it crashed and why (segfault
> or sigill).
> 
> If debug_rpcgen.sh crashes immediately without GDB coming up it means
> that the loader itself is crashing in the string routines (which is

That's right, it simply gets stuck.

> the most likely scenario).  If that is the case you should try running
> the debugger with the system loader instead.  You may get some library
> mismatch warnings but do the following:
> 
> CPP='gcc -m32 -E -x c-header' /usr/bin/gdb -x rpcgen.gdb \
> /home/999/new/work/src/build32/elf/ld.so.1
> 

# CPP='gcc -m32 -E -x -c-header' /usr/bin/gdb -x rpcgen.gdb /home/999/new/work/src/build32/elf/ld.so.1
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "powerpc-unknown-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/999/new/work/src/build32/elf/ld.so.1...(no debugging symbols found)...done.
Breakpoint 1 at 0x163c0

Breakpoint 1, 0x204b73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
(gdb) info proc mapping
process 2575
cmdline = '/home/999/new/work/src/build32/elf/ld.so.1'
cwd = '/home/999/ryan'
exe = '/home/999/new/work/src/build32/elf/ld.so'
Mapped address spaces:

        Start Addr   End Addr       Size     Offset objfile
          0x100000   0x102000     0x2000          0           [vdso]
         0xfe82000  0xffd8000   0x156000          0      /home/999/new/work/src/build32/libc.so
         0xffd8000  0xffe8000    0x10000   0x156000      /home/999/new/work/src/build32/libc.so
         0xffe8000  0xffea000     0x2000   0x156000      /home/999/new/work/src/build32/libc.so
         0xffea000  0xffed000     0x3000   0x158000      /home/999/new/work/src/build32/libc.so
         0xffed000  0xfff0000     0x3000          0
        0x10000000 0x10014000    0x14000          0      /home/999/new/work/src/build32/sunrpc/rpcgen
        0x10023000 0x10024000     0x1000    0x13000      /home/999/new/work/src/build32/sunrpc/rpcgen
        0x10024000 0x10025000     0x1000    0x14000      /home/999/new/work/src/build32/sunrpc/rpcgen
        0x204a1000 0x204bf000    0x1e000          0      /home/999/new/work/src/build32/elf/ld.so
        0x204ce000 0x204cf000     0x1000    0x1d000      /home/999/new/work/src/build32/elf/ld.so
        0x204cf000 0x204d1000     0x2000    0x1e000      /home/999/new/work/src/build32/elf/ld.so
        0x48000000 0x48002000     0x2000          0
        0xbffdf000 0xc0000000    0x21000          0           [stack]
(gdb) bt
#0  0x204b73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
#1  0x00000000 in ?? ()
(gdb) continue
Continuing.

well... now it is stuck, and I have to press Ctrl-C

^C
Program received signal SIGINT, Interrupt.
0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
(gdb) bt
#0  0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
#1  0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
#2  0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
#3  0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
#4  0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
#5  0x00000000 in ?? ()
(gdb) quit
A debugging session is active.

        Inferior 1 [process 2575] will be killed.

Quit anyway? (y or n) y


> If it successfully traps in _dl_main_dispatch do what I mentioned
> above to see where the loader is crashing.
> 
> Let me know where/why it is crashing.
> 

that's all folks!

best,
--nico
-- 
acrux <acrux@cruxppc.org>


* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2012-01-24 16:47                       ` acrux
@ 2012-01-24 17:20                         ` Ryan S. Arnold
  2012-01-24 17:41                           ` acrux
  0 siblings, 1 reply; 19+ messages in thread
From: Ryan S. Arnold @ 2012-01-24 17:20 UTC (permalink / raw)
  To: acrux; +Cc: libc-ports

On Tue, Jan 24, 2012 at 10:48 AM, acrux <acrux_it@libero.it> wrote:
> Program received signal SIGINT, Interrupt.
> 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> (gdb) bt
> #0  0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> #1  0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
> #2  0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
> #3  0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
> #4  0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
> #5  0x00000000 in ?? ()

Wow, that is not what I expected at all...

I can't imagine that there are other threads at this point but ....
(gdb) info threads

And if there are, please dump the thread backtrace.

Ryan


* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2012-01-24 17:20                         ` Ryan S. Arnold
@ 2012-01-24 17:41                           ` acrux
  2012-01-24 17:59                             ` Ryan S. Arnold
  0 siblings, 1 reply; 19+ messages in thread
From: acrux @ 2012-01-24 17:41 UTC (permalink / raw)
  To: Ryan S. Arnold; +Cc: libc-ports

On Tue, 24 Jan 2012 11:19:54 -0600
"Ryan S. Arnold" <ryan.arnold@gmail.com> wrote:

> On Tue, Jan 24, 2012 at 10:48 AM, acrux <acrux_it@libero.it> wrote:
> > Program received signal SIGINT, Interrupt.
> > 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> > (gdb) bt
> > #0  0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> > #1  0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
> > #2  0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
> > #3  0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
> > #4  0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
> > #5  0x00000000 in ?? ()
> 
> Wow, that is not what I expected at all...
> 
> I can't imagine that there are other threads at this point but ....
> (gdb) info threads
> 
> And if there are, please dump the thread backtrace.
> 


root@sam4x0:/home/999/ryan# CPP='gcc -m32 -E -x -c-header' /usr/bin/gdb -x rpcgen.gdb /home/999/new/work/src/build32/elf/ld.so.1
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "powerpc-unknown-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/999/new/work/src/build32/elf/ld.so.1...(no debugging symbols found)...done.
Breakpoint 1 at 0x163c0

Breakpoint 1, 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
(gdb) info threads
  Id   Target Id         Frame
* 1    process 2609 "ld.so.1" 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
(gdb) bt
#0  0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
#1  0x00000000 in ?? ()
(gdb) continue
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
(gdb) bt
#0  0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
#1  0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
#2  0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
#3  0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
#4  0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
#5  0x00000000 in ?? ()
(gdb) info threads
  Id   Target Id         Frame
* 1    process 2609 "ld.so.1" 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
(gdb) thread apply all bt full

Thread 1 (process 2609):
#0  0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
No symbol table info available.
#1  0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
No symbol table info available.
#2  0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
No symbol table info available.
#3  0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
No symbol table info available.
#4  0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
No symbol table info available.
#5  0x00000000 in ?? ()
No symbol table info available.
(gdb) q
A debugging session is active.

        Inferior 1 [process 2609] will be killed.

Quit anyway? (y or n) y



--nico
-- 
acrux <acrux_it@libero.it>


* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2012-01-24 17:41                           ` acrux
@ 2012-01-24 17:59                             ` Ryan S. Arnold
  2012-02-18  2:06                               ` acrux
  0 siblings, 1 reply; 19+ messages in thread
From: Ryan S. Arnold @ 2012-01-24 17:59 UTC (permalink / raw)
  To: acrux; +Cc: libc-ports

On Tue, Jan 24, 2012 at 11:43 AM, acrux <acrux_it@libero.it> wrote:
> On Tue, 24 Jan 2012 11:19:54 -0600
> "Ryan S. Arnold" <ryan.arnold@gmail.com> wrote:
>
>> On Tue, Jan 24, 2012 at 10:48 AM, acrux <acrux_it@libero.it> wrote:
>> > Program received signal SIGINT, Interrupt.
>> > 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
>> > (gdb) bt
>> > #0  0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
>> > #1  0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
>> > #2  0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
>> > #3  0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
>> > #4  0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
>> > #5  0x00000000 in ?? ()
>>
>> Wow, that is not what I expected at all...
>>
>> I can't imagine that there are other threads at this point but ....
>> (gdb) info threads
>>
>> And if there are, please dump the thread backtrace.
>>
>
>
> root@sam4x0:/home/999/ryan# CPP='gcc -m32 -E -x -c-header' /usr/bin/gdb -x rpcgen.gdb /home/999/new/work/src/build32/elf/ld.so.1
> GNU gdb (GDB) 7.3.1
> Copyright (C) 2011 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "powerpc-unknown-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /home/999/new/work/src/build32/elf/ld.so.1...(no debugging symbols found)...done.
> Breakpoint 1 at 0x163c0
>
> Breakpoint 1, 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
> (gdb) info threads
>  Id   Target Id         Frame
> * 1    process 2609 "ld.so.1" 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
> (gdb) bt
> #0  0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
> #1  0x00000000 in ?? ()
> (gdb) continue
> Continuing.
> ^C
> Program received signal SIGINT, Interrupt.
> 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> (gdb) bt
> #0  0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> #1  0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
> #2  0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
> #3  0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
> #4  0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
> #5  0x00000000 in ?? ()
> (gdb) info threads
>  Id   Target Id         Frame
> * 1    process 2609 "ld.so.1" 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> (gdb) thread apply all bt full

I wonder if one of the string routine calls in the loader wrote past
its bounds and ended up overwriting that lock, which would explain why
the wait hangs.

Can you run (gdb) info frame and try to figure out what the value of
the futex is when it's blocking?
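
To make the suspicion concrete, here is a minimal sketch of how a copy
that runs a few bytes too long can leave a neighbouring lock word
nonzero, so that a later lock attempt sees it as held. The struct
layout and the names demo_region, buggy_copy and lock_after_copy are
hypothetical and only illustrate the idea; this is not the loader's or
libc's actual data layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: a small string buffer followed by a lock word.
   This is an assumption for illustration, not glibc's real layout.  */
struct demo_region
{
  char name[8];
  int lock;			/* 0 = free, nonzero = held.  */
};

/* Copy LEN bytes of SRC over the start of the region, treating the
   whole struct as raw bytes.  A string routine that miscounts its
   length behaves like a call with LEN > sizeof r->name.  */
static void
buggy_copy (struct demo_region *r, const char *src, size_t len)
{
  assert (len <= sizeof *r);
  memcpy ((unsigned char *) r, src, len);
}

/* Return the lock word as seen after copying LEN bytes of 'A'.  */
static int
lock_after_copy (size_t len)
{
  struct demo_region r;
  char src[sizeof (struct demo_region)];

  memset (&r, 0, sizeof r);	/* Lock starts out free.  */
  memset (src, 'A', sizeof src);
  buggy_copy (&r, src, len);
  return r.lock;
}
```

A copy of exactly sizeof name bytes leaves the lock at 0, while a copy
of the full struct size turns it into a nonzero pattern. If something
similar happened here, the lock word would hold an unexpected value
when examined in gdb (for example with x/wx on its address) rather
than a plain 0, 1 or 2.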

Ryan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
  2012-01-24 17:59                             ` Ryan S. Arnold
@ 2012-02-18  2:06                               ` acrux
  0 siblings, 0 replies; 19+ messages in thread
From: acrux @ 2012-02-18  2:06 UTC (permalink / raw)
  To: Ryan S. Arnold; +Cc: libc-ports

On Tue, 24 Jan 2012 11:59:31 -0600
"Ryan S. Arnold" <ryan.arnold@gmail.com> wrote:

> On Tue, Jan 24, 2012 at 11:43 AM, acrux <acrux_it@libero.it> wrote:
> > On Tue, 24 Jan 2012 11:19:54 -0600
> > "Ryan S. Arnold" <ryan.arnold@gmail.com> wrote:
> >
> >> On Tue, Jan 24, 2012 at 10:48 AM, acrux <acrux_it@libero.it> wrote:
> >> > Program received signal SIGINT, Interrupt.
> >> > 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> >> > (gdb) bt
> >> > #0  0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> >> > #1  0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
> >> > #2  0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
> >> > #3  0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
> >> > #4  0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
> >> > #5  0x00000000 in ?? ()
> >>
> >> Wow, that is not what I expected at all...
> >>
> >> I can't imagine that there are other threads at this point but ....
> >> (gdb) info threads
> >>
> >> And if there are, please dump the thread backtrace.
> >>
> >
> >
> > root@sam4x0:/home/999/ryan# CPP='gcc -m32 -E -x -c-header' /usr/bin/gdb -x rpcgen.gdb /home/999/new/work/src/build32/elf/ld.so.1
> > GNU gdb (GDB) 7.3.1
> > Copyright (C) 2011 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> > and "show warranty" for details.
> > This GDB was configured as "powerpc-unknown-linux-gnu".
> > For bug reporting instructions, please see:
> > <http://www.gnu.org/software/gdb/bugs/>...
> > Reading symbols from /home/999/new/work/src/build32/elf/ld.so.1...(no debugging symbols found)...done.
> > Breakpoint 1 at 0x163c0
> >
> > Breakpoint 1, 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
> > (gdb) info threads
> >  Id   Target Id         Frame
> > * 1    process 2609 "ld.so.1" 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
> > (gdb) bt
> > #0  0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
> > #1  0x00000000 in ?? ()
> > (gdb) continue
> > Continuing.
> > ^C
> > Program received signal SIGINT, Interrupt.
> > 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> > (gdb) bt
> > #0  0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> > #1  0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
> > #2  0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
> > #3  0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
> > #4  0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
> > #5  0x00000000 in ?? ()
> > (gdb) info threads
> >  Id   Target Id         Frame
> > * 1    process 2609 "ld.so.1" 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> > (gdb) thread apply all bt full
> 
> I wonder if one of the string routine calls in the loader wrote past
> its bounds and ended up overwriting that lock, which would explain why
> the wait hangs.
> 
> Can you run (gdb) info frame and try to figure out what the value of
> the futex is when it's blocking?
> 

Hi Ryan,
did you perform any further tests to understand the problem on 440fp cores?

thanks,
--nico
-- 
acrux <acrux@cruxppc.org>

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2012-02-18  2:06 UTC | newest]

Thread overview: 19+ messages
2010-09-02 17:34 [PATCH] powerpc: 405/440/464/476 support and optimizations Luis Machado
2010-09-03 14:45 ` Ryan Arnold
2010-09-03 15:00   ` Luis Machado
2010-10-04 18:54     ` Luis Machado
2010-12-13 20:26       ` Ryan Arnold
2011-01-18 13:16         ` Ryan Arnold
2011-01-25 21:32           ` Joseph S. Myers
2012-01-18 20:31           ` acrux@cruxppc.org
2012-01-19 19:35             ` Carlos O'Donell
2012-01-20 14:24               ` acrux
2012-01-20 15:52                 ` Ryan S. Arnold
2012-01-20 18:03                   ` Carlos O'Donell
2012-01-23  0:41                   ` acrux
2012-01-23 15:48                     ` Ryan S. Arnold
2012-01-24 16:47                       ` acrux
2012-01-24 17:20                         ` Ryan S. Arnold
2012-01-24 17:41                           ` acrux
2012-01-24 17:59                             ` Ryan S. Arnold
2012-02-18  2:06                               ` acrux
