* [PATCH] powerpc: 405/440/464/476 support and optimizations
@ 2010-09-02 17:34 Luis Machado
2010-09-03 14:45 ` Ryan Arnold
0 siblings, 1 reply; 19+ messages in thread
From: Luis Machado @ 2010-09-02 17:34 UTC (permalink / raw)
To: libc-ports; +Cc: rsa, Todd Iglehart, Josh Boyer
Hi,
This patch adds powerpc 405/440/464/476 platforms to ports and adds 3
memory (memcpy,memcmp,memset) optimizations and 4 string function
(strcmp,strncmp,strcpy,strlen) optimizations (provided by Todd, copied),
placed under 405, so all those platforms can use those optimized
functions.
The patch also adds the required Makefile, sysdeps structure and Implies
files.
Is this OK?
Regards,
Luis
2010-09-02 Todd Iglehart <iglehart@us.ibm.com>
Ryan Arnold <rsa@us.ibm.com>
Luis Machado <luisgpm@br.ibm.com>
* sysdeps/powerpc/dl-procinfo.c: New file.
* sysdeps/powerpc/dl-procinfo.h: New file.
* sysdeps/powerpc/powerpc32/405/memcmp.S: New file.
* sysdeps/powerpc/powerpc32/405/memcpy.S: New file.
* sysdeps/powerpc/powerpc32/405/memset.S: New file.
* sysdeps/powerpc/powerpc32/405/strcmp.S: New file.
* sysdeps/powerpc/powerpc32/405/strcpy.S: New file.
* sysdeps/powerpc/powerpc32/405/strlen.S: New file.
* sysdeps/powerpc/powerpc32/405/strncmp.S: New file.
* sysdeps/powerpc/powerpc32/440/Implies: New file.
* sysdeps/powerpc/powerpc32/464/Implies: New file.
* sysdeps/powerpc/powerpc32/476/Implies: New file.
* sysdeps/powerpc/powerpc32/Makefile: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies: New file.
diff --git a/sysdeps/powerpc/dl-procinfo.c b/sysdeps/powerpc/dl-procinfo.c
new file mode 100644
index 0000000..60fb465
--- /dev/null
+++ b/sysdeps/powerpc/dl-procinfo.c
@@ -0,0 +1,96 @@
+/* Data for processor capability information. PowerPC version.
+ Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+/* This information must be kept in sync with the _DL_HWCAP_COUNT and
+ _DL_PLATFORM_COUNT definitions in procinfo.h.
+
+ If anything should be added here check whether the size of each string
+ is still ok with the given array size.
+
+ All the #ifdefs in the definitions are quite irritating but
+ necessary if we want to avoid duplicating the information. There
+ are three different modes:
+
+ - PROCINFO_DECL is defined. This means we are only interested in
+ declarations.
+
+ - PROCINFO_DECL is not defined:
+
+ + if SHARED is defined the file is included in an array
+ initializer. The .element = { ... } syntax is needed.
+
+ + if SHARED is not defined a normal array initialization is
+ needed.
+ */
+
+#ifndef PROCINFO_CLASS
+# define PROCINFO_CLASS
+#endif
+
+#if !defined PROCINFO_DECL && defined SHARED
+ ._dl_powerpc_cap_flags
+#else
+PROCINFO_CLASS const char _dl_powerpc_cap_flags[25][10]
+#endif
+#ifndef PROCINFO_DECL
+= {
+ "vsx",
+ "arch_2_06", "power6x", "dfp", "pa6t",
+ "arch_2_05", "ic_snoop", "smt", "booke",
+ "cellbe", "power5+", "power5", "power4",
+ "notb", "efpdouble", "efpsingle", "spe",
+ "ucache", "4xxmac", "mmu", "fpu",
+ "altivec", "ppc601", "ppc64", "ppc32",
+ }
+#endif
+#if !defined SHARED || defined PROCINFO_DECL
+;
+#else
+,
+#endif
+
+#if !defined PROCINFO_DECL && defined SHARED
+ ._dl_powerpc_platforms
+#else
+PROCINFO_CLASS const char _dl_powerpc_platforms[12][12]
+#endif
+#ifndef PROCINFO_DECL
+= {
+ [PPC_PLATFORM_POWER4] = "power4",
+ [PPC_PLATFORM_PPC970] = "ppc970",
+ [PPC_PLATFORM_POWER5] = "power5",
+ [PPC_PLATFORM_POWER5_PLUS] = "power5+",
+ [PPC_PLATFORM_POWER6] = "power6",
+ [PPC_PLATFORM_CELL_BE] = "ppc-cell-be",
+ [PPC_PLATFORM_POWER6X] = "power6x",
+ [PPC_PLATFORM_POWER7] = "power7",
+ [PPC_PLATFORM_PPC405] = "ppc405",
+ [PPC_PLATFORM_PPC440] = "ppc440",
+ [PPC_PLATFORM_PPC464] = "ppc464",
+ [PPC_PLATFORM_PPC476] = "ppc476"
+ }
+#endif
+#if !defined SHARED || defined PROCINFO_DECL
+;
+#else
+,
+#endif
+
+#undef PROCINFO_DECL
+#undef PROCINFO_CLASS
diff --git a/sysdeps/powerpc/dl-procinfo.h b/sysdeps/powerpc/dl-procinfo.h
new file mode 100644
index 0000000..87279de
--- /dev/null
+++ b/sysdeps/powerpc/dl-procinfo.h
@@ -0,0 +1,168 @@
+/* Processor capability information handling macros. PowerPC version.
+ Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#ifndef _DL_PROCINFO_H
+#define _DL_PROCINFO_H 1
+
+#include <ldsodefs.h>
+#include <sysdep.h> /* This defines the PPC_FEATURE_* macros. */
+
+/* There are 25 bits used, but they are bits 7..31. */
+#define _DL_HWCAP_FIRST 7
+#define _DL_HWCAP_COUNT 32
+
+/* These bits influence library search. */
+#define HWCAP_IMPORTANT (PPC_FEATURE_HAS_ALTIVEC \
+ + PPC_FEATURE_HAS_DFP)
+
+#define _DL_PLATFORMS_COUNT 12
+
+#define _DL_FIRST_PLATFORM 32
+/* Mask to filter out platforms. */
+#define _DL_HWCAP_PLATFORM (((1ULL << _DL_PLATFORMS_COUNT) - 1) \
+ << _DL_FIRST_PLATFORM)
+
+/* Platform bits (relative to _DL_FIRST_PLATFORM). */
+#define PPC_PLATFORM_POWER4 0
+#define PPC_PLATFORM_PPC970 1
+#define PPC_PLATFORM_POWER5 2
+#define PPC_PLATFORM_POWER5_PLUS 3
+#define PPC_PLATFORM_POWER6 4
+#define PPC_PLATFORM_CELL_BE 5
+#define PPC_PLATFORM_POWER6X 6
+#define PPC_PLATFORM_POWER7 7
+#define PPC_PLATFORM_PPC405 8
+#define PPC_PLATFORM_PPC440 9
+#define PPC_PLATFORM_PPC464 10
+#define PPC_PLATFORM_PPC476 11
+
+static inline const char *
+__attribute__ ((unused))
+_dl_hwcap_string (int idx)
+{
+ return GLRO(dl_powerpc_cap_flags)[idx - _DL_HWCAP_FIRST];
+}
+
+static inline const char *
+__attribute__ ((unused))
+_dl_platform_string (int idx)
+{
+ return GLRO(dl_powerpc_platforms)[idx - _DL_FIRST_PLATFORM];
+}
+
+static inline int
+__attribute__ ((unused))
+_dl_string_hwcap (const char *str)
+{
+ for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
+ if (strcmp (str, _dl_hwcap_string (i)) == 0)
+ return i;
+ return -1;
+}
+
+static inline int
+__attribute__ ((unused, always_inline))
+_dl_string_platform (const char *str)
+{
+ if (str == NULL)
+ return -1;
+
+ if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_POWER4], 5) == 0)
+ {
+ int ret;
+ str += 5;
+ switch (*str)
+ {
+ case '4':
+ ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER4;
+ break;
+ case '5':
+ ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5;
+ if (str[1] == '+')
+ {
+ ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5_PLUS;
+ ++str;
+ }
+ break;
+ case '6':
+ ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6;
+ if (str[1] == 'x')
+ {
+ ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6X;
+ ++str;
+ }
+ break;
+ case '7':
+ ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER7;
+ break;
+ default:
+ return -1;
+ }
+ if (str[1] == '\0')
+ return ret;
+ }
+ else if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970],
+ 3) == 0)
+ {
+ if (strcmp (str + 3, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970]
+ + 3) == 0)
+ return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC970;
+ else if (strcmp (str + 3,
+ GLRO(dl_powerpc_platforms)[PPC_PLATFORM_CELL_BE] + 3)
+ == 0)
+ return _DL_FIRST_PLATFORM + PPC_PLATFORM_CELL_BE;
+ else if (strcmp (str + 3,
+ GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC405] + 3)
+ == 0)
+ return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC405;
+ else if (strcmp (str + 3,
+ GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC440] + 3)
+ == 0)
+ return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC440;
+ else if (strcmp (str + 3,
+ GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC464] + 3)
+ == 0)
+ return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC464;
+ else if (strcmp (str + 3,
+ GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC476] + 3)
+ == 0)
+ return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC476;
+ }
+
+ return -1;
+}
+
+#ifdef IS_IN_rtld
+static inline int
+__attribute__ ((unused))
+_dl_procinfo (int word)
+{
+ _dl_printf ("AT_HWCAP: ");
+
+ for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
+ if (word & (1 << i))
+ _dl_printf (" %s", _dl_hwcap_string (i));
+
+ _dl_printf ("\n");
+
+ return 0;
+}
+#endif
+
+#endif /* dl-procinfo.h */
diff --git a/sysdeps/powerpc/powerpc32/405/memcmp.S b/sysdeps/powerpc/powerpc32/405/memcmp.S
new file mode 100644
index 0000000..c0314e6
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/memcmp.S
@@ -0,0 +1,132 @@
+/* Optimized memcmp implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* memcmp
+
+ r3:source1 address, return equality
+ r4:source2 address
+ r5:byte count
+
+ Check 2 words from src1 and src2. If unequal jump to end and
+ return src1 > src2 or src1 < src2.
+ If count = zero check bytes before zero counter and then jump to end and
+ return src1 > src2, src1 < src2 or src1 = src2.
+ If src1 = src2 and no null, repeat. */
+
+EALIGN (BP_SYM (memcmp), 5, 0)
+ srwi. r6,r5,5
+ beq L(preword2_count_loop)
+ mtctr r6
+ clrlwi r5,r5,27
+
+L(word8_compare_loop):
+ lwz r10,0(r3)
+ lwz r6,4(r3)
+ lwz r8,0(r4)
+ lwz r9,4(r4)
+ cmplw cr5,r8,r10
+ cmplw cr1,r9,r6
+ bne cr5,L(st2)
+ bne cr1,L(st1)
+ lwz r10,8(r3)
+ lwz r6,12(r3)
+ lwz r8,8(r4)
+ lwz r9,12(r4)
+ cmplw cr5,r8,r10
+ cmplw cr1,r9,r6
+ bne cr5,L(st2)
+ bne cr1,L(st1)
+ lwz r10,16(r3)
+ lwz r6,20(r3)
+ lwz r8,16(r4)
+ lwz r9,20(r4)
+ cmplw cr5,r8,r10
+ cmplw cr1,r9,r6
+ bne cr5,L(st2)
+ bne cr1,L(st1)
+ lwz r10,24(r3)
+ lwz r6,28(r3)
+ addi r3,r3,0x20
+ lwz r8,24(r4)
+ lwz r9,28(r4)
+ addi r4,r4,0x20
+ cmplw cr5,r8,r10
+ cmplw cr1,r9,r6
+ bne cr5,L(st2)
+ bne cr1,L(st1)
+ bdnz L(word8_compare_loop)
+
+L(preword2_count_loop):
+ srwi. r6,r5,3
+ beq L(prebyte_count_loop)
+ mtctr r6
+ clrlwi r5,r5,29
+
+L(word2_count_loop):
+ lwz r10,0(r3)
+ lwz r6,4(r3)
+ addi r3,r3,0x08
+ lwz r8,0(r4)
+ lwz r9,4(r4)
+ addi r4,r4,0x08
+ cmplw cr5,r8,r10
+ cmplw cr1,r9,r6
+ bne cr5,L(st2)
+ bne cr1,L(st1)
+ bdnz L(word2_count_loop)
+
+L(prebyte_count_loop):
+ addi r5,r5,1
+ mtctr r5
+ bdz L(end_memcmp)
+
+L(byte_count_loop):
+ lbz r6,0(r3)
+ addi r3,r3,0x01
+ lbz r8,0(r4)
+ addi r4,r4,0x01
+ cmplw cr5,r8,r6
+ bne cr5,L(st2)
+ bdnz L(byte_count_loop)
+
+L(end_memcmp):
+ addi r3,r0,0
+ blr
+
+L(l_r):
+ addi r3,r0,1
+ blr
+
+L(st1):
+ blt cr1,L(l_r)
+ addi r3,r0,-1
+ blr
+
+L(st2):
+ blt cr5,L(l_r)
+ addi r3,r0,-1
+ blr
+END (BP_SYM (memcmp))
+libc_hidden_builtin_def (memcmp)
+weak_alias (memcmp,bcmp)
diff --git a/sysdeps/powerpc/powerpc32/405/memcpy.S b/sysdeps/powerpc/powerpc32/405/memcpy.S
new file mode 100644
index 0000000..777d3db
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/memcpy.S
@@ -0,0 +1,134 @@
+/* Optimized memcpy implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* memcpy
+
+ r0:return address
+ r3:destination address
+ r4:source address
+ r5:byte count
+
+ Save return address in r0.
+ If destinationn and source are unaligned and copy count is greater than 256
+ then copy 0-3 bytes to make destination aligned.
+ If 32 or more bytes to copy we use 32 byte copy loop.
+ Finaly we copy 0-31 extra bytes. */
+
+EALIGN (BP_SYM (memcpy), 5, 0)
+/* Check if bytes to copy are greater than 256 and if
+ source and destination are unaligned */
+ cmpwi r5,0x0100
+ addi r0,r3,0
+ ble L(string_count_loop)
+ neg r6,r3
+ clrlwi. r6,r6,30
+ beq L(string_count_loop)
+ neg r6,r4
+ clrlwi. r6,r6,30
+ beq L(string_count_loop)
+ mtctr r6
+ subf r5,r6,r5
+
+L(unaligned_bytecopy_loop): /* Align destination by coping 0-3 bytes */
+ lbz r8,0x0(r4)
+ addi r4,r4,1
+ stb r8,0x0(r3)
+ addi r3,r3,1
+ bdnz L(unaligned_bytecopy_loop)
+ srwi. r7,r5,5
+ beq L(preword2_count_loop)
+ mtctr r7
+
+L(word8_count_loop_no_dcbt): /* Copy 32 bytes at a time */
+ lwz r6,0(r4)
+ lwz r7,4(r4)
+ lwz r8,8(r4)
+ lwz r9,12(r4)
+ subi r5,r5,0x20
+ stw r6,0(r3)
+ stw r7,4(r3)
+ stw r8,8(r3)
+ stw r9,12(r3)
+ lwz r6,16(r4)
+ lwz r7,20(r4)
+ lwz r8,24(r4)
+ lwz r9,28(r4)
+ addi r4,r4,0x20
+ stw r6,16(r3)
+ stw r7,20(r3)
+ stw r8,24(r3)
+ stw r9,28(r3)
+ addi r3,r3,0x20
+ bdnz L(word8_count_loop_no_dcbt)
+
+L(preword2_count_loop): /* Copy remaining 0-31 bytes */
+ clrlwi. r12,r5,27
+ beq L(end_memcpy)
+ mtxer r12
+ lswx r5,0,r4
+ stswx r5,0,r3
+ mr r3,r0
+ blr
+
+L(string_count_loop): /* Copy odd 0-31 bytes */
+ clrlwi. r12,r5,28
+ add r3,r3,r5
+ add r4,r4,r5
+ beq L(pre_string_copy)
+ mtxer r12
+ subf r4,r12,r4
+ subf r3,r12,r3
+ lswx r6,0,r4
+ stswx r6,0,r3
+
+L(pre_string_copy): /* Check how many 32 byte chunck to copy */
+ srwi. r7,r5,4
+ beq L(end_memcpy)
+ mtctr r7
+
+L(word4_count_loop_no_dcbt): /* Copy 32 bytes at a time */
+ lwz r6,-4(r4)
+ lwz r7,-8(r4)
+ lwz r8,-12(r4)
+ lwzu r9,-16(r4)
+ stw r6,-4(r3)
+ stw r7,-8(r3)
+ stw r8,-12(r3)
+ stwu r9,-16(r3)
+ bdz L(end_memcpy)
+ lwz r6,-4(r4)
+ lwz r7,-8(r4)
+ lwz r8,-12(r4)
+ lwzu r9,-16(r4)
+ stw r6,-4(r3)
+ stw r7,-8(r3)
+ stw r8,-12(r3)
+ stwu r9,-16(r3)
+ bdnz L(word4_count_loop_no_dcbt)
+
+L(end_memcpy):
+ mr r3,r0
+ blr
+END (BP_SYM (memcpy))
+libc_hidden_builtin_def (memcpy)
diff --git a/sysdeps/powerpc/powerpc32/405/memset.S b/sysdeps/powerpc/powerpc32/405/memset.S
new file mode 100644
index 0000000..10b0f6e
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/memset.S
@@ -0,0 +1,156 @@
+/* Optimized memset implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* memset
+
+ r3:destination address and return address
+ r4:source integer to copy
+ r5:byte count
+ r11:sources integer to copy in all 32 bits of reg
+ r12:temp return address
+
+ Save return address in r12
+ If destinationn is unaligned and count is greater tha 255 bytes
+ set 0-3 bytes to make destination aligned
+ If count is greater tha 255 bytes and setting zero to memory
+ use dbcz to set memeory when we can
+ otherwsie do the follwoing
+ If 16 or more words to set we use 16 word copy loop.
+ Finaly we set 0-15 extra bytes with string store. */
+
+EALIGN (BP_SYM (memset), 5, 0)
+ rlwinm r11,r4,0,24,31
+ rlwimi r11,r4,8,16,23
+ rlwimi r11,r11,16,0,15
+ addi r12,r3,0
+ cmpwi r5,0x00FF
+ ble L(preword8_count_loop)
+ cmpwi r4,0x00
+ beq L(use_dcbz)
+ neg r6,r3
+ clrlwi. r6,r6,30
+ beq L(preword8_count_loop)
+ addi r8,0,1
+ mtctr r6
+ subi r3,r3,1
+
+L(unaligned_bytecopy_loop):
+ stbu r11,0x1(r3)
+ subf. r5,r8,r5
+ beq L(end_memset)
+ bdnz L(unaligned_bytecopy_loop)
+ addi r3,r3,1
+
+L(preword8_count_loop):
+ srwi. r6,r5,4
+ beq L(preword2_count_loop)
+ mtctr r6
+ addi r3,r3,-4
+ mr r8,r11
+ mr r9,r11
+ mr r10,r11
+
+L(word8_count_loop_no_dcbt):
+ stwu r8,4(r3)
+ stwu r9,4(r3)
+ subi r5,r5,0x10
+ stwu r10,4(r3)
+ stwu r11,4(r3)
+ bdnz L(word8_count_loop_no_dcbt)
+ addi r3,r3,4
+
+L(preword2_count_loop):
+ clrlwi. r7,r5,28
+ beq L(end_memset)
+ mr r8,r11
+ mr r9,r11
+ mr r10,r11
+ mtxer r7
+ stswx r8,0,r3
+
+L(end_memset):
+ addi r3,r12,0
+ blr
+
+L(use_dcbz):
+ neg r6,r3
+ clrlwi. r7,r6,28
+ beq L(skip_string_loop)
+ mr r8,r11
+ mr r9,r11
+ mr r10,r11
+ subf r5,r7,r5
+ mtxer r7
+ stswx r8,0,r3
+ add r3,r3,r7
+
+L(skip_string_loop):
+ clrlwi r8,r6,25
+ srwi. r8,r8,4
+ beq L(dcbz_pre_loop)
+ mtctr r8
+
+L(word_loop):
+ stw r11,0(r3)
+ subi r5,r5,0x10
+ stw r11,4(r3)
+ stw r11,8(r3)
+ stw r11,12(r3)
+ addi r3,r3,0x10
+ bdnz L(word_loop)
+
+L(dcbz_pre_loop):
+ srwi r6,r5,7
+ mtctr r6
+ addi r7,0,0
+
+L(dcbz_loop):
+ dcbz r3,r7
+ addi r3,r3,0x80
+ subi r5,r5,0x80
+ bdnz L(dcbz_loop)
+ srwi. r6,r5,4
+ beq L(postword2_count_loop)
+ mtctr r6
+
+L(postword8_count_loop):
+ stw r11,0(r3)
+ subi r5,r5,0x10
+ stw r11,4(r3)
+ stw r11,8(r3)
+ stw r11,12(r3)
+ addi r3,r3,0x10
+ bdnz L(postword8_count_loop)
+
+L(postword2_count_loop):
+ clrlwi. r7,r5,28
+ beq L(end_memset)
+ mr r8,r11
+ mr r9,r11
+ mr r10,r11
+ mtxer r7
+ stswx r8,0,r3
+ b L(end_memset)
+END (BP_SYM (memset))
+libc_hidden_builtin_def (memset)
diff --git a/sysdeps/powerpc/powerpc32/405/strcmp.S b/sysdeps/powerpc/powerpc32/405/strcmp.S
new file mode 100644
index 0000000..79a80f1
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strcmp.S
@@ -0,0 +1,138 @@
+/* Optimized strcmp implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strcmp
+
+ Register Use
+ r0:temp return equality
+ r3:source1 address, return equality
+ r4:source2 address
+
+ Implementation description
+ Check 2 words from src1 and src2. If unequal jump to end and
+ return src1 > src2 or src1 < src2.
+ If null check bytes before null and then jump to end and
+ return src1 > src2, src1 < src2 or src1 = src2.
+ If src1 = src2 and no null, repeat. */
+
+EALIGN (BP_SYM(strcmp),5,0)
+ neg r7,r3
+ clrlwi r7,r7,20
+ neg r8,r4
+ clrlwi r8,r8,20
+ srwi. r7,r7,5
+ beq L(byte_loop)
+ srwi. r8,r8,5
+ beq L(byte_loop)
+ cmplw r7,r8
+ mtctr r7
+ ble L(big_loop)
+ mtctr r8
+
+L(big_loop):
+ lwz r5,0(r3)
+ lwz r6,4(r3)
+ lwz r8,0(r4)
+ lwz r9,4(r4)
+ dlmzb. r12,r5,r6
+ bne L(end_check)
+ cmplw r5,r8
+ bne L(st1)
+ cmplw r6,r9
+ bne L(st1)
+ lwz r5,8(r3)
+ lwz r6,12(r3)
+ lwz r8,8(r4)
+ lwz r9,12(r4)
+ dlmzb. r12,r5,r6
+ bne L(end_check)
+ cmplw r5,r8
+ bne L(st1)
+ cmplw r6,r9
+ bne L(st1)
+ lwz r5,16(r3)
+ lwz r6,20(r3)
+ lwz r8,16(r4)
+ lwz r9,20(r4)
+ dlmzb. r12,r5,r6
+ bne L(end_check)
+ cmplw r5,r8
+ bne L(st1)
+ cmplw r6,r9
+ bne L(st1)
+ lwz r5,24(r3)
+ lwz r6,28(r3)
+ addi r3,r3,0x20
+ lwz r8,24(r4)
+ lwz r9,28(r4)
+ addi r4,r4,0x20
+ dlmzb. r12,r5,r6
+ bne L(end_check)
+ cmplw r5,r8
+ bne L(st1)
+ cmplw r6,r9
+ bne L(st1)
+ bdnz L(big_loop)
+ b L(byte_loop)
+
+L(end_check):
+ subfic r12,r12,4
+ blt L(end_check2)
+ rlwinm r12,r12,3,0,31
+ srw r5,r5,r12
+ srw r8,r8,r12
+ cmplw r5,r8
+ bne L(st1)
+ b L(end_strcmp)
+
+L(end_check2):
+ addi r12,r12,4
+ cmplw r5,r8
+ rlwinm r12,r12,3,0,31
+ bne L(st1)
+ srw r6,r6,r12
+ srw r9,r9,r12
+ cmplw r6,r9
+ bne L(st1)
+
+L(end_strcmp):
+ addi r3,r0,0
+ blr
+
+L(st1):
+ mfcr r3
+ blr
+
+L(byte_loop):
+ lbz r5,0(r3)
+ addi r3,r3,1
+ lbz r6,0(r4)
+ addi r4,r4,1
+ cmplw r5,r6
+ bne L(st1)
+ cmpwi r5,0
+ beq L(end_strcmp)
+ b L(byte_loop)
+END (BP_SYM (strcmp))
+libc_hidden_builtin_def (strcmp)
diff --git a/sysdeps/powerpc/powerpc32/405/strcpy.S b/sysdeps/powerpc/powerpc32/405/strcpy.S
new file mode 100644
index 0000000..f289118
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strcpy.S
@@ -0,0 +1,111 @@
+/* Optimized strcpy implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strcpy
+
+ Register Use
+ r3:destination and return address
+ r4:source address
+ r10:temp destination address
+
+ Implementation description
+ Loop by checking 2 words at a time, with dlmzb. Check if there is a null
+ in the 2 words. If there is a null jump to end checking to determine
+ where in the last 8 bytes it is. Copy the appropriate bytes of the last
+ 8 according to the null position. */
+
+EALIGN (BP_SYM (strcpy), 5, 0)
+ neg r7,r4
+ subi r4,r4,1
+ clrlwi. r8,r7,29
+ subi r10,r3,1
+ beq L(pre_word8_loop)
+ mtctr r8
+
+L(loop):
+ lbzu r5,0x01(r4)
+ cmpi cr5,r5,0x0
+ stbu r5,0x01(r10)
+ beq cr5,L(end_strcpy)
+ bdnz L(loop)
+
+L(pre_word8_loop):
+ subi r4,r4,3
+ subi r10,r10,3
+
+L(word8_loop):
+ lwzu r5,0x04(r4)
+ lwzu r6,0x04(r4)
+ dlmzb. r11,r5,r6
+ bne L(byte_copy)
+ stwu r5,0x04(r10)
+ stwu r6,0x04(r10)
+ lwzu r5,0x04(r4)
+ lwzu r6,0x04(r4)
+ dlmzb. r11,r5,r6
+ bne L(byte_copy)
+ stwu r5,0x04(r10)
+ stwu r6,0x04(r10)
+ lwzu r5,0x04(r4)
+ lwzu r6,0x04(r4)
+ dlmzb. r11,r5,r6
+ bne L(byte_copy)
+ stwu r5,0x04(r10)
+ stwu r6,0x04(r10)
+ lwzu r5,0x04(r4)
+ lwzu r6,0x04(r4)
+ dlmzb. r11,r5,r6
+ bne L(byte_copy)
+ stwu r5,0x04(r10)
+ stwu r6,0x04(r10)
+ b L(word8_loop)
+
+L(last_bytes_copy):
+ stwu r5,0x04(r10)
+ subi r11,r11,4
+ mtctr r11
+ addi r10,r10,3
+ subi r4,r4,1
+
+L(last_bytes_copy_loop):
+ lbzu r5,0x01(r4)
+ stbu r5,0x01(r10)
+ bdnz L(last_bytes_copy_loop)
+ blr
+
+L(byte_copy):
+ blt L(last_bytes_copy)
+ mtctr r11
+ addi r10,r10,3
+ subi r4,r4,5
+
+L(last_bytes_copy_loop2):
+ lbzu r5,0x01(r4)
+ stbu r5,0x01(r10)
+ bdnz L(last_bytes_copy_loop2)
+
+L(end_strcpy):
+ blr
+END (BP_SYM (strcpy))
+libc_hidden_builtin_def (strcpy)
diff --git a/sysdeps/powerpc/powerpc32/405/strlen.S b/sysdeps/powerpc/powerpc32/405/strlen.S
new file mode 100644
index 0000000..5da0e0b
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strlen.S
@@ -0,0 +1,79 @@
+/* Optimized strlen implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strlen
+
+ Register Use
+ r3:source address and return length of string
+ r4:byte counter
+
+ Implementation description
+ Load 2 words at a time and count bytes, if we find null we subtract one from
+ the count and return the count value. We need to subtract one because
+ we don't count the null character as a byte. */
+
+EALIGN (BP_SYM (strlen),5,0)
+ neg r7,r3
+ clrlwi. r8,r7,29
+ addi r4,0,0
+ beq L(byte_count_loop)
+ mtctr r8
+
+L(loop):
+ lbz r5,0(r3)
+ cmpi cr5,r5,0x0
+ addi r3,r3,0x1
+ addi r4,r4,0x1
+ beq cr5,L(end_strlen)
+ bdnz L(loop)
+
+L(byte_count_loop):
+ lwz r5,0(r3)
+ lwz r6,4(r3)
+ dlmzb. r12,r5,r6
+ add r4,r4,r12
+ bne L(end_strlen)
+ lwz r5,8(r3)
+ lwz r6,12(r3)
+ dlmzb. r12,r5,r6
+ add r4,r4,r12
+ bne L(end_strlen)
+ lwz r5,16(r3)
+ lwz r6,20(r3)
+ dlmzb. r12,r5,r6
+ add r4,r4,r12
+ bne L(end_strlen)
+ lwz r5,24(r3)
+ lwz r6,28(r3)
+ addi r3,r3,0x20
+ dlmzb. r12,r5,r6
+ add r4,r4,r12
+ bne L(end_strlen)
+ b L(byte_count_loop)
+
+L(end_strlen):
+ addi r3,r4,-1
+ blr
+END (BP_SYM (strlen))
+libc_hidden_builtin_def (strlen)
diff --git a/sysdeps/powerpc/powerpc32/405/strncmp.S b/sysdeps/powerpc/powerpc32/405/strncmp.S
new file mode 100644
index 0000000..658c1b1
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strncmp.S
@@ -0,0 +1,132 @@
+/* Optimized strncmp implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ Contributed by Todd Iglehart <iglehart@us.ibm.com>.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strncmp
+
+ Register Use
+ r0:temp return equality
+ r3:source1 address, return equality
+ r4:source2 address
+ r5:byte count
+
+ Implementation description
+ Touch in 3 lines of D-cache.
+ If source1 or source2 is unaligned copy 0-3 bytes to make source1 aligned
+ Check 2 words from src1 and src2. If unequal jump to end and
+ return src1 > src2 or src1 < src2.
+ If null check bytes before null and then jump to end and
+ return src1 > src2, src1 < src2 or src1 = src2.
+ If count = zero check bytes before zero counter and then jump to end and
+ return src1 > src2, src1 < src2 or src1 = src2.
+ If src1 = src2 and no null, repeat. */
+
+EALIGN (BP_SYM(strncmp),5,0)
+ neg r7,r3
+ clrlwi r7,r7,20
+ neg r8,r4
+ clrlwi r8,r8,20
+ srwi. r7,r7,3
+ beq L(prebyte_count_loop)
+ srwi. r8,r8,3
+ beq L(prebyte_count_loop)
+ cmplw r7,r8
+ mtctr r7
+ ble L(preword2_count_loop)
+ mtctr r8
+
+L(preword2_count_loop):
+ srwi. r6,r5,3
+ beq L(prebyte_count_loop)
+ mfctr r7
+ cmplw r6,r7
+ bgt L(set_count_loop)
+ mtctr r6
+ clrlwi r5,r5,29
+
+L(word2_count_loop):
+ lwz r10,0(r3)
+ lwz r6,4(r3)
+ addi r3,r3,0x08
+ lwz r8,0(r4)
+ lwz r9,4(r4)
+ addi r4,r4,0x08
+ dlmzb. r12,r10,r6
+ bne L(end_check)
+ cmplw r10,r8
+ bne L(st1)
+ cmplw r6,r9
+ bne L(st1)
+ bdnz L(word2_count_loop)
+
+L(prebyte_count_loop):
+ addi r5,r5,1
+ mtctr r5
+ bdz L(end_strncmp)
+
+L(byte_count_loop):
+ lbz r6,0(r3)
+ addi r3,r3,1
+ lbz r7,0(r4)
+ addi r4,r4,1
+ cmplw r6,r7
+ bne L(st1)
+ cmpwi r6,0
+ beq L(end_strncmp)
+ bdnz L(byte_count_loop)
+ b L(end_strncmp)
+
+L(set_count_loop):
+ slwi r7,r7,3
+ subf r5,r7,r5
+ b L(word2_count_loop)
+
+L(end_check):
+ subfic r12,r12,4
+ blt L(end_check2)
+ rlwinm r12,r12,3,0,31
+ srw r10,r10,r12
+ srw r8,r8,r12
+ cmplw r10,r8
+ bne L(st1)
+ b L(end_strncmp)
+
+L(end_check2):
+ addi r12,r12,4
+ cmplw r10,r8
+ rlwinm r12,r12,3,0,31
+ bne L(st1)
+ srw r6,r6,r12
+ srw r9,r9,r12
+ cmplw r6,r9
+ bne L(st1)
+
+L(end_strncmp):
+ addi r3,r0,0
+ blr
+
+L(st1):
+ mfcr r3
+ blr
+END (BP_SYM (strncmp))
+libc_hidden_builtin_def (strncmp)
diff --git a/sysdeps/powerpc/powerpc32/440/Implies b/sysdeps/powerpc/powerpc32/440/Implies
new file mode 100644
index 0000000..70c0d2e
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/440/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/405/fpu
+powerpc/powerpc32/405
diff --git a/sysdeps/powerpc/powerpc32/464/Implies b/sysdeps/powerpc/powerpc32/464/Implies
new file mode 100644
index 0000000..c3e52c5
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/464/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/440/fpu
+powerpc/powerpc32/440
diff --git a/sysdeps/powerpc/powerpc32/476/Implies b/sysdeps/powerpc/powerpc32/476/Implies
new file mode 100644
index 0000000..2829f9c
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/476/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/464/fpu
+powerpc/powerpc32/464
diff --git a/sysdeps/powerpc/powerpc32/Makefile b/sysdeps/powerpc/powerpc32/Makefile
new file mode 100644
index 0000000..3d235de
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/Makefile
@@ -0,0 +1,8 @@
+# Some Powerpc32 variants assume soft-fp is the default even though there is
+# an fp variant so provide -mhard-float if --with-fp is explicitly passed.
+
+ifeq ($(with-fp),yes)
++cflags += -mhard-float
+ASFLAGS += -mhard-float
+sysdep-LDFLAGS += -mhard-float
+endif
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
new file mode 100644
index 0000000..70c0d2e
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/405/fpu
+powerpc/powerpc32/405
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
new file mode 100644
index 0000000..c3e52c5
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/440/fpu
+powerpc/powerpc32/440
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
new file mode 100644
index 0000000..2829f9c
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/464/fpu
+powerpc/powerpc32/464
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
new file mode 100644
index 0000000..80f9170
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/476/fpu
+powerpc/powerpc32/476
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2010-09-02 17:34 [PATCH] powerpc: 405/440/464/476 support and optimizations Luis Machado
@ 2010-09-03 14:45 ` Ryan Arnold
2010-09-03 15:00 ` Luis Machado
0 siblings, 1 reply; 19+ messages in thread
From: Ryan Arnold @ 2010-09-03 14:45 UTC (permalink / raw)
To: luisgpm; +Cc: libc-ports, rsa, Todd Iglehart, Josh Boyer
On Thu, Sep 2, 2010 at 12:34 PM, Luis Machado
<luisgpm@linux.vnet.ibm.com> wrote:
> Hi,
>
> This patch adds powerpc 405/440/464/476 platforms to ports and adds 3
> memory (memcpy,memcmp,memset) optimizations and 4 string function
> (strcmp,strncmp,strcpy,strlen) optimizations (provided by Todd, copied),
> placed under 405, so all those platforms can use those optimized
> functions.
>
> The patch also adds the required Makefile, sysdeps structure and Implies
> files.
>
> Is this OK?
>
> Regards,
> Luis
>
> 2010-09-02 Todd Iglehart <iglehart@us.ibm.com>
> Ryan Arnold <rsa@us.ibm.com>
> Luis Machado <luisgpm@br.ibm.com>
>
Hi Luis,
Todd doesn't have FSF copyright assignment so he shouldn't be on the ChangeLog.
> diff --git a/sysdeps/powerpc/powerpc32/405/memcmp.S b/sysdeps/powerpc/powerpc32/405/memcmp.S
> new file mode 100644
> index 0000000..c0314e6
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/memcmp.S
> @@ -0,0 +1,132 @@
> +/* Optimized memcmp implementation for PowerPC476.
> + Copyright (C) 2010 Free Software Foundation, Inc.
> + Contributed by Todd Iglehart <iglehart@us.ibm.com>.
> + This file is part of the GNU C Library.
Since Todd doesn't have copyright assignment these changes are
contributed to the FSF by IBM without author/contributor attribution.
You can simply attribute the changes to him in the email leaving his
name out of the sources per FSF policy and submit them on IBM's
behalf.
Ryan
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2010-09-03 14:45 ` Ryan Arnold
@ 2010-09-03 15:00 ` Luis Machado
2010-10-04 18:54 ` Luis Machado
0 siblings, 1 reply; 19+ messages in thread
From: Luis Machado @ 2010-09-03 15:00 UTC (permalink / raw)
To: Ryan Arnold; +Cc: libc-ports, rsa, Todd Iglehart, Josh Boyer
> Since Todd doesn't have copyright assignment these changes are
> contributed to the FSF by IBM without author/contributor attribution.
>
> You can simply attribute the changes to him in the email leaving his
> name out of the sources per FSF policy and submit them on IBM's
> behalf.
>
> Ryan
Thanks.
Follows the updated patch without Todd's name on the sources.
Luis
2010-09-03 Luis Machado <luisgpm@br.ibm.com>
* sysdeps/powerpc/dl-procinfo.c: New file.
* sysdeps/powerpc/dl-procinfo.h: New file.
* sysdeps/powerpc/powerpc32/405/memcmp.S: New file.
* sysdeps/powerpc/powerpc32/405/memcpy.S: New file.
* sysdeps/powerpc/powerpc32/405/memset.S: New file.
* sysdeps/powerpc/powerpc32/405/strcmp.S: New file.
* sysdeps/powerpc/powerpc32/405/strcpy.S: New file.
* sysdeps/powerpc/powerpc32/405/strlen.S: New file.
* sysdeps/powerpc/powerpc32/405/strncmp.S: New file.
* sysdeps/powerpc/powerpc32/440/Implies: New file.
* sysdeps/powerpc/powerpc32/464/Implies: New file.
* sysdeps/powerpc/powerpc32/476/Implies: New file.
* sysdeps/powerpc/powerpc32/Makefile: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies: New file.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies: New file.
diff --git a/sysdeps/powerpc/dl-procinfo.c b/sysdeps/powerpc/dl-procinfo.c
new file mode 100644
index 0000000..60fb465
--- /dev/null
+++ b/sysdeps/powerpc/dl-procinfo.c
@@ -0,0 +1,96 @@
+/* Data for processor capability information. PowerPC version.
+ Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+/* This information must be kept in sync with the _DL_HWCAP_COUNT and
+ _DL_PLATFORM_COUNT definitions in procinfo.h.
+
+ If anything should be added here check whether the size of each string
+ is still ok with the given array size.
+
+ All the #ifdefs in the definitions are quite irritating but
+ necessary if we want to avoid duplicating the information. There
+ are three different modes:
+
+ - PROCINFO_DECL is defined. This means we are only interested in
+ declarations.
+
+ - PROCINFO_DECL is not defined:
+
+ + if SHARED is defined the file is included in an array
+ initializer. The .element = { ... } syntax is needed.
+
+ + if SHARED is not defined a normal array initialization is
+ needed.
+ */
+
+#ifndef PROCINFO_CLASS
+# define PROCINFO_CLASS
+#endif
+
+#if !defined PROCINFO_DECL && defined SHARED
+ ._dl_powerpc_cap_flags
+#else
+PROCINFO_CLASS const char _dl_powerpc_cap_flags[25][10]
+#endif
+#ifndef PROCINFO_DECL
+= {
+ "vsx",
+ "arch_2_06", "power6x", "dfp", "pa6t",
+ "arch_2_05", "ic_snoop", "smt", "booke",
+ "cellbe", "power5+", "power5", "power4",
+ "notb", "efpdouble", "efpsingle", "spe",
+ "ucache", "4xxmac", "mmu", "fpu",
+ "altivec", "ppc601", "ppc64", "ppc32",
+ }
+#endif
+#if !defined SHARED || defined PROCINFO_DECL
+;
+#else
+,
+#endif
+
+#if !defined PROCINFO_DECL && defined SHARED
+ ._dl_powerpc_platforms
+#else
+PROCINFO_CLASS const char _dl_powerpc_platforms[12][12]
+#endif
+#ifndef PROCINFO_DECL
+= {
+ [PPC_PLATFORM_POWER4] = "power4",
+ [PPC_PLATFORM_PPC970] = "ppc970",
+ [PPC_PLATFORM_POWER5] = "power5",
+ [PPC_PLATFORM_POWER5_PLUS] = "power5+",
+ [PPC_PLATFORM_POWER6] = "power6",
+ [PPC_PLATFORM_CELL_BE] = "ppc-cell-be",
+ [PPC_PLATFORM_POWER6X] = "power6x",
+ [PPC_PLATFORM_POWER7] = "power7",
+ [PPC_PLATFORM_PPC405] = "ppc405",
+ [PPC_PLATFORM_PPC440] = "ppc440",
+ [PPC_PLATFORM_PPC464] = "ppc464",
+ [PPC_PLATFORM_PPC476] = "ppc476"
+ }
+#endif
+#if !defined SHARED || defined PROCINFO_DECL
+;
+#else
+,
+#endif
+
+#undef PROCINFO_DECL
+#undef PROCINFO_CLASS
diff --git a/sysdeps/powerpc/dl-procinfo.h b/sysdeps/powerpc/dl-procinfo.h
new file mode 100644
index 0000000..87279de
--- /dev/null
+++ b/sysdeps/powerpc/dl-procinfo.h
@@ -0,0 +1,168 @@
+/* Processor capability information handling macros. PowerPC version.
+ Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#ifndef _DL_PROCINFO_H
+#define _DL_PROCINFO_H 1
+
+#include <ldsodefs.h>
+#include <sysdep.h> /* This defines the PPC_FEATURE_* macros. */
+
+/* There are 25 bits used, but they are bits 7..31. */
+#define _DL_HWCAP_FIRST 7
+#define _DL_HWCAP_COUNT 32
+
+/* These bits influence library search. */
+#define HWCAP_IMPORTANT (PPC_FEATURE_HAS_ALTIVEC \
+ + PPC_FEATURE_HAS_DFP)
+
+#define _DL_PLATFORMS_COUNT 12
+
+#define _DL_FIRST_PLATFORM 32
+/* Mask to filter out platforms. */
+#define _DL_HWCAP_PLATFORM (((1ULL << _DL_PLATFORMS_COUNT) - 1) \
+ << _DL_FIRST_PLATFORM)
+
+/* Platform bits (relative to _DL_FIRST_PLATFORM). */
+#define PPC_PLATFORM_POWER4 0
+#define PPC_PLATFORM_PPC970 1
+#define PPC_PLATFORM_POWER5 2
+#define PPC_PLATFORM_POWER5_PLUS 3
+#define PPC_PLATFORM_POWER6 4
+#define PPC_PLATFORM_CELL_BE 5
+#define PPC_PLATFORM_POWER6X 6
+#define PPC_PLATFORM_POWER7 7
+#define PPC_PLATFORM_PPC405 8
+#define PPC_PLATFORM_PPC440 9
+#define PPC_PLATFORM_PPC464 10
+#define PPC_PLATFORM_PPC476 11
+
+static inline const char *
+__attribute__ ((unused))
+_dl_hwcap_string (int idx)
+{
+ return GLRO(dl_powerpc_cap_flags)[idx - _DL_HWCAP_FIRST];
+}
+
+static inline const char *
+__attribute__ ((unused))
+_dl_platform_string (int idx)
+{
+ return GLRO(dl_powerpc_platforms)[idx - _DL_FIRST_PLATFORM];
+}
+
+static inline int
+__attribute__ ((unused))
+_dl_string_hwcap (const char *str)
+{
+ for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
+ if (strcmp (str, _dl_hwcap_string (i)) == 0)
+ return i;
+ return -1;
+}
+
+static inline int
+__attribute__ ((unused, always_inline))
+_dl_string_platform (const char *str)
+{
+ if (str == NULL)
+ return -1;
+
+ if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_POWER4], 5) == 0)
+ {
+ int ret;
+ str += 5;
+ switch (*str)
+ {
+ case '4':
+ ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER4;
+ break;
+ case '5':
+ ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5;
+ if (str[1] == '+')
+ {
+ ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5_PLUS;
+ ++str;
+ }
+ break;
+ case '6':
+ ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6;
+ if (str[1] == 'x')
+ {
+ ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6X;
+ ++str;
+ }
+ break;
+ case '7':
+ ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER7;
+ break;
+ default:
+ return -1;
+ }
+ if (str[1] == '\0')
+ return ret;
+ }
+ else if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970],
+ 3) == 0)
+ {
+ if (strcmp (str + 3, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970]
+ + 3) == 0)
+ return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC970;
+ else if (strcmp (str + 3,
+ GLRO(dl_powerpc_platforms)[PPC_PLATFORM_CELL_BE] + 3)
+ == 0)
+ return _DL_FIRST_PLATFORM + PPC_PLATFORM_CELL_BE;
+ else if (strcmp (str + 3,
+ GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC405] + 3)
+ == 0)
+ return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC405;
+ else if (strcmp (str + 3,
+ GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC440] + 3)
+ == 0)
+ return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC440;
+ else if (strcmp (str + 3,
+ GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC464] + 3)
+ == 0)
+ return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC464;
+ else if (strcmp (str + 3,
+ GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC476] + 3)
+ == 0)
+ return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC476;
+ }
+
+ return -1;
+}
+
+#ifdef IS_IN_rtld
+static inline int
+__attribute__ ((unused))
+_dl_procinfo (int word)
+{
+ _dl_printf ("AT_HWCAP: ");
+
+ for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
+ if (word & (1 << i))
+ _dl_printf (" %s", _dl_hwcap_string (i));
+
+ _dl_printf ("\n");
+
+ return 0;
+}
+#endif
+
+#endif /* dl-procinfo.h */
diff --git a/sysdeps/powerpc/powerpc32/405/memcmp.S b/sysdeps/powerpc/powerpc32/405/memcmp.S
new file mode 100644
index 0000000..653d3b5
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/memcmp.S
@@ -0,0 +1,131 @@
+/* Optimized memcmp implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* memcmp
+
+ r3:source1 address, return equality
+ r4:source2 address
+ r5:byte count
+
+ Check 2 words from src1 and src2. If unequal jump to end and
+ return src1 > src2 or src1 < src2.
+ If count = zero check bytes before zero counter and then jump to end and
+ return src1 > src2, src1 < src2 or src1 = src2.
+ If src1 = src2 and no null, repeat. */
+
+EALIGN (BP_SYM (memcmp), 5, 0)
+ srwi. r6,r5,5
+ beq L(preword2_count_loop)
+ mtctr r6
+ clrlwi r5,r5,27
+
+L(word8_compare_loop):
+ lwz r10,0(r3)
+ lwz r6,4(r3)
+ lwz r8,0(r4)
+ lwz r9,4(r4)
+ cmplw cr5,r8,r10
+ cmplw cr1,r9,r6
+ bne cr5,L(st2)
+ bne cr1,L(st1)
+ lwz r10,8(r3)
+ lwz r6,12(r3)
+ lwz r8,8(r4)
+ lwz r9,12(r4)
+ cmplw cr5,r8,r10
+ cmplw cr1,r9,r6
+ bne cr5,L(st2)
+ bne cr1,L(st1)
+ lwz r10,16(r3)
+ lwz r6,20(r3)
+ lwz r8,16(r4)
+ lwz r9,20(r4)
+ cmplw cr5,r8,r10
+ cmplw cr1,r9,r6
+ bne cr5,L(st2)
+ bne cr1,L(st1)
+ lwz r10,24(r3)
+ lwz r6,28(r3)
+ addi r3,r3,0x20
+ lwz r8,24(r4)
+ lwz r9,28(r4)
+ addi r4,r4,0x20
+ cmplw cr5,r8,r10
+ cmplw cr1,r9,r6
+ bne cr5,L(st2)
+ bne cr1,L(st1)
+ bdnz L(word8_compare_loop)
+
+L(preword2_count_loop):
+ srwi. r6,r5,3
+ beq L(prebyte_count_loop)
+ mtctr r6
+ clrlwi r5,r5,29
+
+L(word2_count_loop):
+ lwz r10,0(r3)
+ lwz r6,4(r3)
+ addi r3,r3,0x08
+ lwz r8,0(r4)
+ lwz r9,4(r4)
+ addi r4,r4,0x08
+ cmplw cr5,r8,r10
+ cmplw cr1,r9,r6
+ bne cr5,L(st2)
+ bne cr1,L(st1)
+ bdnz L(word2_count_loop)
+
+L(prebyte_count_loop):
+ addi r5,r5,1
+ mtctr r5
+ bdz L(end_memcmp)
+
+L(byte_count_loop):
+ lbz r6,0(r3)
+ addi r3,r3,0x01
+ lbz r8,0(r4)
+ addi r4,r4,0x01
+ cmplw cr5,r8,r6
+ bne cr5,L(st2)
+ bdnz L(byte_count_loop)
+
+L(end_memcmp):
+ addi r3,r0,0
+ blr
+
+L(l_r):
+ addi r3,r0,1
+ blr
+
+L(st1):
+ blt cr1,L(l_r)
+ addi r3,r0,-1
+ blr
+
+L(st2):
+ blt cr5,L(l_r)
+ addi r3,r0,-1
+ blr
+END (BP_SYM (memcmp))
+libc_hidden_builtin_def (memcmp)
+weak_alias (memcmp,bcmp)
diff --git a/sysdeps/powerpc/powerpc32/405/memcpy.S b/sysdeps/powerpc/powerpc32/405/memcpy.S
new file mode 100644
index 0000000..a654c73
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/memcpy.S
@@ -0,0 +1,133 @@
+/* Optimized memcpy implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* memcpy
+
+ r0:return address
+ r3:destination address
+ r4:source address
+ r5:byte count
+
+ Save return address in r0.
+ If destinationn and source are unaligned and copy count is greater than 256
+ then copy 0-3 bytes to make destination aligned.
+ If 32 or more bytes to copy we use 32 byte copy loop.
+ Finaly we copy 0-31 extra bytes. */
+
+EALIGN (BP_SYM (memcpy), 5, 0)
+/* Check if bytes to copy are greater than 256 and if
+ source and destination are unaligned */
+ cmpwi r5,0x0100
+ addi r0,r3,0
+ ble L(string_count_loop)
+ neg r6,r3
+ clrlwi. r6,r6,30
+ beq L(string_count_loop)
+ neg r6,r4
+ clrlwi. r6,r6,30
+ beq L(string_count_loop)
+ mtctr r6
+ subf r5,r6,r5
+
+L(unaligned_bytecopy_loop): /* Align destination by coping 0-3 bytes */
+ lbz r8,0x0(r4)
+ addi r4,r4,1
+ stb r8,0x0(r3)
+ addi r3,r3,1
+ bdnz L(unaligned_bytecopy_loop)
+ srwi. r7,r5,5
+ beq L(preword2_count_loop)
+ mtctr r7
+
+L(word8_count_loop_no_dcbt): /* Copy 32 bytes at a time */
+ lwz r6,0(r4)
+ lwz r7,4(r4)
+ lwz r8,8(r4)
+ lwz r9,12(r4)
+ subi r5,r5,0x20
+ stw r6,0(r3)
+ stw r7,4(r3)
+ stw r8,8(r3)
+ stw r9,12(r3)
+ lwz r6,16(r4)
+ lwz r7,20(r4)
+ lwz r8,24(r4)
+ lwz r9,28(r4)
+ addi r4,r4,0x20
+ stw r6,16(r3)
+ stw r7,20(r3)
+ stw r8,24(r3)
+ stw r9,28(r3)
+ addi r3,r3,0x20
+ bdnz L(word8_count_loop_no_dcbt)
+
+L(preword2_count_loop): /* Copy remaining 0-31 bytes */
+ clrlwi. r12,r5,27
+ beq L(end_memcpy)
+ mtxer r12
+ lswx r5,0,r4
+ stswx r5,0,r3
+ mr r3,r0
+ blr
+
+L(string_count_loop): /* Copy odd 0-31 bytes */
+ clrlwi. r12,r5,28
+ add r3,r3,r5
+ add r4,r4,r5
+ beq L(pre_string_copy)
+ mtxer r12
+ subf r4,r12,r4
+ subf r3,r12,r3
+ lswx r6,0,r4
+ stswx r6,0,r3
+
+L(pre_string_copy): /* Check how many 32 byte chunck to copy */
+ srwi. r7,r5,4
+ beq L(end_memcpy)
+ mtctr r7
+
+L(word4_count_loop_no_dcbt): /* Copy 32 bytes at a time */
+ lwz r6,-4(r4)
+ lwz r7,-8(r4)
+ lwz r8,-12(r4)
+ lwzu r9,-16(r4)
+ stw r6,-4(r3)
+ stw r7,-8(r3)
+ stw r8,-12(r3)
+ stwu r9,-16(r3)
+ bdz L(end_memcpy)
+ lwz r6,-4(r4)
+ lwz r7,-8(r4)
+ lwz r8,-12(r4)
+ lwzu r9,-16(r4)
+ stw r6,-4(r3)
+ stw r7,-8(r3)
+ stw r8,-12(r3)
+ stwu r9,-16(r3)
+ bdnz L(word4_count_loop_no_dcbt)
+
+L(end_memcpy):
+ mr r3,r0
+ blr
+END (BP_SYM (memcpy))
+libc_hidden_builtin_def (memcpy)
diff --git a/sysdeps/powerpc/powerpc32/405/memset.S b/sysdeps/powerpc/powerpc32/405/memset.S
new file mode 100644
index 0000000..69d5d4c
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/memset.S
@@ -0,0 +1,155 @@
+/* Optimized memset implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* memset
+
+ r3:destination address and return address
+ r4:source integer to copy
+ r5:byte count
+ r11:sources integer to copy in all 32 bits of reg
+ r12:temp return address
+
+ Save return address in r12
+ If destinationn is unaligned and count is greater tha 255 bytes
+ set 0-3 bytes to make destination aligned
+ If count is greater tha 255 bytes and setting zero to memory
+ use dbcz to set memeory when we can
+ otherwsie do the follwoing
+ If 16 or more words to set we use 16 word copy loop.
+ Finaly we set 0-15 extra bytes with string store. */
+
+EALIGN (BP_SYM (memset), 5, 0)
+ rlwinm r11,r4,0,24,31
+ rlwimi r11,r4,8,16,23
+ rlwimi r11,r11,16,0,15
+ addi r12,r3,0
+ cmpwi r5,0x00FF
+ ble L(preword8_count_loop)
+ cmpwi r4,0x00
+ beq L(use_dcbz)
+ neg r6,r3
+ clrlwi. r6,r6,30
+ beq L(preword8_count_loop)
+ addi r8,0,1
+ mtctr r6
+ subi r3,r3,1
+
+L(unaligned_bytecopy_loop):
+ stbu r11,0x1(r3)
+ subf. r5,r8,r5
+ beq L(end_memset)
+ bdnz L(unaligned_bytecopy_loop)
+ addi r3,r3,1
+
+L(preword8_count_loop):
+ srwi. r6,r5,4
+ beq L(preword2_count_loop)
+ mtctr r6
+ addi r3,r3,-4
+ mr r8,r11
+ mr r9,r11
+ mr r10,r11
+
+L(word8_count_loop_no_dcbt):
+ stwu r8,4(r3)
+ stwu r9,4(r3)
+ subi r5,r5,0x10
+ stwu r10,4(r3)
+ stwu r11,4(r3)
+ bdnz L(word8_count_loop_no_dcbt)
+ addi r3,r3,4
+
+L(preword2_count_loop):
+ clrlwi. r7,r5,28
+ beq L(end_memset)
+ mr r8,r11
+ mr r9,r11
+ mr r10,r11
+ mtxer r7
+ stswx r8,0,r3
+
+L(end_memset):
+ addi r3,r12,0
+ blr
+
+L(use_dcbz):
+ neg r6,r3
+ clrlwi. r7,r6,28
+ beq L(skip_string_loop)
+ mr r8,r11
+ mr r9,r11
+ mr r10,r11
+ subf r5,r7,r5
+ mtxer r7
+ stswx r8,0,r3
+ add r3,r3,r7
+
+L(skip_string_loop):
+ clrlwi r8,r6,25
+ srwi. r8,r8,4
+ beq L(dcbz_pre_loop)
+ mtctr r8
+
+L(word_loop):
+ stw r11,0(r3)
+ subi r5,r5,0x10
+ stw r11,4(r3)
+ stw r11,8(r3)
+ stw r11,12(r3)
+ addi r3,r3,0x10
+ bdnz L(word_loop)
+
+L(dcbz_pre_loop):
+ srwi r6,r5,7
+ mtctr r6
+ addi r7,0,0
+
+L(dcbz_loop):
+ dcbz r3,r7
+ addi r3,r3,0x80
+ subi r5,r5,0x80
+ bdnz L(dcbz_loop)
+ srwi. r6,r5,4
+ beq L(postword2_count_loop)
+ mtctr r6
+
+L(postword8_count_loop):
+ stw r11,0(r3)
+ subi r5,r5,0x10
+ stw r11,4(r3)
+ stw r11,8(r3)
+ stw r11,12(r3)
+ addi r3,r3,0x10
+ bdnz L(postword8_count_loop)
+
+L(postword2_count_loop):
+ clrlwi. r7,r5,28
+ beq L(end_memset)
+ mr r8,r11
+ mr r9,r11
+ mr r10,r11
+ mtxer r7
+ stswx r8,0,r3
+ b L(end_memset)
+END (BP_SYM (memset))
+libc_hidden_builtin_def (memset)
diff --git a/sysdeps/powerpc/powerpc32/405/strcmp.S b/sysdeps/powerpc/powerpc32/405/strcmp.S
new file mode 100644
index 0000000..6eb5b5a
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strcmp.S
@@ -0,0 +1,137 @@
+/* Optimized strcmp implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strcmp
+
+ Register Use
+ r0:temp return equality
+ r3:source1 address, return equality
+ r4:source2 address
+
+ Implementation description
+ Check 2 words from src1 and src2. If unequal jump to end and
+ return src1 > src2 or src1 < src2.
+ If null check bytes before null and then jump to end and
+ return src1 > src2, src1 < src2 or src1 = src2.
+ If src1 = src2 and no null, repeat. */
+
+EALIGN (BP_SYM(strcmp),5,0)
+ neg r7,r3
+ clrlwi r7,r7,20
+ neg r8,r4
+ clrlwi r8,r8,20
+ srwi. r7,r7,5
+ beq L(byte_loop)
+ srwi. r8,r8,5
+ beq L(byte_loop)
+ cmplw r7,r8
+ mtctr r7
+ ble L(big_loop)
+ mtctr r8
+
+L(big_loop):
+ lwz r5,0(r3)
+ lwz r6,4(r3)
+ lwz r8,0(r4)
+ lwz r9,4(r4)
+ dlmzb. r12,r5,r6
+ bne L(end_check)
+ cmplw r5,r8
+ bne L(st1)
+ cmplw r6,r9
+ bne L(st1)
+ lwz r5,8(r3)
+ lwz r6,12(r3)
+ lwz r8,8(r4)
+ lwz r9,12(r4)
+ dlmzb. r12,r5,r6
+ bne L(end_check)
+ cmplw r5,r8
+ bne L(st1)
+ cmplw r6,r9
+ bne L(st1)
+ lwz r5,16(r3)
+ lwz r6,20(r3)
+ lwz r8,16(r4)
+ lwz r9,20(r4)
+ dlmzb. r12,r5,r6
+ bne L(end_check)
+ cmplw r5,r8
+ bne L(st1)
+ cmplw r6,r9
+ bne L(st1)
+ lwz r5,24(r3)
+ lwz r6,28(r3)
+ addi r3,r3,0x20
+ lwz r8,24(r4)
+ lwz r9,28(r4)
+ addi r4,r4,0x20
+ dlmzb. r12,r5,r6
+ bne L(end_check)
+ cmplw r5,r8
+ bne L(st1)
+ cmplw r6,r9
+ bne L(st1)
+ bdnz L(big_loop)
+ b L(byte_loop)
+
+L(end_check):
+ subfic r12,r12,4
+ blt L(end_check2)
+ rlwinm r12,r12,3,0,31
+ srw r5,r5,r12
+ srw r8,r8,r12
+ cmplw r5,r8
+ bne L(st1)
+ b L(end_strcmp)
+
+L(end_check2):
+ addi r12,r12,4
+ cmplw r5,r8
+ rlwinm r12,r12,3,0,31
+ bne L(st1)
+ srw r6,r6,r12
+ srw r9,r9,r12
+ cmplw r6,r9
+ bne L(st1)
+
+L(end_strcmp):
+ addi r3,r0,0
+ blr
+
+L(st1):
+ mfcr r3
+ blr
+
+L(byte_loop):
+ lbz r5,0(r3)
+ addi r3,r3,1
+ lbz r6,0(r4)
+ addi r4,r4,1
+ cmplw r5,r6
+ bne L(st1)
+ cmpwi r5,0
+ beq L(end_strcmp)
+ b L(byte_loop)
+END (BP_SYM (strcmp))
+libc_hidden_builtin_def (strcmp)
diff --git a/sysdeps/powerpc/powerpc32/405/strcpy.S b/sysdeps/powerpc/powerpc32/405/strcpy.S
new file mode 100644
index 0000000..025ac16
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strcpy.S
@@ -0,0 +1,110 @@
+/* Optimized strcpy implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strcpy
+
+ Register Use
+ r3:destination and return address
+ r4:source address
+ r10:temp destination address
+
+ Implementation description
+ Loop by checking 2 words at a time, with dlmzb. Check if there is a null
+ in the 2 words. If there is a null jump to end checking to determine
+ where in the last 8 bytes it is. Copy the appropriate bytes of the last
+ 8 according to the null position. */
+
+EALIGN (BP_SYM (strcpy), 5, 0)
+ neg r7,r4
+ subi r4,r4,1
+ clrlwi. r8,r7,29
+ subi r10,r3,1
+ beq L(pre_word8_loop)
+ mtctr r8
+
+L(loop):
+ lbzu r5,0x01(r4)
+ cmpi cr5,r5,0x0
+ stbu r5,0x01(r10)
+ beq cr5,L(end_strcpy)
+ bdnz L(loop)
+
+L(pre_word8_loop):
+ subi r4,r4,3
+ subi r10,r10,3
+
+L(word8_loop):
+ lwzu r5,0x04(r4)
+ lwzu r6,0x04(r4)
+ dlmzb. r11,r5,r6
+ bne L(byte_copy)
+ stwu r5,0x04(r10)
+ stwu r6,0x04(r10)
+ lwzu r5,0x04(r4)
+ lwzu r6,0x04(r4)
+ dlmzb. r11,r5,r6
+ bne L(byte_copy)
+ stwu r5,0x04(r10)
+ stwu r6,0x04(r10)
+ lwzu r5,0x04(r4)
+ lwzu r6,0x04(r4)
+ dlmzb. r11,r5,r6
+ bne L(byte_copy)
+ stwu r5,0x04(r10)
+ stwu r6,0x04(r10)
+ lwzu r5,0x04(r4)
+ lwzu r6,0x04(r4)
+ dlmzb. r11,r5,r6
+ bne L(byte_copy)
+ stwu r5,0x04(r10)
+ stwu r6,0x04(r10)
+ b L(word8_loop)
+
+L(last_bytes_copy):
+ stwu r5,0x04(r10)
+ subi r11,r11,4
+ mtctr r11
+ addi r10,r10,3
+ subi r4,r4,1
+
+L(last_bytes_copy_loop):
+ lbzu r5,0x01(r4)
+ stbu r5,0x01(r10)
+ bdnz L(last_bytes_copy_loop)
+ blr
+
+L(byte_copy):
+ blt L(last_bytes_copy)
+ mtctr r11
+ addi r10,r10,3
+ subi r4,r4,5
+
+L(last_bytes_copy_loop2):
+ lbzu r5,0x01(r4)
+ stbu r5,0x01(r10)
+ bdnz L(last_bytes_copy_loop2)
+
+L(end_strcpy):
+ blr
+END (BP_SYM (strcpy))
+libc_hidden_builtin_def (strcpy)
diff --git a/sysdeps/powerpc/powerpc32/405/strlen.S b/sysdeps/powerpc/powerpc32/405/strlen.S
new file mode 100644
index 0000000..146b582
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strlen.S
@@ -0,0 +1,78 @@
+/* Optimized strlen implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strlen
+
+ Register Use
+ r3:source address and return length of string
+ r4:byte counter
+
+ Implementation description
+ Load 2 words at a time and count bytes, if we find null we subtract one from
+ the count and return the count value. We need to subtract one because
+ we don't count the null character as a byte. */
+
+EALIGN (BP_SYM (strlen),5,0)
+ neg r7,r3
+ clrlwi. r8,r7,29
+ addi r4,0,0
+ beq L(byte_count_loop)
+ mtctr r8
+
+L(loop):
+ lbz r5,0(r3)
+ cmpi cr5,r5,0x0
+ addi r3,r3,0x1
+ addi r4,r4,0x1
+ beq cr5,L(end_strlen)
+ bdnz L(loop)
+
+L(byte_count_loop):
+ lwz r5,0(r3)
+ lwz r6,4(r3)
+ dlmzb. r12,r5,r6
+ add r4,r4,r12
+ bne L(end_strlen)
+ lwz r5,8(r3)
+ lwz r6,12(r3)
+ dlmzb. r12,r5,r6
+ add r4,r4,r12
+ bne L(end_strlen)
+ lwz r5,16(r3)
+ lwz r6,20(r3)
+ dlmzb. r12,r5,r6
+ add r4,r4,r12
+ bne L(end_strlen)
+ lwz r5,24(r3)
+ lwz r6,28(r3)
+ addi r3,r3,0x20
+ dlmzb. r12,r5,r6
+ add r4,r4,r12
+ bne L(end_strlen)
+ b L(byte_count_loop)
+
+L(end_strlen):
+ addi r3,r4,-1
+ blr
+END (BP_SYM (strlen))
+libc_hidden_builtin_def (strlen)
diff --git a/sysdeps/powerpc/powerpc32/405/strncmp.S b/sysdeps/powerpc/powerpc32/405/strncmp.S
new file mode 100644
index 0000000..c1beb23
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/405/strncmp.S
@@ -0,0 +1,131 @@
+/* Optimized strncmp implementation for PowerPC476.
+ Copyright (C) 2010 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
+ 02110-1301 USA. */
+
+#include <sysdep.h>
+#include <bp-sym.h>
+#include <bp-asm.h>
+
+/* strncmp
+
+ Register Use
+ r0:temp return equality
+ r3:source1 address, return equality
+ r4:source2 address
+ r5:byte count
+
+ Implementation description
+ Touch in 3 lines of D-cache.
+ If source1 or source2 is unaligned copy 0-3 bytes to make source1 aligned
+ Check 2 words from src1 and src2. If unequal jump to end and
+ return src1 > src2 or src1 < src2.
+ If null check bytes before null and then jump to end and
+ return src1 > src2, src1 < src2 or src1 = src2.
+ If count = zero check bytes before zero counter and then jump to end and
+ return src1 > src2, src1 < src2 or src1 = src2.
+ If src1 = src2 and no null, repeat. */
+
+EALIGN (BP_SYM(strncmp),5,0)
+ neg r7,r3
+ clrlwi r7,r7,20
+ neg r8,r4
+ clrlwi r8,r8,20
+ srwi. r7,r7,3
+ beq L(prebyte_count_loop)
+ srwi. r8,r8,3
+ beq L(prebyte_count_loop)
+ cmplw r7,r8
+ mtctr r7
+ ble L(preword2_count_loop)
+ mtctr r8
+
+L(preword2_count_loop):
+ srwi. r6,r5,3
+ beq L(prebyte_count_loop)
+ mfctr r7
+ cmplw r6,r7
+ bgt L(set_count_loop)
+ mtctr r6
+ clrlwi r5,r5,29
+
+L(word2_count_loop):
+ lwz r10,0(r3)
+ lwz r6,4(r3)
+ addi r3,r3,0x08
+ lwz r8,0(r4)
+ lwz r9,4(r4)
+ addi r4,r4,0x08
+ dlmzb. r12,r10,r6
+ bne L(end_check)
+ cmplw r10,r8
+ bne L(st1)
+ cmplw r6,r9
+ bne L(st1)
+ bdnz L(word2_count_loop)
+
+L(prebyte_count_loop):
+ addi r5,r5,1
+ mtctr r5
+ bdz L(end_strncmp)
+
+L(byte_count_loop):
+ lbz r6,0(r3)
+ addi r3,r3,1
+ lbz r7,0(r4)
+ addi r4,r4,1
+ cmplw r6,r7
+ bne L(st1)
+ cmpwi r6,0
+ beq L(end_strncmp)
+ bdnz L(byte_count_loop)
+ b L(end_strncmp)
+
+L(set_count_loop):
+ slwi r7,r7,3
+ subf r5,r7,r5
+ b L(word2_count_loop)
+
+L(end_check):
+ subfic r12,r12,4
+ blt L(end_check2)
+ rlwinm r12,r12,3,0,31
+ srw r10,r10,r12
+ srw r8,r8,r12
+ cmplw r10,r8
+ bne L(st1)
+ b L(end_strncmp)
+
+L(end_check2):
+ addi r12,r12,4
+ cmplw r10,r8
+ rlwinm r12,r12,3,0,31
+ bne L(st1)
+ srw r6,r6,r12
+ srw r9,r9,r12
+ cmplw r6,r9
+ bne L(st1)
+
+L(end_strncmp):
+ addi r3,r0,0
+ blr
+
+L(st1):
+ mfcr r3
+ blr
+END (BP_SYM (strncmp))
+libc_hidden_builtin_def (strncmp)
diff --git a/sysdeps/powerpc/powerpc32/440/Implies b/sysdeps/powerpc/powerpc32/440/Implies
new file mode 100644
index 0000000..70c0d2e
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/440/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/405/fpu
+powerpc/powerpc32/405
diff --git a/sysdeps/powerpc/powerpc32/464/Implies b/sysdeps/powerpc/powerpc32/464/Implies
new file mode 100644
index 0000000..c3e52c5
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/464/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/440/fpu
+powerpc/powerpc32/440
diff --git a/sysdeps/powerpc/powerpc32/476/Implies b/sysdeps/powerpc/powerpc32/476/Implies
new file mode 100644
index 0000000..2829f9c
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/476/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/464/fpu
+powerpc/powerpc32/464
diff --git a/sysdeps/powerpc/powerpc32/Makefile b/sysdeps/powerpc/powerpc32/Makefile
new file mode 100644
index 0000000..3d235de
--- /dev/null
+++ b/sysdeps/powerpc/powerpc32/Makefile
@@ -0,0 +1,8 @@
+# Some Powerpc32 variants assume soft-fp is the default even though there is
+# an fp variant so provide -mhard-float if --with-fp is explicitly passed.
+
+ifeq ($(with-fp),yes)
++cflags += -mhard-float
+ASFLAGS += -mhard-float
+sysdep-LDFLAGS += -mhard-float
+endif
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
new file mode 100644
index 0000000..70c0d2e
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/405/fpu
+powerpc/powerpc32/405
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
new file mode 100644
index 0000000..c3e52c5
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/440/fpu
+powerpc/powerpc32/440
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
new file mode 100644
index 0000000..2829f9c
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/464/fpu
+powerpc/powerpc32/464
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
new file mode 100644
index 0000000..80f9170
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
@@ -0,0 +1,2 @@
+powerpc/powerpc32/476/fpu
+powerpc/powerpc32/476
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2010-09-03 15:00 ` Luis Machado
@ 2010-10-04 18:54 ` Luis Machado
2010-12-13 20:26 ` Ryan Arnold
0 siblings, 1 reply; 19+ messages in thread
From: Luis Machado @ 2010-10-04 18:54 UTC (permalink / raw)
To: Ryan Arnold; +Cc: libc-ports, rsa, Todd Iglehart, Josh Boyer
Ping?
On Fri, 2010-09-03 at 12:00 -0300, Luis Machado wrote:
> > Since Todd doesn't have copyright assignment these changes are
> > contributed to the FSF by IBM without author/contributor attribution.
> >
> > You can simply attribute the changes to him in the email leaving his
> > name out of the sources per FSF policy and submit them on IBM's
> > behalf.
> >
> > Ryan
>
> Thanks.
>
> Follows the updated patch without Todd's name on the sources.
>
> Luis
>
>
> 2010-09-03 Luis Machado <luisgpm@br.ibm.com>
>
> * sysdeps/powerpc/dl-procinfo.c: New file.
> * sysdeps/powerpc/dl-procinfo.h: New file.
> * sysdeps/powerpc/powerpc32/405/memcmp.S: New file.
> * sysdeps/powerpc/powerpc32/405/memcpy.S: New file.
> * sysdeps/powerpc/powerpc32/405/memset.S: New file.
> * sysdeps/powerpc/powerpc32/405/strcmp.S: New file.
> * sysdeps/powerpc/powerpc32/405/strcpy.S: New file.
> * sysdeps/powerpc/powerpc32/405/strlen.S: New file.
> * sysdeps/powerpc/powerpc32/405/strncmp.S: New file.
> * sysdeps/powerpc/powerpc32/440/Implies: New file.
> * sysdeps/powerpc/powerpc32/464/Implies: New file.
> * sysdeps/powerpc/powerpc32/476/Implies: New file.
> * sysdeps/powerpc/powerpc32/Makefile: New file.
> * sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies: New file.
> * sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies: New file.
> * sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies: New file.
> * sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies: New file.
>
> diff --git a/sysdeps/powerpc/dl-procinfo.c b/sysdeps/powerpc/dl-procinfo.c
> new file mode 100644
> index 0000000..60fb465
> --- /dev/null
> +++ b/sysdeps/powerpc/dl-procinfo.c
> @@ -0,0 +1,96 @@
> +/* Data for processor capability information. PowerPC version.
> + Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, write to the Free
> + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
> + 02111-1307 USA. */
> +
> +/* This information must be kept in sync with the _DL_HWCAP_COUNT and
> + _DL_PLATFORM_COUNT definitions in procinfo.h.
> +
> + If anything should be added here check whether the size of each string
> + is still ok with the given array size.
> +
> + All the #ifdefs in the definitions are quite irritating but
> + necessary if we want to avoid duplicating the information. There
> + are three different modes:
> +
> + - PROCINFO_DECL is defined. This means we are only interested in
> + declarations.
> +
> + - PROCINFO_DECL is not defined:
> +
> + + if SHARED is defined the file is included in an array
> + initializer. The .element = { ... } syntax is needed.
> +
> + + if SHARED is not defined a normal array initialization is
> + needed.
> + */
> +
> +#ifndef PROCINFO_CLASS
> +# define PROCINFO_CLASS
> +#endif
> +
> +#if !defined PROCINFO_DECL && defined SHARED
> + ._dl_powerpc_cap_flags
> +#else
> +PROCINFO_CLASS const char _dl_powerpc_cap_flags[25][10]
> +#endif
> +#ifndef PROCINFO_DECL
> += {
> + "vsx",
> + "arch_2_06", "power6x", "dfp", "pa6t",
> + "arch_2_05", "ic_snoop", "smt", "booke",
> + "cellbe", "power5+", "power5", "power4",
> + "notb", "efpdouble", "efpsingle", "spe",
> + "ucache", "4xxmac", "mmu", "fpu",
> + "altivec", "ppc601", "ppc64", "ppc32",
> + }
> +#endif
> +#if !defined SHARED || defined PROCINFO_DECL
> +;
> +#else
> +,
> +#endif
> +
> +#if !defined PROCINFO_DECL && defined SHARED
> + ._dl_powerpc_platforms
> +#else
> +PROCINFO_CLASS const char _dl_powerpc_platforms[12][12]
> +#endif
> +#ifndef PROCINFO_DECL
> += {
> + [PPC_PLATFORM_POWER4] = "power4",
> + [PPC_PLATFORM_PPC970] = "ppc970",
> + [PPC_PLATFORM_POWER5] = "power5",
> + [PPC_PLATFORM_POWER5_PLUS] = "power5+",
> + [PPC_PLATFORM_POWER6] = "power6",
> + [PPC_PLATFORM_CELL_BE] = "ppc-cell-be",
> + [PPC_PLATFORM_POWER6X] = "power6x",
> + [PPC_PLATFORM_POWER7] = "power7",
> + [PPC_PLATFORM_PPC405] = "ppc405",
> + [PPC_PLATFORM_PPC440] = "ppc440",
> + [PPC_PLATFORM_PPC464] = "ppc464",
> + [PPC_PLATFORM_PPC476] = "ppc476"
> + }
> +#endif
> +#if !defined SHARED || defined PROCINFO_DECL
> +;
> +#else
> +,
> +#endif
> +
> +#undef PROCINFO_DECL
> +#undef PROCINFO_CLASS
> diff --git a/sysdeps/powerpc/dl-procinfo.h b/sysdeps/powerpc/dl-procinfo.h
> new file mode 100644
> index 0000000..87279de
> --- /dev/null
> +++ b/sysdeps/powerpc/dl-procinfo.h
> @@ -0,0 +1,168 @@
> +/* Processor capability information handling macros. PowerPC version.
> + Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, write to the Free
> + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
> + 02111-1307 USA. */
> +
> +#ifndef _DL_PROCINFO_H
> +#define _DL_PROCINFO_H 1
> +
> +#include <ldsodefs.h>
> +#include <sysdep.h> /* This defines the PPC_FEATURE_* macros. */
> +
> +/* There are 25 bits used, but they are bits 7..31. */
> +#define _DL_HWCAP_FIRST 7
> +#define _DL_HWCAP_COUNT 32
> +
> +/* These bits influence library search. */
> +#define HWCAP_IMPORTANT (PPC_FEATURE_HAS_ALTIVEC \
> + + PPC_FEATURE_HAS_DFP)
> +
> +#define _DL_PLATFORMS_COUNT 12
> +
> +#define _DL_FIRST_PLATFORM 32
> +/* Mask to filter out platforms. */
> +#define _DL_HWCAP_PLATFORM (((1ULL << _DL_PLATFORMS_COUNT) - 1) \
> + << _DL_FIRST_PLATFORM)
> +
> +/* Platform bits (relative to _DL_FIRST_PLATFORM). */
> +#define PPC_PLATFORM_POWER4 0
> +#define PPC_PLATFORM_PPC970 1
> +#define PPC_PLATFORM_POWER5 2
> +#define PPC_PLATFORM_POWER5_PLUS 3
> +#define PPC_PLATFORM_POWER6 4
> +#define PPC_PLATFORM_CELL_BE 5
> +#define PPC_PLATFORM_POWER6X 6
> +#define PPC_PLATFORM_POWER7 7
> +#define PPC_PLATFORM_PPC405 8
> +#define PPC_PLATFORM_PPC440 9
> +#define PPC_PLATFORM_PPC464 10
> +#define PPC_PLATFORM_PPC476 11
> +
> +static inline const char *
> +__attribute__ ((unused))
> +_dl_hwcap_string (int idx)
> +{
> + return GLRO(dl_powerpc_cap_flags)[idx - _DL_HWCAP_FIRST];
> +}
> +
> +static inline const char *
> +__attribute__ ((unused))
> +_dl_platform_string (int idx)
> +{
> + return GLRO(dl_powerpc_platforms)[idx - _DL_FIRST_PLATFORM];
> +}
> +
> +static inline int
> +__attribute__ ((unused))
> +_dl_string_hwcap (const char *str)
> +{
> + for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
> + if (strcmp (str, _dl_hwcap_string (i)) == 0)
> + return i;
> + return -1;
> +}
> +
> +static inline int
> +__attribute__ ((unused, always_inline))
> +_dl_string_platform (const char *str)
> +{
> + if (str == NULL)
> + return -1;
> +
> + if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_POWER4], 5) == 0)
> + {
> + int ret;
> + str += 5;
> + switch (*str)
> + {
> + case '4':
> + ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER4;
> + break;
> + case '5':
> + ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5;
> + if (str[1] == '+')
> + {
> + ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5_PLUS;
> + ++str;
> + }
> + break;
> + case '6':
> + ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6;
> + if (str[1] == 'x')
> + {
> + ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6X;
> + ++str;
> + }
> + break;
> + case '7':
> + ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER7;
> + break;
> + default:
> + return -1;
> + }
> + if (str[1] == '\0')
> + return ret;
> + }
> + else if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970],
> + 3) == 0)
> + {
> + if (strcmp (str + 3, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970]
> + + 3) == 0)
> + return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC970;
> + else if (strcmp (str + 3,
> + GLRO(dl_powerpc_platforms)[PPC_PLATFORM_CELL_BE] + 3)
> + == 0)
> + return _DL_FIRST_PLATFORM + PPC_PLATFORM_CELL_BE;
> + else if (strcmp (str + 3,
> + GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC405] + 3)
> + == 0)
> + return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC405;
> + else if (strcmp (str + 3,
> + GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC440] + 3)
> + == 0)
> + return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC440;
> + else if (strcmp (str + 3,
> + GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC464] + 3)
> + == 0)
> + return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC464;
> + else if (strcmp (str + 3,
> + GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC476] + 3)
> + == 0)
> + return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC476;
> + }
> +
> + return -1;
> +}
> +
> +#ifdef IS_IN_rtld
> +static inline int
> +__attribute__ ((unused))
> +_dl_procinfo (int word)
> +{
> + _dl_printf ("AT_HWCAP: ");
> +
> + for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
> + if (word & (1 << i))
> + _dl_printf (" %s", _dl_hwcap_string (i));
> +
> + _dl_printf ("\n");
> +
> + return 0;
> +}
> +#endif
> +
> +#endif /* dl-procinfo.h */
> diff --git a/sysdeps/powerpc/powerpc32/405/memcmp.S b/sysdeps/powerpc/powerpc32/405/memcmp.S
> new file mode 100644
> index 0000000..653d3b5
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/memcmp.S
> @@ -0,0 +1,131 @@
> +/* Optimized memcmp implementation for PowerPC476.
> + Copyright (C) 2010 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, write to the Free
> + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> + 02110-1301 USA. */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* memcmp
> +
> + r3:source1 address, return equality
> + r4:source2 address
> + r5:byte count
> +
> + Check 2 words from src1 and src2. If unequal jump to end and
> + return src1 > src2 or src1 < src2.
> + If count = zero check bytes before zero counter and then jump to end and
> + return src1 > src2, src1 < src2 or src1 = src2.
> + If src1 = src2 and no null, repeat. */
> +
> +EALIGN (BP_SYM (memcmp), 5, 0)
> + srwi. r6,r5,5
> + beq L(preword2_count_loop)
> + mtctr r6
> + clrlwi r5,r5,27
> +
> +L(word8_compare_loop):
> + lwz r10,0(r3)
> + lwz r6,4(r3)
> + lwz r8,0(r4)
> + lwz r9,4(r4)
> + cmplw cr5,r8,r10
> + cmplw cr1,r9,r6
> + bne cr5,L(st2)
> + bne cr1,L(st1)
> + lwz r10,8(r3)
> + lwz r6,12(r3)
> + lwz r8,8(r4)
> + lwz r9,12(r4)
> + cmplw cr5,r8,r10
> + cmplw cr1,r9,r6
> + bne cr5,L(st2)
> + bne cr1,L(st1)
> + lwz r10,16(r3)
> + lwz r6,20(r3)
> + lwz r8,16(r4)
> + lwz r9,20(r4)
> + cmplw cr5,r8,r10
> + cmplw cr1,r9,r6
> + bne cr5,L(st2)
> + bne cr1,L(st1)
> + lwz r10,24(r3)
> + lwz r6,28(r3)
> + addi r3,r3,0x20
> + lwz r8,24(r4)
> + lwz r9,28(r4)
> + addi r4,r4,0x20
> + cmplw cr5,r8,r10
> + cmplw cr1,r9,r6
> + bne cr5,L(st2)
> + bne cr1,L(st1)
> + bdnz L(word8_compare_loop)
> +
> +L(preword2_count_loop):
> + srwi. r6,r5,3
> + beq L(prebyte_count_loop)
> + mtctr r6
> + clrlwi r5,r5,29
> +
> +L(word2_count_loop):
> + lwz r10,0(r3)
> + lwz r6,4(r3)
> + addi r3,r3,0x08
> + lwz r8,0(r4)
> + lwz r9,4(r4)
> + addi r4,r4,0x08
> + cmplw cr5,r8,r10
> + cmplw cr1,r9,r6
> + bne cr5,L(st2)
> + bne cr1,L(st1)
> + bdnz L(word2_count_loop)
> +
> +L(prebyte_count_loop):
> + addi r5,r5,1
> + mtctr r5
> + bdz L(end_memcmp)
> +
> +L(byte_count_loop):
> + lbz r6,0(r3)
> + addi r3,r3,0x01
> + lbz r8,0(r4)
> + addi r4,r4,0x01
> + cmplw cr5,r8,r6
> + bne cr5,L(st2)
> + bdnz L(byte_count_loop)
> +
> +L(end_memcmp):
> + addi r3,r0,0
> + blr
> +
> +L(l_r):
> + addi r3,r0,1
> + blr
> +
> +L(st1):
> + blt cr1,L(l_r)
> + addi r3,r0,-1
> + blr
> +
> +L(st2):
> + blt cr5,L(l_r)
> + addi r3,r0,-1
> + blr
> +END (BP_SYM (memcmp))
> +libc_hidden_builtin_def (memcmp)
> +weak_alias (memcmp,bcmp)
> diff --git a/sysdeps/powerpc/powerpc32/405/memcpy.S b/sysdeps/powerpc/powerpc32/405/memcpy.S
> new file mode 100644
> index 0000000..a654c73
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/memcpy.S
> @@ -0,0 +1,133 @@
> +/* Optimized memcpy implementation for PowerPC476.
> + Copyright (C) 2010 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, write to the Free
> + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> + 02110-1301 USA. */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* memcpy
> +
> + r0:return address
> + r3:destination address
> + r4:source address
> + r5:byte count
> +
> + Save return address in r0.
> + If destinationn and source are unaligned and copy count is greater than 256
> + then copy 0-3 bytes to make destination aligned.
> + If 32 or more bytes to copy we use 32 byte copy loop.
> + Finaly we copy 0-31 extra bytes. */
> +
> +EALIGN (BP_SYM (memcpy), 5, 0)
> +/* Check if bytes to copy are greater than 256 and if
> + source and destination are unaligned */
> + cmpwi r5,0x0100
> + addi r0,r3,0
> + ble L(string_count_loop)
> + neg r6,r3
> + clrlwi. r6,r6,30
> + beq L(string_count_loop)
> + neg r6,r4
> + clrlwi. r6,r6,30
> + beq L(string_count_loop)
> + mtctr r6
> + subf r5,r6,r5
> +
> +L(unaligned_bytecopy_loop): /* Align destination by coping 0-3 bytes */
> + lbz r8,0x0(r4)
> + addi r4,r4,1
> + stb r8,0x0(r3)
> + addi r3,r3,1
> + bdnz L(unaligned_bytecopy_loop)
> + srwi. r7,r5,5
> + beq L(preword2_count_loop)
> + mtctr r7
> +
> +L(word8_count_loop_no_dcbt): /* Copy 32 bytes at a time */
> + lwz r6,0(r4)
> + lwz r7,4(r4)
> + lwz r8,8(r4)
> + lwz r9,12(r4)
> + subi r5,r5,0x20
> + stw r6,0(r3)
> + stw r7,4(r3)
> + stw r8,8(r3)
> + stw r9,12(r3)
> + lwz r6,16(r4)
> + lwz r7,20(r4)
> + lwz r8,24(r4)
> + lwz r9,28(r4)
> + addi r4,r4,0x20
> + stw r6,16(r3)
> + stw r7,20(r3)
> + stw r8,24(r3)
> + stw r9,28(r3)
> + addi r3,r3,0x20
> + bdnz L(word8_count_loop_no_dcbt)
> +
> +L(preword2_count_loop): /* Copy remaining 0-31 bytes */
> + clrlwi. r12,r5,27
> + beq L(end_memcpy)
> + mtxer r12
> + lswx r5,0,r4
> + stswx r5,0,r3
> + mr r3,r0
> + blr
> +
> +L(string_count_loop): /* Copy odd 0-31 bytes */
> + clrlwi. r12,r5,28
> + add r3,r3,r5
> + add r4,r4,r5
> + beq L(pre_string_copy)
> + mtxer r12
> + subf r4,r12,r4
> + subf r3,r12,r3
> + lswx r6,0,r4
> + stswx r6,0,r3
> +
> +L(pre_string_copy): /* Check how many 32 byte chunck to copy */
> + srwi. r7,r5,4
> + beq L(end_memcpy)
> + mtctr r7
> +
> +L(word4_count_loop_no_dcbt): /* Copy 32 bytes at a time */
> + lwz r6,-4(r4)
> + lwz r7,-8(r4)
> + lwz r8,-12(r4)
> + lwzu r9,-16(r4)
> + stw r6,-4(r3)
> + stw r7,-8(r3)
> + stw r8,-12(r3)
> + stwu r9,-16(r3)
> + bdz L(end_memcpy)
> + lwz r6,-4(r4)
> + lwz r7,-8(r4)
> + lwz r8,-12(r4)
> + lwzu r9,-16(r4)
> + stw r6,-4(r3)
> + stw r7,-8(r3)
> + stw r8,-12(r3)
> + stwu r9,-16(r3)
> + bdnz L(word4_count_loop_no_dcbt)
> +
> +L(end_memcpy):
> + mr r3,r0
> + blr
> +END (BP_SYM (memcpy))
> +libc_hidden_builtin_def (memcpy)
> diff --git a/sysdeps/powerpc/powerpc32/405/memset.S b/sysdeps/powerpc/powerpc32/405/memset.S
> new file mode 100644
> index 0000000..69d5d4c
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/memset.S
> @@ -0,0 +1,155 @@
> +/* Optimized memset implementation for PowerPC476.
> + Copyright (C) 2010 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, write to the Free
> + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> + 02110-1301 USA. */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* memset
> +
> + r3:destination address and return address
> + r4:source integer to copy
> + r5:byte count
> + r11:sources integer to copy in all 32 bits of reg
> + r12:temp return address
> +
> + Save return address in r12
> + If destinationn is unaligned and count is greater tha 255 bytes
> + set 0-3 bytes to make destination aligned
> + If count is greater tha 255 bytes and setting zero to memory
> + use dbcz to set memeory when we can
> + otherwsie do the follwoing
> + If 16 or more words to set we use 16 word copy loop.
> + Finaly we set 0-15 extra bytes with string store. */
> +
> +EALIGN (BP_SYM (memset), 5, 0)
> + rlwinm r11,r4,0,24,31
> + rlwimi r11,r4,8,16,23
> + rlwimi r11,r11,16,0,15
> + addi r12,r3,0
> + cmpwi r5,0x00FF
> + ble L(preword8_count_loop)
> + cmpwi r4,0x00
> + beq L(use_dcbz)
> + neg r6,r3
> + clrlwi. r6,r6,30
> + beq L(preword8_count_loop)
> + addi r8,0,1
> + mtctr r6
> + subi r3,r3,1
> +
> +L(unaligned_bytecopy_loop):
> + stbu r11,0x1(r3)
> + subf. r5,r8,r5
> + beq L(end_memset)
> + bdnz L(unaligned_bytecopy_loop)
> + addi r3,r3,1
> +
> +L(preword8_count_loop):
> + srwi. r6,r5,4
> + beq L(preword2_count_loop)
> + mtctr r6
> + addi r3,r3,-4
> + mr r8,r11
> + mr r9,r11
> + mr r10,r11
> +
> +L(word8_count_loop_no_dcbt):
> + stwu r8,4(r3)
> + stwu r9,4(r3)
> + subi r5,r5,0x10
> + stwu r10,4(r3)
> + stwu r11,4(r3)
> + bdnz L(word8_count_loop_no_dcbt)
> + addi r3,r3,4
> +
> +L(preword2_count_loop):
> + clrlwi. r7,r5,28
> + beq L(end_memset)
> + mr r8,r11
> + mr r9,r11
> + mr r10,r11
> + mtxer r7
> + stswx r8,0,r3
> +
> +L(end_memset):
> + addi r3,r12,0
> + blr
> +
> +L(use_dcbz):
> + neg r6,r3
> + clrlwi. r7,r6,28
> + beq L(skip_string_loop)
> + mr r8,r11
> + mr r9,r11
> + mr r10,r11
> + subf r5,r7,r5
> + mtxer r7
> + stswx r8,0,r3
> + add r3,r3,r7
> +
> +L(skip_string_loop):
> + clrlwi r8,r6,25
> + srwi. r8,r8,4
> + beq L(dcbz_pre_loop)
> + mtctr r8
> +
> +L(word_loop):
> + stw r11,0(r3)
> + subi r5,r5,0x10
> + stw r11,4(r3)
> + stw r11,8(r3)
> + stw r11,12(r3)
> + addi r3,r3,0x10
> + bdnz L(word_loop)
> +
> +L(dcbz_pre_loop):
> + srwi r6,r5,7
> + mtctr r6
> + addi r7,0,0
> +
> +L(dcbz_loop):
> + dcbz r3,r7
> + addi r3,r3,0x80
> + subi r5,r5,0x80
> + bdnz L(dcbz_loop)
> + srwi. r6,r5,4
> + beq L(postword2_count_loop)
> + mtctr r6
> +
> +L(postword8_count_loop):
> + stw r11,0(r3)
> + subi r5,r5,0x10
> + stw r11,4(r3)
> + stw r11,8(r3)
> + stw r11,12(r3)
> + addi r3,r3,0x10
> + bdnz L(postword8_count_loop)
> +
> +L(postword2_count_loop):
> + clrlwi. r7,r5,28
> + beq L(end_memset)
> + mr r8,r11
> + mr r9,r11
> + mr r10,r11
> + mtxer r7
> + stswx r8,0,r3
> + b L(end_memset)
> +END (BP_SYM (memset))
> +libc_hidden_builtin_def (memset)
> diff --git a/sysdeps/powerpc/powerpc32/405/strcmp.S b/sysdeps/powerpc/powerpc32/405/strcmp.S
> new file mode 100644
> index 0000000..6eb5b5a
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/strcmp.S
> @@ -0,0 +1,137 @@
> +/* Optimized strcmp implementation for PowerPC476.
> + Copyright (C) 2010 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, write to the Free
> + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> + 02110-1301 USA. */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* strcmp
> +
> + Register Use
> + r0:temp return equality
> + r3:source1 address, return equality
> + r4:source2 address
> +
> + Implementation description
> + Check 2 words from src1 and src2. If unequal jump to end and
> + return src1 > src2 or src1 < src2.
> + If null check bytes before null and then jump to end and
> + return src1 > src2, src1 < src2 or src1 = src2.
> + If src1 = src2 and no null, repeat. */
> +
> +EALIGN (BP_SYM(strcmp),5,0)
> + neg r7,r3
> + clrlwi r7,r7,20
> + neg r8,r4
> + clrlwi r8,r8,20
> + srwi. r7,r7,5
> + beq L(byte_loop)
> + srwi. r8,r8,5
> + beq L(byte_loop)
> + cmplw r7,r8
> + mtctr r7
> + ble L(big_loop)
> + mtctr r8
> +
> +L(big_loop):
> + lwz r5,0(r3)
> + lwz r6,4(r3)
> + lwz r8,0(r4)
> + lwz r9,4(r4)
> + dlmzb. r12,r5,r6
> + bne L(end_check)
> + cmplw r5,r8
> + bne L(st1)
> + cmplw r6,r9
> + bne L(st1)
> + lwz r5,8(r3)
> + lwz r6,12(r3)
> + lwz r8,8(r4)
> + lwz r9,12(r4)
> + dlmzb. r12,r5,r6
> + bne L(end_check)
> + cmplw r5,r8
> + bne L(st1)
> + cmplw r6,r9
> + bne L(st1)
> + lwz r5,16(r3)
> + lwz r6,20(r3)
> + lwz r8,16(r4)
> + lwz r9,20(r4)
> + dlmzb. r12,r5,r6
> + bne L(end_check)
> + cmplw r5,r8
> + bne L(st1)
> + cmplw r6,r9
> + bne L(st1)
> + lwz r5,24(r3)
> + lwz r6,28(r3)
> + addi r3,r3,0x20
> + lwz r8,24(r4)
> + lwz r9,28(r4)
> + addi r4,r4,0x20
> + dlmzb. r12,r5,r6
> + bne L(end_check)
> + cmplw r5,r8
> + bne L(st1)
> + cmplw r6,r9
> + bne L(st1)
> + bdnz L(big_loop)
> + b L(byte_loop)
> +
> +L(end_check):
> + subfic r12,r12,4
> + blt L(end_check2)
> + rlwinm r12,r12,3,0,31
> + srw r5,r5,r12
> + srw r8,r8,r12
> + cmplw r5,r8
> + bne L(st1)
> + b L(end_strcmp)
> +
> +L(end_check2):
> + addi r12,r12,4
> + cmplw r5,r8
> + rlwinm r12,r12,3,0,31
> + bne L(st1)
> + srw r6,r6,r12
> + srw r9,r9,r12
> + cmplw r6,r9
> + bne L(st1)
> +
> +L(end_strcmp):
> + addi r3,r0,0
> + blr
> +
> +L(st1):
> + mfcr r3
> + blr
> +
> +L(byte_loop):
> + lbz r5,0(r3)
> + addi r3,r3,1
> + lbz r6,0(r4)
> + addi r4,r4,1
> + cmplw r5,r6
> + bne L(st1)
> + cmpwi r5,0
> + beq L(end_strcmp)
> + b L(byte_loop)
> +END (BP_SYM (strcmp))
> +libc_hidden_builtin_def (strcmp)
> diff --git a/sysdeps/powerpc/powerpc32/405/strcpy.S b/sysdeps/powerpc/powerpc32/405/strcpy.S
> new file mode 100644
> index 0000000..025ac16
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/strcpy.S
> @@ -0,0 +1,110 @@
> +/* Optimized strcpy implementation for PowerPC476.
> + Copyright (C) 2010 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, write to the Free
> + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> + 02110-1301 USA. */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* strcpy
> +
> + Register Use
> + r3:destination and return address
> + r4:source address
> + r10:temp destination address
> +
> + Implementation description
> + Loop by checking 2 words at a time, with dlmzb. Check if there is a null
> + in the 2 words. If there is a null jump to end checking to determine
> + where in the last 8 bytes it is. Copy the appropriate bytes of the last
> + 8 according to the null position. */
> +
> +EALIGN (BP_SYM (strcpy), 5, 0)
> + neg r7,r4
> + subi r4,r4,1
> + clrlwi. r8,r7,29
> + subi r10,r3,1
> + beq L(pre_word8_loop)
> + mtctr r8
> +
> +L(loop):
> + lbzu r5,0x01(r4)
> + cmpi cr5,r5,0x0
> + stbu r5,0x01(r10)
> + beq cr5,L(end_strcpy)
> + bdnz L(loop)
> +
> +L(pre_word8_loop):
> + subi r4,r4,3
> + subi r10,r10,3
> +
> +L(word8_loop):
> + lwzu r5,0x04(r4)
> + lwzu r6,0x04(r4)
> + dlmzb. r11,r5,r6
> + bne L(byte_copy)
> + stwu r5,0x04(r10)
> + stwu r6,0x04(r10)
> + lwzu r5,0x04(r4)
> + lwzu r6,0x04(r4)
> + dlmzb. r11,r5,r6
> + bne L(byte_copy)
> + stwu r5,0x04(r10)
> + stwu r6,0x04(r10)
> + lwzu r5,0x04(r4)
> + lwzu r6,0x04(r4)
> + dlmzb. r11,r5,r6
> + bne L(byte_copy)
> + stwu r5,0x04(r10)
> + stwu r6,0x04(r10)
> + lwzu r5,0x04(r4)
> + lwzu r6,0x04(r4)
> + dlmzb. r11,r5,r6
> + bne L(byte_copy)
> + stwu r5,0x04(r10)
> + stwu r6,0x04(r10)
> + b L(word8_loop)
> +
> +L(last_bytes_copy):
> + stwu r5,0x04(r10)
> + subi r11,r11,4
> + mtctr r11
> + addi r10,r10,3
> + subi r4,r4,1
> +
> +L(last_bytes_copy_loop):
> + lbzu r5,0x01(r4)
> + stbu r5,0x01(r10)
> + bdnz L(last_bytes_copy_loop)
> + blr
> +
> +L(byte_copy):
> + blt L(last_bytes_copy)
> + mtctr r11
> + addi r10,r10,3
> + subi r4,r4,5
> +
> +L(last_bytes_copy_loop2):
> + lbzu r5,0x01(r4)
> + stbu r5,0x01(r10)
> + bdnz L(last_bytes_copy_loop2)
> +
> +L(end_strcpy):
> + blr
> +END (BP_SYM (strcpy))
> +libc_hidden_builtin_def (strcpy)
> diff --git a/sysdeps/powerpc/powerpc32/405/strlen.S b/sysdeps/powerpc/powerpc32/405/strlen.S
> new file mode 100644
> index 0000000..146b582
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/strlen.S
> @@ -0,0 +1,78 @@
> +/* Optimized strlen implementation for PowerPC476.
> + Copyright (C) 2010 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, write to the Free
> + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> + 02110-1301 USA. */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* strlen
> +
> + Register Use
> + r3:source address and return length of string
> + r4:byte counter
> +
> + Implementation description
> + Load 2 words at a time and count bytes, if we find null we subtract one from
> + the count and return the count value. We need to subtract one because
> + we don't count the null character as a byte. */
> +
> +EALIGN (BP_SYM (strlen),5,0)
> + neg r7,r3
> + clrlwi. r8,r7,29
> + addi r4,0,0
> + beq L(byte_count_loop)
> + mtctr r8
> +
> +L(loop):
> + lbz r5,0(r3)
> + cmpi cr5,r5,0x0
> + addi r3,r3,0x1
> + addi r4,r4,0x1
> + beq cr5,L(end_strlen)
> + bdnz L(loop)
> +
> +L(byte_count_loop):
> + lwz r5,0(r3)
> + lwz r6,4(r3)
> + dlmzb. r12,r5,r6
> + add r4,r4,r12
> + bne L(end_strlen)
> + lwz r5,8(r3)
> + lwz r6,12(r3)
> + dlmzb. r12,r5,r6
> + add r4,r4,r12
> + bne L(end_strlen)
> + lwz r5,16(r3)
> + lwz r6,20(r3)
> + dlmzb. r12,r5,r6
> + add r4,r4,r12
> + bne L(end_strlen)
> + lwz r5,24(r3)
> + lwz r6,28(r3)
> + addi r3,r3,0x20
> + dlmzb. r12,r5,r6
> + add r4,r4,r12
> + bne L(end_strlen)
> + b L(byte_count_loop)
> +
> +L(end_strlen):
> + addi r3,r4,-1
> + blr
> +END (BP_SYM (strlen))
> +libc_hidden_builtin_def (strlen)
> diff --git a/sysdeps/powerpc/powerpc32/405/strncmp.S b/sysdeps/powerpc/powerpc32/405/strncmp.S
> new file mode 100644
> index 0000000..c1beb23
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/405/strncmp.S
> @@ -0,0 +1,131 @@
> +/* Optimized strncmp implementation for PowerPC476.
> + Copyright (C) 2010 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, write to the Free
> + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> + 02110-1301 USA. */
> +
> +#include <sysdep.h>
> +#include <bp-sym.h>
> +#include <bp-asm.h>
> +
> +/* strncmp
> +
> + Register Use
> + r0:temp return equality
> + r3:source1 address, return equality
> + r4:source2 address
> + r5:byte count
> +
> + Implementation description
> + Touch in 3 lines of D-cache.
> + If source1 or source2 is unaligned copy 0-3 bytes to make source1 aligned
> + Check 2 words from src1 and src2. If unequal jump to end and
> + return src1 > src2 or src1 < src2.
> + If null check bytes before null and then jump to end and
> + return src1 > src2, src1 < src2 or src1 = src2.
> + If count = zero check bytes before zero counter and then jump to end and
> + return src1 > src2, src1 < src2 or src1 = src2.
> + If src1 = src2 and no null, repeat. */
> +
> +EALIGN (BP_SYM(strncmp),5,0)
> + neg r7,r3
> + clrlwi r7,r7,20
> + neg r8,r4
> + clrlwi r8,r8,20
> + srwi. r7,r7,3
> + beq L(prebyte_count_loop)
> + srwi. r8,r8,3
> + beq L(prebyte_count_loop)
> + cmplw r7,r8
> + mtctr r7
> + ble L(preword2_count_loop)
> + mtctr r8
> +
> +L(preword2_count_loop):
> + srwi. r6,r5,3
> + beq L(prebyte_count_loop)
> + mfctr r7
> + cmplw r6,r7
> + bgt L(set_count_loop)
> + mtctr r6
> + clrlwi r5,r5,29
> +
> +L(word2_count_loop):
> + lwz r10,0(r3)
> + lwz r6,4(r3)
> + addi r3,r3,0x08
> + lwz r8,0(r4)
> + lwz r9,4(r4)
> + addi r4,r4,0x08
> + dlmzb. r12,r10,r6
> + bne L(end_check)
> + cmplw r10,r8
> + bne L(st1)
> + cmplw r6,r9
> + bne L(st1)
> + bdnz L(word2_count_loop)
> +
> +L(prebyte_count_loop):
> + addi r5,r5,1
> + mtctr r5
> + bdz L(end_strncmp)
> +
> +L(byte_count_loop):
> + lbz r6,0(r3)
> + addi r3,r3,1
> + lbz r7,0(r4)
> + addi r4,r4,1
> + cmplw r6,r7
> + bne L(st1)
> + cmpwi r6,0
> + beq L(end_strncmp)
> + bdnz L(byte_count_loop)
> + b L(end_strncmp)
> +
> +L(set_count_loop):
> + slwi r7,r7,3
> + subf r5,r7,r5
> + b L(word2_count_loop)
> +
> +L(end_check):
> + subfic r12,r12,4
> + blt L(end_check2)
> + rlwinm r12,r12,3,0,31
> + srw r10,r10,r12
> + srw r8,r8,r12
> + cmplw r10,r8
> + bne L(st1)
> + b L(end_strncmp)
> +
> +L(end_check2):
> + addi r12,r12,4
> + cmplw r10,r8
> + rlwinm r12,r12,3,0,31
> + bne L(st1)
> + srw r6,r6,r12
> + srw r9,r9,r12
> + cmplw r6,r9
> + bne L(st1)
> +
> +L(end_strncmp):
> + addi r3,r0,0
> + blr
> +
> +L(st1):
> + mfcr r3
> + blr
> +END (BP_SYM (strncmp))
> +libc_hidden_builtin_def (strncmp)
> diff --git a/sysdeps/powerpc/powerpc32/440/Implies b/sysdeps/powerpc/powerpc32/440/Implies
> new file mode 100644
> index 0000000..70c0d2e
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/440/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/405/fpu
> +powerpc/powerpc32/405
> diff --git a/sysdeps/powerpc/powerpc32/464/Implies b/sysdeps/powerpc/powerpc32/464/Implies
> new file mode 100644
> index 0000000..c3e52c5
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/464/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/440/fpu
> +powerpc/powerpc32/440
> diff --git a/sysdeps/powerpc/powerpc32/476/Implies b/sysdeps/powerpc/powerpc32/476/Implies
> new file mode 100644
> index 0000000..2829f9c
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/476/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/464/fpu
> +powerpc/powerpc32/464
> diff --git a/sysdeps/powerpc/powerpc32/Makefile b/sysdeps/powerpc/powerpc32/Makefile
> new file mode 100644
> index 0000000..3d235de
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc32/Makefile
> @@ -0,0 +1,8 @@
> +# Some Powerpc32 variants assume soft-fp is the default even though there is
> +# an fp variant so provide -mhard-float if --with-fp is explicitly passed.
> +
> +ifeq ($(with-fp),yes)
> ++cflags += -mhard-float
> +ASFLAGS += -mhard-float
> +sysdep-LDFLAGS += -mhard-float
> +endif
> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
> new file mode 100644
> index 0000000..70c0d2e
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/405/fpu
> +powerpc/powerpc32/405
> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
> new file mode 100644
> index 0000000..c3e52c5
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/440/fpu
> +powerpc/powerpc32/440
> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
> new file mode 100644
> index 0000000..2829f9c
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/464/fpu
> +powerpc/powerpc32/464
> diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
> new file mode 100644
> index 0000000..80f9170
> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
> @@ -0,0 +1,2 @@
> +powerpc/powerpc32/476/fpu
> +powerpc/powerpc32/476
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2010-10-04 18:54 ` Luis Machado
@ 2010-12-13 20:26 ` Ryan Arnold
2011-01-18 13:16 ` Ryan Arnold
0 siblings, 1 reply; 19+ messages in thread
From: Ryan Arnold @ 2010-12-13 20:26 UTC (permalink / raw)
To: luisgpm; +Cc: Ryan Arnold, libc-ports, Todd Iglehart, Josh Boyer
On Mon, 2010-10-04 at 15:53 -0300, Luis Machado wrote:
> Ping?
>
> On Fri, 2010-09-03 at 12:00 -0300, Luis Machado wrote:
> > > Since Todd doesn't have copyright assignment these changes are
> > > contributed to the FSF by IBM without author/contributor attribution.
> > >
> > > You can simply attribute the changes to him in the email leaving his
> > > name out of the sources per FSF policy and submit them on IBM's
> > > behalf.
> > >
> > > Ryan
> >
> > Thanks.
> >
> > Follows the updated patch without Todd's name on the sources.
> >
> > Luis
> >
> >
> > 2010-09-03 Luis Machado <luisgpm@br.ibm.com>
> >
> > * sysdeps/powerpc/dl-procinfo.c: New file.
> > * sysdeps/powerpc/dl-procinfo.h: New file.
> > * sysdeps/powerpc/powerpc32/405/memcmp.S: New file.
> > * sysdeps/powerpc/powerpc32/405/memcpy.S: New file.
> > * sysdeps/powerpc/powerpc32/405/memset.S: New file.
> > * sysdeps/powerpc/powerpc32/405/strcmp.S: New file.
> > * sysdeps/powerpc/powerpc32/405/strcpy.S: New file.
> > * sysdeps/powerpc/powerpc32/405/strlen.S: New file.
> > * sysdeps/powerpc/powerpc32/405/strncmp.S: New file.
> > * sysdeps/powerpc/powerpc32/440/Implies: New file.
> > * sysdeps/powerpc/powerpc32/464/Implies: New file.
> > * sysdeps/powerpc/powerpc32/476/Implies: New file.
> > * sysdeps/powerpc/powerpc32/Makefile: New file.
> > * sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies: New file.
> > * sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies: New file.
> > * sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies: New file.
> > * sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies: New file.
> >
> > diff --git a/sysdeps/powerpc/dl-procinfo.c b/sysdeps/powerpc/dl-procinfo.c
> > new file mode 100644
> > index 0000000..60fb465
> > --- /dev/null
> > +++ b/sysdeps/powerpc/dl-procinfo.c
> > @@ -0,0 +1,96 @@
> > +/* Data for processor capability information. PowerPC version.
> > + Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
> > + This file is part of the GNU C Library.
> > +
> > + The GNU C Library is free software; you can redistribute it and/or
> > + modify it under the terms of the GNU Lesser General Public
> > + License as published by the Free Software Foundation; either
> > + version 2.1 of the License, or (at your option) any later version.
> > +
> > + The GNU C Library is distributed in the hope that it will be useful,
> > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + Lesser General Public License for more details.
> > +
> > + You should have received a copy of the GNU Lesser General Public
> > + License along with the GNU C Library; if not, write to the Free
> > + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
> > + 02111-1307 USA. */
> > +
> > +/* This information must be kept in sync with the _DL_HWCAP_COUNT and
> > + _DL_PLATFORM_COUNT definitions in procinfo.h.
> > +
> > + If anything should be added here check whether the size of each string
> > + is still ok with the given array size.
> > +
> > + All the #ifdefs in the definitions are quite irritating but
> > + necessary if we want to avoid duplicating the information. There
> > + are three different modes:
> > +
> > + - PROCINFO_DECL is defined. This means we are only interested in
> > + declarations.
> > +
> > + - PROCINFO_DECL is not defined:
> > +
> > + + if SHARED is defined the file is included in an array
> > + initializer. The .element = { ... } syntax is needed.
> > +
> > + + if SHARED is not defined a normal array initialization is
> > + needed.
> > + */
> > +
> > +#ifndef PROCINFO_CLASS
> > +# define PROCINFO_CLASS
> > +#endif
> > +
> > +#if !defined PROCINFO_DECL && defined SHARED
> > + ._dl_powerpc_cap_flags
> > +#else
> > +PROCINFO_CLASS const char _dl_powerpc_cap_flags[25][10]
> > +#endif
> > +#ifndef PROCINFO_DECL
> > += {
> > + "vsx",
> > + "arch_2_06", "power6x", "dfp", "pa6t",
> > + "arch_2_05", "ic_snoop", "smt", "booke",
> > + "cellbe", "power5+", "power5", "power4",
> > + "notb", "efpdouble", "efpsingle", "spe",
> > + "ucache", "4xxmac", "mmu", "fpu",
> > + "altivec", "ppc601", "ppc64", "ppc32",
> > + }
> > +#endif
> > +#if !defined SHARED || defined PROCINFO_DECL
> > +;
> > +#else
> > +,
> > +#endif
> > +
> > +#if !defined PROCINFO_DECL && defined SHARED
> > + ._dl_powerpc_platforms
> > +#else
> > +PROCINFO_CLASS const char _dl_powerpc_platforms[12][12]
> > +#endif
> > +#ifndef PROCINFO_DECL
> > += {
> > + [PPC_PLATFORM_POWER4] = "power4",
> > + [PPC_PLATFORM_PPC970] = "ppc970",
> > + [PPC_PLATFORM_POWER5] = "power5",
> > + [PPC_PLATFORM_POWER5_PLUS] = "power5+",
> > + [PPC_PLATFORM_POWER6] = "power6",
> > + [PPC_PLATFORM_CELL_BE] = "ppc-cell-be",
> > + [PPC_PLATFORM_POWER6X] = "power6x",
> > + [PPC_PLATFORM_POWER7] = "power7",
> > + [PPC_PLATFORM_PPC405] = "ppc405",
> > + [PPC_PLATFORM_PPC440] = "ppc440",
> > + [PPC_PLATFORM_PPC464] = "ppc464",
> > + [PPC_PLATFORM_PPC476] = "ppc476"
> > + }
> > +#endif
> > +#if !defined SHARED || defined PROCINFO_DECL
> > +;
> > +#else
> > +,
> > +#endif
> > +
> > +#undef PROCINFO_DECL
> > +#undef PROCINFO_CLASS
> > diff --git a/sysdeps/powerpc/dl-procinfo.h b/sysdeps/powerpc/dl-procinfo.h
> > new file mode 100644
> > index 0000000..87279de
> > --- /dev/null
> > +++ b/sysdeps/powerpc/dl-procinfo.h
> > @@ -0,0 +1,168 @@
> > +/* Processor capability information handling macros. PowerPC version.
> > + Copyright (C) 2005, 2006, 2008 Free Software Foundation, Inc.
> > + This file is part of the GNU C Library.
> > +
> > + The GNU C Library is free software; you can redistribute it and/or
> > + modify it under the terms of the GNU Lesser General Public
> > + License as published by the Free Software Foundation; either
> > + version 2.1 of the License, or (at your option) any later version.
> > +
> > + The GNU C Library is distributed in the hope that it will be useful,
> > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + Lesser General Public License for more details.
> > +
> > + You should have received a copy of the GNU Lesser General Public
> > + License along with the GNU C Library; if not, write to the Free
> > + Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
> > + 02111-1307 USA. */
> > +
> > +#ifndef _DL_PROCINFO_H
> > +#define _DL_PROCINFO_H 1
> > +
> > +#include <ldsodefs.h>
> > +#include <sysdep.h> /* This defines the PPC_FEATURE_* macros. */
> > +
> > +/* There are 25 bits used, but they are bits 7..31. */
> > +#define _DL_HWCAP_FIRST 7
> > +#define _DL_HWCAP_COUNT 32
> > +
> > +/* These bits influence library search. */
> > +#define HWCAP_IMPORTANT (PPC_FEATURE_HAS_ALTIVEC \
> > + + PPC_FEATURE_HAS_DFP)
> > +
> > +#define _DL_PLATFORMS_COUNT 12
> > +
> > +#define _DL_FIRST_PLATFORM 32
> > +/* Mask to filter out platforms. */
> > +#define _DL_HWCAP_PLATFORM (((1ULL << _DL_PLATFORMS_COUNT) - 1) \
> > + << _DL_FIRST_PLATFORM)
> > +
> > +/* Platform bits (relative to _DL_FIRST_PLATFORM). */
> > +#define PPC_PLATFORM_POWER4 0
> > +#define PPC_PLATFORM_PPC970 1
> > +#define PPC_PLATFORM_POWER5 2
> > +#define PPC_PLATFORM_POWER5_PLUS 3
> > +#define PPC_PLATFORM_POWER6 4
> > +#define PPC_PLATFORM_CELL_BE 5
> > +#define PPC_PLATFORM_POWER6X 6
> > +#define PPC_PLATFORM_POWER7 7
> > +#define PPC_PLATFORM_PPC405 8
> > +#define PPC_PLATFORM_PPC440 9
> > +#define PPC_PLATFORM_PPC464 10
> > +#define PPC_PLATFORM_PPC476 11
> > +
> > +static inline const char *
> > +__attribute__ ((unused))
> > +_dl_hwcap_string (int idx)
> > +{
> > + return GLRO(dl_powerpc_cap_flags)[idx - _DL_HWCAP_FIRST];
> > +}
> > +
> > +static inline const char *
> > +__attribute__ ((unused))
> > +_dl_platform_string (int idx)
> > +{
> > + return GLRO(dl_powerpc_platforms)[idx - _DL_FIRST_PLATFORM];
> > +}
> > +
> > +static inline int
> > +__attribute__ ((unused))
> > +_dl_string_hwcap (const char *str)
> > +{
> > + for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
> > + if (strcmp (str, _dl_hwcap_string (i)) == 0)
> > + return i;
> > + return -1;
> > +}
> > +
> > +static inline int
> > +__attribute__ ((unused, always_inline))
> > +_dl_string_platform (const char *str)
> > +{
> > + if (str == NULL)
> > + return -1;
> > +
> > + if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_POWER4], 5) == 0)
> > + {
> > + int ret;
> > + str += 5;
> > + switch (*str)
> > + {
> > + case '4':
> > + ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER4;
> > + break;
> > + case '5':
> > + ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5;
> > + if (str[1] == '+')
> > + {
> > + ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER5_PLUS;
> > + ++str;
> > + }
> > + break;
> > + case '6':
> > + ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6;
> > + if (str[1] == 'x')
> > + {
> > + ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER6X;
> > + ++str;
> > + }
> > + break;
> > + case '7':
> > + ret = _DL_FIRST_PLATFORM + PPC_PLATFORM_POWER7;
> > + break;
> > + default:
> > + return -1;
> > + }
> > + if (str[1] == '\0')
> > + return ret;
> > + }
> > + else if (strncmp (str, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970],
> > + 3) == 0)
> > + {
> > + if (strcmp (str + 3, GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC970]
> > + + 3) == 0)
> > + return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC970;
> > + else if (strcmp (str + 3,
> > + GLRO(dl_powerpc_platforms)[PPC_PLATFORM_CELL_BE] + 3)
> > + == 0)
> > + return _DL_FIRST_PLATFORM + PPC_PLATFORM_CELL_BE;
> > + else if (strcmp (str + 3,
> > + GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC405] + 3)
> > + == 0)
> > + return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC405;
> > + else if (strcmp (str + 3,
> > + GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC440] + 3)
> > + == 0)
> > + return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC440;
> > + else if (strcmp (str + 3,
> > + GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC464] + 3)
> > + == 0)
> > + return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC464;
> > + else if (strcmp (str + 3,
> > + GLRO(dl_powerpc_platforms)[PPC_PLATFORM_PPC476] + 3)
> > + == 0)
> > + return _DL_FIRST_PLATFORM + PPC_PLATFORM_PPC476;
> > + }
> > +
> > + return -1;
> > +}
> > +
> > +#ifdef IS_IN_rtld
> > +static inline int
> > +__attribute__ ((unused))
> > +_dl_procinfo (int word)
> > +{
> > + _dl_printf ("AT_HWCAP: ");
> > +
> > + for (int i = _DL_HWCAP_FIRST; i < _DL_HWCAP_COUNT; ++i)
> > + if (word & (1 << i))
> > + _dl_printf (" %s", _dl_hwcap_string (i));
> > +
> > + _dl_printf ("\n");
> > +
> > + return 0;
> > +}
> > +#endif
> > +
> > +#endif /* dl-procinfo.h */
> > diff --git a/sysdeps/powerpc/powerpc32/405/memcmp.S b/sysdeps/powerpc/powerpc32/405/memcmp.S
> > new file mode 100644
> > index 0000000..653d3b5
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/memcmp.S
> > @@ -0,0 +1,131 @@
> > +/* Optimized memcmp implementation for PowerPC476.
> > + Copyright (C) 2010 Free Software Foundation, Inc.
> > + This file is part of the GNU C Library.
> > +
> > + The GNU C Library is free software; you can redistribute it and/or
> > + modify it under the terms of the GNU Lesser General Public
> > + License as published by the Free Software Foundation; either
> > + version 2.1 of the License, or (at your option) any later version.
> > +
> > + The GNU C Library is distributed in the hope that it will be useful,
> > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + Lesser General Public License for more details.
> > +
> > + You should have received a copy of the GNU Lesser General Public
> > + License along with the GNU C Library; if not, write to the Free
> > + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > + 02110-1301 USA. */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* memcmp
> > +
> > + r3:source1 address, return equality
> > + r4:source2 address
> > + r5:byte count
> > +
> > + Check 2 words from src1 and src2. If unequal jump to end and
> > + return src1 > src2 or src1 < src2.
> > + If count = zero check bytes before zero counter and then jump to end and
> > + return src1 > src2, src1 < src2 or src1 = src2.
> > + If src1 = src2 and no null, repeat. */
> > +
> > +EALIGN (BP_SYM (memcmp), 5, 0)
> > + srwi. r6,r5,5
> > + beq L(preword2_count_loop)
> > + mtctr r6
> > + clrlwi r5,r5,27
> > +
> > +L(word8_compare_loop):
> > + lwz r10,0(r3)
> > + lwz r6,4(r3)
> > + lwz r8,0(r4)
> > + lwz r9,4(r4)
> > + cmplw cr5,r8,r10
> > + cmplw cr1,r9,r6
> > + bne cr5,L(st2)
> > + bne cr1,L(st1)
> > + lwz r10,8(r3)
> > + lwz r6,12(r3)
> > + lwz r8,8(r4)
> > + lwz r9,12(r4)
> > + cmplw cr5,r8,r10
> > + cmplw cr1,r9,r6
> > + bne cr5,L(st2)
> > + bne cr1,L(st1)
> > + lwz r10,16(r3)
> > + lwz r6,20(r3)
> > + lwz r8,16(r4)
> > + lwz r9,20(r4)
> > + cmplw cr5,r8,r10
> > + cmplw cr1,r9,r6
> > + bne cr5,L(st2)
> > + bne cr1,L(st1)
> > + lwz r10,24(r3)
> > + lwz r6,28(r3)
> > + addi r3,r3,0x20
> > + lwz r8,24(r4)
> > + lwz r9,28(r4)
> > + addi r4,r4,0x20
> > + cmplw cr5,r8,r10
> > + cmplw cr1,r9,r6
> > + bne cr5,L(st2)
> > + bne cr1,L(st1)
> > + bdnz L(word8_compare_loop)
> > +
> > +L(preword2_count_loop):
> > + srwi. r6,r5,3
> > + beq L(prebyte_count_loop)
> > + mtctr r6
> > + clrlwi r5,r5,29
> > +
> > +L(word2_count_loop):
> > + lwz r10,0(r3)
> > + lwz r6,4(r3)
> > + addi r3,r3,0x08
> > + lwz r8,0(r4)
> > + lwz r9,4(r4)
> > + addi r4,r4,0x08
> > + cmplw cr5,r8,r10
> > + cmplw cr1,r9,r6
> > + bne cr5,L(st2)
> > + bne cr1,L(st1)
> > + bdnz L(word2_count_loop)
> > +
> > +L(prebyte_count_loop):
> > + addi r5,r5,1
> > + mtctr r5
> > + bdz L(end_memcmp)
> > +
> > +L(byte_count_loop):
> > + lbz r6,0(r3)
> > + addi r3,r3,0x01
> > + lbz r8,0(r4)
> > + addi r4,r4,0x01
> > + cmplw cr5,r8,r6
> > + bne cr5,L(st2)
> > + bdnz L(byte_count_loop)
> > +
> > +L(end_memcmp):
> > + addi r3,r0,0
> > + blr
> > +
> > +L(l_r):
> > + addi r3,r0,1
> > + blr
> > +
> > +L(st1):
> > + blt cr1,L(l_r)
> > + addi r3,r0,-1
> > + blr
> > +
> > +L(st2):
> > + blt cr5,L(l_r)
> > + addi r3,r0,-1
> > + blr
> > +END (BP_SYM (memcmp))
> > +libc_hidden_builtin_def (memcmp)
> > +weak_alias (memcmp,bcmp)
> > diff --git a/sysdeps/powerpc/powerpc32/405/memcpy.S b/sysdeps/powerpc/powerpc32/405/memcpy.S
> > new file mode 100644
> > index 0000000..a654c73
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/memcpy.S
> > @@ -0,0 +1,133 @@
> > +/* Optimized memcpy implementation for PowerPC476.
> > + Copyright (C) 2010 Free Software Foundation, Inc.
> > + This file is part of the GNU C Library.
> > +
> > + The GNU C Library is free software; you can redistribute it and/or
> > + modify it under the terms of the GNU Lesser General Public
> > + License as published by the Free Software Foundation; either
> > + version 2.1 of the License, or (at your option) any later version.
> > +
> > + The GNU C Library is distributed in the hope that it will be useful,
> > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + Lesser General Public License for more details.
> > +
> > + You should have received a copy of the GNU Lesser General Public
> > + License along with the GNU C Library; if not, write to the Free
> > + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > + 02110-1301 USA. */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* memcpy
> > +
> > + r0:return address
> > + r3:destination address
> > + r4:source address
> > + r5:byte count
> > +
> > + Save return address in r0.
> > + If destinationn and source are unaligned and copy count is greater than 256
> > + then copy 0-3 bytes to make destination aligned.
> > + If 32 or more bytes to copy we use 32 byte copy loop.
> > + Finaly we copy 0-31 extra bytes. */
> > +
> > +EALIGN (BP_SYM (memcpy), 5, 0)
> > +/* Check if bytes to copy are greater than 256 and if
> > + source and destination are unaligned */
> > + cmpwi r5,0x0100
> > + addi r0,r3,0
> > + ble L(string_count_loop)
> > + neg r6,r3
> > + clrlwi. r6,r6,30
> > + beq L(string_count_loop)
> > + neg r6,r4
> > + clrlwi. r6,r6,30
> > + beq L(string_count_loop)
> > + mtctr r6
> > + subf r5,r6,r5
> > +
> > +L(unaligned_bytecopy_loop): /* Align destination by coping 0-3 bytes */
> > + lbz r8,0x0(r4)
> > + addi r4,r4,1
> > + stb r8,0x0(r3)
> > + addi r3,r3,1
> > + bdnz L(unaligned_bytecopy_loop)
> > + srwi. r7,r5,5
> > + beq L(preword2_count_loop)
> > + mtctr r7
> > +
> > +L(word8_count_loop_no_dcbt): /* Copy 32 bytes at a time */
> > + lwz r6,0(r4)
> > + lwz r7,4(r4)
> > + lwz r8,8(r4)
> > + lwz r9,12(r4)
> > + subi r5,r5,0x20
> > + stw r6,0(r3)
> > + stw r7,4(r3)
> > + stw r8,8(r3)
> > + stw r9,12(r3)
> > + lwz r6,16(r4)
> > + lwz r7,20(r4)
> > + lwz r8,24(r4)
> > + lwz r9,28(r4)
> > + addi r4,r4,0x20
> > + stw r6,16(r3)
> > + stw r7,20(r3)
> > + stw r8,24(r3)
> > + stw r9,28(r3)
> > + addi r3,r3,0x20
> > + bdnz L(word8_count_loop_no_dcbt)
> > +
> > +L(preword2_count_loop): /* Copy remaining 0-31 bytes */
> > + clrlwi. r12,r5,27
> > + beq L(end_memcpy)
> > + mtxer r12
> > + lswx r5,0,r4
> > + stswx r5,0,r3
> > + mr r3,r0
> > + blr
> > +
> > +L(string_count_loop): /* Copy odd 0-31 bytes */
> > + clrlwi. r12,r5,28
> > + add r3,r3,r5
> > + add r4,r4,r5
> > + beq L(pre_string_copy)
> > + mtxer r12
> > + subf r4,r12,r4
> > + subf r3,r12,r3
> > + lswx r6,0,r4
> > + stswx r6,0,r3
> > +
> > +L(pre_string_copy): /* Check how many 32 byte chunck to copy */
> > + srwi. r7,r5,4
> > + beq L(end_memcpy)
> > + mtctr r7
> > +
> > +L(word4_count_loop_no_dcbt): /* Copy 32 bytes at a time */
> > + lwz r6,-4(r4)
> > + lwz r7,-8(r4)
> > + lwz r8,-12(r4)
> > + lwzu r9,-16(r4)
> > + stw r6,-4(r3)
> > + stw r7,-8(r3)
> > + stw r8,-12(r3)
> > + stwu r9,-16(r3)
> > + bdz L(end_memcpy)
> > + lwz r6,-4(r4)
> > + lwz r7,-8(r4)
> > + lwz r8,-12(r4)
> > + lwzu r9,-16(r4)
> > + stw r6,-4(r3)
> > + stw r7,-8(r3)
> > + stw r8,-12(r3)
> > + stwu r9,-16(r3)
> > + bdnz L(word4_count_loop_no_dcbt)
> > +
> > +L(end_memcpy):
> > + mr r3,r0
> > + blr
> > +END (BP_SYM (memcpy))
> > +libc_hidden_builtin_def (memcpy)
> > diff --git a/sysdeps/powerpc/powerpc32/405/memset.S b/sysdeps/powerpc/powerpc32/405/memset.S
> > new file mode 100644
> > index 0000000..69d5d4c
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/memset.S
> > @@ -0,0 +1,155 @@
> > +/* Optimized memset implementation for PowerPC476.
> > + Copyright (C) 2010 Free Software Foundation, Inc.
> > + This file is part of the GNU C Library.
> > +
> > + The GNU C Library is free software; you can redistribute it and/or
> > + modify it under the terms of the GNU Lesser General Public
> > + License as published by the Free Software Foundation; either
> > + version 2.1 of the License, or (at your option) any later version.
> > +
> > + The GNU C Library is distributed in the hope that it will be useful,
> > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + Lesser General Public License for more details.
> > +
> > + You should have received a copy of the GNU Lesser General Public
> > + License along with the GNU C Library; if not, write to the Free
> > + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > + 02110-1301 USA. */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* memset
> > +
> > + r3:destination address and return address
> > + r4:source integer to copy
> > + r5:byte count
> > + r11:sources integer to copy in all 32 bits of reg
> > + r12:temp return address
> > +
> > + Save return address in r12
> > + If destinationn is unaligned and count is greater tha 255 bytes
> > + set 0-3 bytes to make destination aligned
> > + If count is greater tha 255 bytes and setting zero to memory
> > + use dbcz to set memeory when we can
> > + otherwsie do the follwoing
> > + If 16 or more words to set we use 16 word copy loop.
> > + Finaly we set 0-15 extra bytes with string store. */
> > +
> > +EALIGN (BP_SYM (memset), 5, 0)
> > + rlwinm r11,r4,0,24,31
> > + rlwimi r11,r4,8,16,23
> > + rlwimi r11,r11,16,0,15
> > + addi r12,r3,0
> > + cmpwi r5,0x00FF
> > + ble L(preword8_count_loop)
> > + cmpwi r4,0x00
> > + beq L(use_dcbz)
> > + neg r6,r3
> > + clrlwi. r6,r6,30
> > + beq L(preword8_count_loop)
> > + addi r8,0,1
> > + mtctr r6
> > + subi r3,r3,1
> > +
> > +L(unaligned_bytecopy_loop):
> > + stbu r11,0x1(r3)
> > + subf. r5,r8,r5
> > + beq L(end_memset)
> > + bdnz L(unaligned_bytecopy_loop)
> > + addi r3,r3,1
> > +
> > +L(preword8_count_loop):
> > + srwi. r6,r5,4
> > + beq L(preword2_count_loop)
> > + mtctr r6
> > + addi r3,r3,-4
> > + mr r8,r11
> > + mr r9,r11
> > + mr r10,r11
> > +
> > +L(word8_count_loop_no_dcbt):
> > + stwu r8,4(r3)
> > + stwu r9,4(r3)
> > + subi r5,r5,0x10
> > + stwu r10,4(r3)
> > + stwu r11,4(r3)
> > + bdnz L(word8_count_loop_no_dcbt)
> > + addi r3,r3,4
> > +
> > +L(preword2_count_loop):
> > + clrlwi. r7,r5,28
> > + beq L(end_memset)
> > + mr r8,r11
> > + mr r9,r11
> > + mr r10,r11
> > + mtxer r7
> > + stswx r8,0,r3
> > +
> > +L(end_memset):
> > + addi r3,r12,0
> > + blr
> > +
> > +L(use_dcbz):
> > + neg r6,r3
> > + clrlwi. r7,r6,28
> > + beq L(skip_string_loop)
> > + mr r8,r11
> > + mr r9,r11
> > + mr r10,r11
> > + subf r5,r7,r5
> > + mtxer r7
> > + stswx r8,0,r3
> > + add r3,r3,r7
> > +
> > +L(skip_string_loop):
> > + clrlwi r8,r6,25
> > + srwi. r8,r8,4
> > + beq L(dcbz_pre_loop)
> > + mtctr r8
> > +
> > +L(word_loop):
> > + stw r11,0(r3)
> > + subi r5,r5,0x10
> > + stw r11,4(r3)
> > + stw r11,8(r3)
> > + stw r11,12(r3)
> > + addi r3,r3,0x10
> > + bdnz L(word_loop)
> > +
> > +L(dcbz_pre_loop):
> > + srwi r6,r5,7
> > + mtctr r6
> > + addi r7,0,0
> > +
> > +L(dcbz_loop):
> > + dcbz r3,r7
> > + addi r3,r3,0x80
> > + subi r5,r5,0x80
> > + bdnz L(dcbz_loop)
> > + srwi. r6,r5,4
> > + beq L(postword2_count_loop)
> > + mtctr r6
> > +
> > +L(postword8_count_loop):
> > + stw r11,0(r3)
> > + subi r5,r5,0x10
> > + stw r11,4(r3)
> > + stw r11,8(r3)
> > + stw r11,12(r3)
> > + addi r3,r3,0x10
> > + bdnz L(postword8_count_loop)
> > +
> > +L(postword2_count_loop):
> > + clrlwi. r7,r5,28
> > + beq L(end_memset)
> > + mr r8,r11
> > + mr r9,r11
> > + mr r10,r11
> > + mtxer r7
> > + stswx r8,0,r3
> > + b L(end_memset)
> > +END (BP_SYM (memset))
> > +libc_hidden_builtin_def (memset)
> > diff --git a/sysdeps/powerpc/powerpc32/405/strcmp.S b/sysdeps/powerpc/powerpc32/405/strcmp.S
> > new file mode 100644
> > index 0000000..6eb5b5a
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/strcmp.S
> > @@ -0,0 +1,137 @@
> > +/* Optimized strcmp implementation for PowerPC476.
> > + Copyright (C) 2010 Free Software Foundation, Inc.
> > + This file is part of the GNU C Library.
> > +
> > + The GNU C Library is free software; you can redistribute it and/or
> > + modify it under the terms of the GNU Lesser General Public
> > + License as published by the Free Software Foundation; either
> > + version 2.1 of the License, or (at your option) any later version.
> > +
> > + The GNU C Library is distributed in the hope that it will be useful,
> > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + Lesser General Public License for more details.
> > +
> > + You should have received a copy of the GNU Lesser General Public
> > + License along with the GNU C Library; if not, write to the Free
> > + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > + 02110-1301 USA. */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* strcmp
> > +
> > + Register Use
> > + r0:temp return equality
> > + r3:source1 address, return equality
> > + r4:source2 address
> > +
> > + Implementation description
> > + Check 2 words from src1 and src2. If unequal jump to end and
> > + return src1 > src2 or src1 < src2.
> > + If null check bytes before null and then jump to end and
> > + return src1 > src2, src1 < src2 or src1 = src2.
> > + If src1 = src2 and no null, repeat. */
> > +
> > +EALIGN (BP_SYM(strcmp),5,0)
> > + neg r7,r3
> > + clrlwi r7,r7,20
> > + neg r8,r4
> > + clrlwi r8,r8,20
> > + srwi. r7,r7,5
> > + beq L(byte_loop)
> > + srwi. r8,r8,5
> > + beq L(byte_loop)
> > + cmplw r7,r8
> > + mtctr r7
> > + ble L(big_loop)
> > + mtctr r8
> > +
> > +L(big_loop):
> > + lwz r5,0(r3)
> > + lwz r6,4(r3)
> > + lwz r8,0(r4)
> > + lwz r9,4(r4)
> > + dlmzb. r12,r5,r6
> > + bne L(end_check)
> > + cmplw r5,r8
> > + bne L(st1)
> > + cmplw r6,r9
> > + bne L(st1)
> > + lwz r5,8(r3)
> > + lwz r6,12(r3)
> > + lwz r8,8(r4)
> > + lwz r9,12(r4)
> > + dlmzb. r12,r5,r6
> > + bne L(end_check)
> > + cmplw r5,r8
> > + bne L(st1)
> > + cmplw r6,r9
> > + bne L(st1)
> > + lwz r5,16(r3)
> > + lwz r6,20(r3)
> > + lwz r8,16(r4)
> > + lwz r9,20(r4)
> > + dlmzb. r12,r5,r6
> > + bne L(end_check)
> > + cmplw r5,r8
> > + bne L(st1)
> > + cmplw r6,r9
> > + bne L(st1)
> > + lwz r5,24(r3)
> > + lwz r6,28(r3)
> > + addi r3,r3,0x20
> > + lwz r8,24(r4)
> > + lwz r9,28(r4)
> > + addi r4,r4,0x20
> > + dlmzb. r12,r5,r6
> > + bne L(end_check)
> > + cmplw r5,r8
> > + bne L(st1)
> > + cmplw r6,r9
> > + bne L(st1)
> > + bdnz L(big_loop)
> > + b L(byte_loop)
> > +
> > +L(end_check):
> > + subfic r12,r12,4
> > + blt L(end_check2)
> > + rlwinm r12,r12,3,0,31
> > + srw r5,r5,r12
> > + srw r8,r8,r12
> > + cmplw r5,r8
> > + bne L(st1)
> > + b L(end_strcmp)
> > +
> > +L(end_check2):
> > + addi r12,r12,4
> > + cmplw r5,r8
> > + rlwinm r12,r12,3,0,31
> > + bne L(st1)
> > + srw r6,r6,r12
> > + srw r9,r9,r12
> > + cmplw r6,r9
> > + bne L(st1)
> > +
> > +L(end_strcmp):
> > + addi r3,r0,0
> > + blr
> > +
> > +L(st1):
> > + mfcr r3
> > + blr
> > +
> > +L(byte_loop):
> > + lbz r5,0(r3)
> > + addi r3,r3,1
> > + lbz r6,0(r4)
> > + addi r4,r4,1
> > + cmplw r5,r6
> > + bne L(st1)
> > + cmpwi r5,0
> > + beq L(end_strcmp)
> > + b L(byte_loop)
> > +END (BP_SYM (strcmp))
> > +libc_hidden_builtin_def (strcmp)
> > diff --git a/sysdeps/powerpc/powerpc32/405/strcpy.S b/sysdeps/powerpc/powerpc32/405/strcpy.S
> > new file mode 100644
> > index 0000000..025ac16
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/strcpy.S
> > @@ -0,0 +1,110 @@
> > +/* Optimized strcpy implementation for PowerPC476.
> > + Copyright (C) 2010 Free Software Foundation, Inc.
> > + This file is part of the GNU C Library.
> > +
> > + The GNU C Library is free software; you can redistribute it and/or
> > + modify it under the terms of the GNU Lesser General Public
> > + License as published by the Free Software Foundation; either
> > + version 2.1 of the License, or (at your option) any later version.
> > +
> > + The GNU C Library is distributed in the hope that it will be useful,
> > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + Lesser General Public License for more details.
> > +
> > + You should have received a copy of the GNU Lesser General Public
> > + License along with the GNU C Library; if not, write to the Free
> > + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > + 02110-1301 USA. */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* strcpy
> > +
> > + Register Use
> > + r3:destination and return address
> > + r4:source address
> > + r10:temp destination address
> > +
> > + Implementation description
> > + Loop by checking 2 words at a time, with dlmzb. Check if there is a null
> > + in the 2 words. If there is a null jump to end checking to determine
> > + where in the last 8 bytes it is. Copy the appropriate bytes of the last
> > + 8 according to the null position. */
> > +
> > +EALIGN (BP_SYM (strcpy), 5, 0)
> > + neg r7,r4
> > + subi r4,r4,1
> > + clrlwi. r8,r7,29
> > + subi r10,r3,1
> > + beq L(pre_word8_loop)
> > + mtctr r8
> > +
> > +L(loop):
> > + lbzu r5,0x01(r4)
> > + cmpi cr5,r5,0x0
> > + stbu r5,0x01(r10)
> > + beq cr5,L(end_strcpy)
> > + bdnz L(loop)
> > +
> > +L(pre_word8_loop):
> > + subi r4,r4,3
> > + subi r10,r10,3
> > +
> > +L(word8_loop):
> > + lwzu r5,0x04(r4)
> > + lwzu r6,0x04(r4)
> > + dlmzb. r11,r5,r6
> > + bne L(byte_copy)
> > + stwu r5,0x04(r10)
> > + stwu r6,0x04(r10)
> > + lwzu r5,0x04(r4)
> > + lwzu r6,0x04(r4)
> > + dlmzb. r11,r5,r6
> > + bne L(byte_copy)
> > + stwu r5,0x04(r10)
> > + stwu r6,0x04(r10)
> > + lwzu r5,0x04(r4)
> > + lwzu r6,0x04(r4)
> > + dlmzb. r11,r5,r6
> > + bne L(byte_copy)
> > + stwu r5,0x04(r10)
> > + stwu r6,0x04(r10)
> > + lwzu r5,0x04(r4)
> > + lwzu r6,0x04(r4)
> > + dlmzb. r11,r5,r6
> > + bne L(byte_copy)
> > + stwu r5,0x04(r10)
> > + stwu r6,0x04(r10)
> > + b L(word8_loop)
> > +
> > +L(last_bytes_copy):
> > + stwu r5,0x04(r10)
> > + subi r11,r11,4
> > + mtctr r11
> > + addi r10,r10,3
> > + subi r4,r4,1
> > +
> > +L(last_bytes_copy_loop):
> > + lbzu r5,0x01(r4)
> > + stbu r5,0x01(r10)
> > + bdnz L(last_bytes_copy_loop)
> > + blr
> > +
> > +L(byte_copy):
> > + blt L(last_bytes_copy)
> > + mtctr r11
> > + addi r10,r10,3
> > + subi r4,r4,5
> > +
> > +L(last_bytes_copy_loop2):
> > + lbzu r5,0x01(r4)
> > + stbu r5,0x01(r10)
> > + bdnz L(last_bytes_copy_loop2)
> > +
> > +L(end_strcpy):
> > + blr
> > +END (BP_SYM (strcpy))
> > +libc_hidden_builtin_def (strcpy)
> > diff --git a/sysdeps/powerpc/powerpc32/405/strlen.S b/sysdeps/powerpc/powerpc32/405/strlen.S
> > new file mode 100644
> > index 0000000..146b582
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/strlen.S
> > @@ -0,0 +1,78 @@
> > +/* Optimized strlen implementation for PowerPC476.
> > + Copyright (C) 2010 Free Software Foundation, Inc.
> > + This file is part of the GNU C Library.
> > +
> > + The GNU C Library is free software; you can redistribute it and/or
> > + modify it under the terms of the GNU Lesser General Public
> > + License as published by the Free Software Foundation; either
> > + version 2.1 of the License, or (at your option) any later version.
> > +
> > + The GNU C Library is distributed in the hope that it will be useful,
> > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + Lesser General Public License for more details.
> > +
> > + You should have received a copy of the GNU Lesser General Public
> > + License along with the GNU C Library; if not, write to the Free
> > + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > + 02110-1301 USA. */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* strlen
> > +
> > + Register Use
> > + r3:source address and return length of string
> > + r4:byte counter
> > +
> > + Implementation description
> > + Load 2 words at a time and count bytes, if we find null we subtract one from
> > + the count and return the count value. We need to subtract one because
> > + we don't count the null character as a byte. */
> > +
> > +EALIGN (BP_SYM (strlen),5,0)
> > + neg r7,r3
> > + clrlwi. r8,r7,29
> > + addi r4,0,0
> > + beq L(byte_count_loop)
> > + mtctr r8
> > +
> > +L(loop):
> > + lbz r5,0(r3)
> > + cmpi cr5,r5,0x0
> > + addi r3,r3,0x1
> > + addi r4,r4,0x1
> > + beq cr5,L(end_strlen)
> > + bdnz L(loop)
> > +
> > +L(byte_count_loop):
> > + lwz r5,0(r3)
> > + lwz r6,4(r3)
> > + dlmzb. r12,r5,r6
> > + add r4,r4,r12
> > + bne L(end_strlen)
> > + lwz r5,8(r3)
> > + lwz r6,12(r3)
> > + dlmzb. r12,r5,r6
> > + add r4,r4,r12
> > + bne L(end_strlen)
> > + lwz r5,16(r3)
> > + lwz r6,20(r3)
> > + dlmzb. r12,r5,r6
> > + add r4,r4,r12
> > + bne L(end_strlen)
> > + lwz r5,24(r3)
> > + lwz r6,28(r3)
> > + addi r3,r3,0x20
> > + dlmzb. r12,r5,r6
> > + add r4,r4,r12
> > + bne L(end_strlen)
> > + b L(byte_count_loop)
> > +
> > +L(end_strlen):
> > + addi r3,r4,-1
> > + blr
> > +END (BP_SYM (strlen))
> > +libc_hidden_builtin_def (strlen)
> > diff --git a/sysdeps/powerpc/powerpc32/405/strncmp.S b/sysdeps/powerpc/powerpc32/405/strncmp.S
> > new file mode 100644
> > index 0000000..c1beb23
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/405/strncmp.S
> > @@ -0,0 +1,131 @@
> > +/* Optimized strncmp implementation for PowerPC476.
> > + Copyright (C) 2010 Free Software Foundation, Inc.
> > + This file is part of the GNU C Library.
> > +
> > + The GNU C Library is free software; you can redistribute it and/or
> > + modify it under the terms of the GNU Lesser General Public
> > + License as published by the Free Software Foundation; either
> > + version 2.1 of the License, or (at your option) any later version.
> > +
> > + The GNU C Library is distributed in the hope that it will be useful,
> > + but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + Lesser General Public License for more details.
> > +
> > + You should have received a copy of the GNU Lesser General Public
> > + License along with the GNU C Library; if not, write to the Free
> > + Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
> > + 02110-1301 USA. */
> > +
> > +#include <sysdep.h>
> > +#include <bp-sym.h>
> > +#include <bp-asm.h>
> > +
> > +/* strncmp
> > +
> > + Register Use
> > + r0:temp return equality
> > + r3:source1 address, return equality
> > + r4:source2 address
> > + r5:byte count
> > +
> > + Implementation description
> > + Touch in 3 lines of D-cache.
> > + If source1 or source2 is unaligned copy 0-3 bytes to make source1 aligned
> > + Check 2 words from src1 and src2. If unequal jump to end and
> > + return src1 > src2 or src1 < src2.
> > + If null check bytes before null and then jump to end and
> > + return src1 > src2, src1 < src2 or src1 = src2.
> > + If count = zero check bytes before zero counter and then jump to end and
> > + return src1 > src2, src1 < src2 or src1 = src2.
> > + If src1 = src2 and no null, repeat. */
> > +
> > +EALIGN (BP_SYM(strncmp),5,0)
> > + neg r7,r3
> > + clrlwi r7,r7,20
> > + neg r8,r4
> > + clrlwi r8,r8,20
> > + srwi. r7,r7,3
> > + beq L(prebyte_count_loop)
> > + srwi. r8,r8,3
> > + beq L(prebyte_count_loop)
> > + cmplw r7,r8
> > + mtctr r7
> > + ble L(preword2_count_loop)
> > + mtctr r8
> > +
> > +L(preword2_count_loop):
> > + srwi. r6,r5,3
> > + beq L(prebyte_count_loop)
> > + mfctr r7
> > + cmplw r6,r7
> > + bgt L(set_count_loop)
> > + mtctr r6
> > + clrlwi r5,r5,29
> > +
> > +L(word2_count_loop):
> > + lwz r10,0(r3)
> > + lwz r6,4(r3)
> > + addi r3,r3,0x08
> > + lwz r8,0(r4)
> > + lwz r9,4(r4)
> > + addi r4,r4,0x08
> > + dlmzb. r12,r10,r6
> > + bne L(end_check)
> > + cmplw r10,r8
> > + bne L(st1)
> > + cmplw r6,r9
> > + bne L(st1)
> > + bdnz L(word2_count_loop)
> > +
> > +L(prebyte_count_loop):
> > + addi r5,r5,1
> > + mtctr r5
> > + bdz L(end_strncmp)
> > +
> > +L(byte_count_loop):
> > + lbz r6,0(r3)
> > + addi r3,r3,1
> > + lbz r7,0(r4)
> > + addi r4,r4,1
> > + cmplw r6,r7
> > + bne L(st1)
> > + cmpwi r6,0
> > + beq L(end_strncmp)
> > + bdnz L(byte_count_loop)
> > + b L(end_strncmp)
> > +
> > +L(set_count_loop):
> > + slwi r7,r7,3
> > + subf r5,r7,r5
> > + b L(word2_count_loop)
> > +
> > +L(end_check):
> > + subfic r12,r12,4
> > + blt L(end_check2)
> > + rlwinm r12,r12,3,0,31
> > + srw r10,r10,r12
> > + srw r8,r8,r12
> > + cmplw r10,r8
> > + bne L(st1)
> > + b L(end_strncmp)
> > +
> > +L(end_check2):
> > + addi r12,r12,4
> > + cmplw r10,r8
> > + rlwinm r12,r12,3,0,31
> > + bne L(st1)
> > + srw r6,r6,r12
> > + srw r9,r9,r12
> > + cmplw r6,r9
> > + bne L(st1)
> > +
> > +L(end_strncmp):
> > + addi r3,r0,0
> > + blr
> > +
> > +L(st1):
> > + mfcr r3
> > + blr
> > +END (BP_SYM (strncmp))
> > +libc_hidden_builtin_def (strncmp)
> > diff --git a/sysdeps/powerpc/powerpc32/440/Implies b/sysdeps/powerpc/powerpc32/440/Implies
> > new file mode 100644
> > index 0000000..70c0d2e
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/440/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/405/fpu
> > +powerpc/powerpc32/405
> > diff --git a/sysdeps/powerpc/powerpc32/464/Implies b/sysdeps/powerpc/powerpc32/464/Implies
> > new file mode 100644
> > index 0000000..c3e52c5
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/464/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/440/fpu
> > +powerpc/powerpc32/440
> > diff --git a/sysdeps/powerpc/powerpc32/476/Implies b/sysdeps/powerpc/powerpc32/476/Implies
> > new file mode 100644
> > index 0000000..2829f9c
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/476/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/464/fpu
> > +powerpc/powerpc32/464
> > diff --git a/sysdeps/powerpc/powerpc32/Makefile b/sysdeps/powerpc/powerpc32/Makefile
> > new file mode 100644
> > index 0000000..3d235de
> > --- /dev/null
> > +++ b/sysdeps/powerpc/powerpc32/Makefile
> > @@ -0,0 +1,8 @@
> > +# Some Powerpc32 variants assume soft-fp is the default even though there is
> > +# an fp variant so provide -mhard-float if --with-fp is explicitly passed.
> > +
> > +ifeq ($(with-fp),yes)
> > ++cflags += -mhard-float
> > +ASFLAGS += -mhard-float
> > +sysdep-LDFLAGS += -mhard-float
> > +endif
> > diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
> > new file mode 100644
> > index 0000000..70c0d2e
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/405/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/405/fpu
> > +powerpc/powerpc32/405
> > diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
> > new file mode 100644
> > index 0000000..c3e52c5
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/440/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/440/fpu
> > +powerpc/powerpc32/440
> > diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
> > new file mode 100644
> > index 0000000..2829f9c
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/464/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/464/fpu
> > +powerpc/powerpc32/464
> > diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
> > new file mode 100644
> > index 0000000..80f9170
> > --- /dev/null
> > +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/476/Implies
> > @@ -0,0 +1,2 @@
> > +powerpc/powerpc32/476/fpu
> > +powerpc/powerpc32/476
> >
> >
Sorry for the delinquent response. This looks good to me and I think it
should be checked in.
I'd like for someone with a 405, 440, or 464 to test it further. As far
as we know the code only uses instructions available on all of these
platforms.
I'd like to stress that it was authored by Todd Iglehart
<iglehart@us.ibm.com> and contributed by IBM. Luis did the fixup and
authored the implies structure.
Ryan S. Arnold
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2010-12-13 20:26 ` Ryan Arnold
@ 2011-01-18 13:16 ` Ryan Arnold
2011-01-25 21:32 ` Joseph S. Myers
2012-01-18 20:31 ` acrux@cruxppc.org
0 siblings, 2 replies; 19+ messages in thread
From: Ryan Arnold @ 2011-01-18 13:16 UTC (permalink / raw)
To: libc-ports; +Cc: luisgpm, Todd Iglehart, Josh Boyer, rsa
On Mon, Dec 13, 2010 at 2:25 PM, Ryan Arnold <rsa@us.ibm.com> wrote:
> Sorry for the delinquent response. This looks good to me and I think it
> should be checked in.
>
> I'd like for someone with a 405, 440, or 464 to test it further. As far
> as we know the code only uses instructions available on all of these
> platforms.
>
> I'd like to stress that it was authored by Todd Iglehart
> <iglehart@us.ibm.com> and contributed by IBM. Luis did the fixup and
> authored the implies structure.
>
> Ryan S. Arnold
I've checked this patch into glibc-ports under:
commit # a72cc2b29d00207fd8e2ee4612502339a14816b6
Just a general note on configuration; some of these processors have a
floating point unit but I believe all of them default to soft-fp.
GLIBC configure won't recognize --with-cpu=476fp even though the
compiler might recognize -mcpu=476fp.
If you want to configure a hard-fp build just pass --with-cpu=476
--with-fp instead and a new Makefile fragment will make sure that
-mhard-float is added to CFLAGS and ASFLAGS.
Ryan S. Arnold
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2011-01-18 13:16 ` Ryan Arnold
@ 2011-01-25 21:32 ` Joseph S. Myers
2012-01-18 20:31 ` acrux@cruxppc.org
1 sibling, 0 replies; 19+ messages in thread
From: Joseph S. Myers @ 2011-01-25 21:32 UTC (permalink / raw)
To: Ryan Arnold; +Cc: libc-ports, luisgpm, Todd Iglehart, Josh Boyer, rsa
On Wed, 12 Jan 2011, Ryan Arnold wrote:
> I've checked this patch into glibc-ports under:
>
> commit # a72cc2b29d00207fd8e2ee4612502339a14816b6
I've now moved the ChangeLog entry from the target-independent ChangeLog
to ChangeLog.powerpc, where is where changes to powerpc sysdeps files in
ports go.
--
Joseph S. Myers
joseph@codesourcery.com
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2011-01-18 13:16 ` Ryan Arnold
2011-01-25 21:32 ` Joseph S. Myers
@ 2012-01-18 20:31 ` acrux@cruxppc.org
2012-01-19 19:35 ` Carlos O'Donell
1 sibling, 1 reply; 19+ messages in thread
From: acrux@cruxppc.org @ 2012-01-18 20:31 UTC (permalink / raw)
To: libc-ports
just tried to build glibc-2.13 "--with-cpu=440 --with-fp" on a Sam440ep[1]
(PPC440EP SoC [2]) but it remains stuck in this point:
[...]
gcc -m32 -nostdlib -nostartfiles -o
/home/999/new/work/src/build32/sunrpc/rpcgen -mhard-float
-Wl,-dynamic-linker=/lib/ld.so.1 -Wl,-z,combreloc -Wl,-z,relro
-Wl,--hash-style=both /home/999/new/work/src/build32/csu/crt1.o
/home/999/new/work/src/build32/csu/crti.o `gcc -m32 -mhard-float
--print-file-name=crtbegin.o`
/home/999/new/work/src/build32/sunrpc/rpc_main.o
/home/999/new/work/src/build32/sunrpc/rpc_hout.o
/home/999/new/work/src/build32/sunrpc/rpc_cout.o
/home/999/new/work/src/build32/sunrpc/rpc_parse.o
/home/999/new/work/src/build32/sunrpc/rpc_scan.o
/home/999/new/work/src/build32/sunrpc/rpc_util.o
/home/999/new/work/src/build32/sunrpc/rpc_svcout.o
/home/999/new/work/src/build32/sunrpc/rpc_clntout.o
/home/999/new/work/src/build32/sunrpc/rpc_tblout.o
/home/999/new/work/src/build32/sunrpc/rpc_sample.o
-Wl,-rpath-link=/home/999/new/work/src/build32:/home/999/new/work/src/build32/math:/home/999/new/work/src/build32/elf:/home/999/new/work/src/build32/dlfcn:/home/999/new/work/src/build32/nss:/home/999/new/work/src/build32/nis:/home/999/new/work/src/build32/rt:/home/999/new/work/src/build32/resolv:/home/999/new/work/src/build32/crypt:/home/999/new/work/src/build32/nptl
/home/999/new/work/src/build32/libc.so.6
/home/999/new/work/src/build32/libc_nonshared.a -Wl,--as-needed
/home/999/new/work/src/build32/elf/ld.so -Wl,--no-as-needed -lgcc
-Wl,--as-needed -lgcc_s -Wl,--no-as-needed `gcc -m32 -mhard-float
--print-file-name=crtend.o` /home/999/new/work/src/build32/csu/crtn.o
gcc -m32 rpcinfo.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Winline
-Wwrite-strings -fmerge-all-constants -mcpu=440 -mcpu=powerpc -pipe
-mhard-float -mnew-mnemonics -Wstrict-prototypes -mlong-double-128
-I../include -I/home/999/new/work/src/build32/sunrpc
-I/home/999/new/work/src/build32 -I../sysdeps/powerpc/powerpc32/elf
-I../sysdeps/powerpc/elf
-I../ports/sysdeps/unix/sysv/linux/powerpc/powerpc32/440
-I../ports/sysdeps/powerpc/powerpc32/440
-I../ports/sysdeps/powerpc/powerpc32/405
-I../sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu
-I../sysdeps/powerpc/powerpc32/fpu
-I../nptl/sysdeps/unix/sysv/linux/powerpc/powerpc32
-I../ports/sysdeps/unix/sysv/linux/powerpc/powerpc32
-I../sysdeps/unix/sysv/linux/powerpc/powerpc32
-I../nptl/sysdeps/unix/sysv/linux/powerpc
-I../ports/sysdeps/unix/sysv/linux/powerpc
-I../sysdeps/unix/sysv/linux/powerpc -I../sysdeps/ieee754/ldbl-128ibm
-I../sysdeps/ieee754/ldbl-opt -I../nptl/sysdeps/unix/sysv/linux
-I../nptl/sysdeps/pthread -I../sysdeps/pthread
-I../ports/sysdeps/unix/sysv/linux -I../sysdeps/unix/sysv/linux
-I../sysdeps/gnu -I../sysdeps/unix/common -I../sysdeps/unix/mman
-I../sysdeps/unix/inet -I../nptl/sysdeps/unix/sysv
-I../ports/sysdeps/unix/sysv -I../sysdeps/unix/sysv
-I../sysdeps/unix/powerpc -I../nptl/sysdeps/unix -I../ports/sysdeps/unix
-I../sysdeps/unix -I../sysdeps/posix -I../ports/sysdeps/powerpc/powerpc32
-I../sysdeps/powerpc/powerpc32 -I../sysdeps/wordsize-32
-I../sysdeps/powerpc/fpu -I../nptl/sysdeps/powerpc
-I../ports/sysdeps/powerpc -I../sysdeps/powerpc -I../sysdeps/ieee754/dbl-64
-I../sysdeps/ieee754/flt-32 -I../sysdeps/ieee754 -I../sysdeps/generic/elf
-I../sysdeps/generic -I../nptl -I../ports -I.. -I../libio -I. -nostdinc
-isystem /usr/lib/gcc/powerpc-unknown-linux-gnu/4.5.3/include -isystem
/home/999/new/work/pkg/usr/include -D_LIBC_REENTRANT -include
../include/libc-symbols.h -DNOT_IN_libc=1 -D_RPC_THREAD_SAFE_ -o
/home/999/new/work/src/build32/sunrpc/rpcinfo.o -MD -MP -MF
/home/999/new/work/src/build32/sunrpc/rpcinfo.o.dt -MT
/home/999/new/work/src/build32/sunrpc/rpcinfo.o
gcc -m32 -nostdlib -nostartfiles -o
/home/999/new/work/src/build32/sunrpc/rpcinfo -mhard-float
-Wl,-dynamic-linker=/lib/ld.so.1 -Wl,-z,combreloc -Wl,-z,relro
-Wl,--hash-style=both /home/999/new/work/src/build32/csu/crt1.o
/home/999/new/work/src/build32/csu/crti.o `gcc -m32 -mhard-float
--print-file-name=crtbegin.o`
/home/999/new/work/src/build32/sunrpc/rpcinfo.o
-Wl,-rpath-link=/home/999/new/work/src/build32:/home/999/new/work/src/build32/math:/home/999/new/work/src/build32/elf:/home/999/new/work/src/build32/dlfcn:/home/999/new/work/src/build32/nss:/home/999/new/work/src/build32/nis:/home/999/new/work/src/build32/rt:/home/999/new/work/src/build32/resolv:/home/999/new/work/src/build32/crypt:/home/999/new/work/src/build32/nptl
/home/999/new/work/src/build32/libc.so.6
/home/999/new/work/src/build32/libc_nonshared.a -Wl,--as-needed
/home/999/new/work/src/build32/elf/ld.so -Wl,--no-as-needed -lgcc
-Wl,--as-needed -lgcc_s -Wl,--no-as-needed `gcc -m32 -mhard-float
--print-file-name=crtend.o` /home/999/new/work/src/build32/csu/crtn.o
CPP='gcc -m32 -E -x c-header' /home/999/new/work/src/build32/elf/ld.so.1
--library-path
/home/999/new/work/src/build32:/home/999/new/work/src/build32/math:/home/999/new/work/src/build32/elf:/home/999/new/work/src/build32/dlfcn:/home/999/new/work/src/build32/nss:/home/999/new/work/src/build32/nis:/home/999/new/work/src/build32/rt:/home/999/new/work/src/build32/resolv:/home/999/new/work/src/build32/crypt:/home/999/new/work/src/build32/nptl
/home/999/new/work/src/build32/sunrpc/rpcgen -Y ../scripts -c
rpcsvc/bootparam_prot.x -o
/home/999/new/work/src/build32/sunrpc/xbootparam_prot.T
Here my config.log: http://cruxppc.org/~acrux/config.log
Instead i successfully built glibc-2.13 without "--with-cpu=440 --with-fp" .
I'm using an updated CRUX PPC 2.7 (32bit): gcc-4.5.3, binutils-2.21.1,
glibc-2.12.2
cheers,
--nico
[1] http://www.acube-systems.biz/index.php?page=hardware&pid=2
[2]
http://myapm.apm.com/MyAMCC/jsp/public/productDetail/product_detail.jsp?productID=PPC440EP
Ryan S. Arnold wrote:
>
> On Mon, Dec 13, 2010 at 2:25 PM, Ryan Arnold <rsa@us.ibm.com> wrote:
>> Sorry for the delinquent response. This looks good to me and I think it
>> should be checked in.
>>
>> I'd like for someone with a 405, 440, or 464 to test it further. As far
>> as we know the code only uses instructions available on all of these
>> platforms.
>>
>> I'd like to stress that it was authored by Todd Iglehart
>> <iglehart@us.ibm.com> and contributed by IBM. Luis did the fixup and
>> authored the implies structure.
>>
>> Ryan S. Arnold
>
> I've checked this patch into glibc-ports under:
>
> commit # a72cc2b29d00207fd8e2ee4612502339a14816b6
>
> Just a general note on configuration; some of these processors have a
> floating point unit but I believe all of them default to soft-fp.
>
> GLIBC configure won't recognize --with-cpu=476fp even though the
> compiler might recognize -mcpu=476fp.
>
> If you want to configure a hard-fp build just pass --with-cpu=476
> --with-fp instead and a new Makefile fragment will make sure that
> -mhard-float is added to CFLAGS and ASFLAGS.
>
> Ryan S. Arnold
>
>
--
View this message in context: http://old.nabble.com/-PATCH--powerpc%3A-405-440-464-476-support-and-optimizations-tp29607194p33163939.html
Sent from the Sourceware - libc-ports mailing list archive at Nabble.com.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2012-01-18 20:31 ` acrux@cruxppc.org
@ 2012-01-19 19:35 ` Carlos O'Donell
2012-01-20 14:24 ` acrux
0 siblings, 1 reply; 19+ messages in thread
From: Carlos O'Donell @ 2012-01-19 19:35 UTC (permalink / raw)
To: acrux@cruxppc.org; +Cc: libc-ports
On Wed, Jan 18, 2012 at 3:30 PM, acrux@cruxppc.org <acrux@linuxmail.org> wrote:
>
> just tried to build glibc-2.13 "--with-cpu=440 --with-fp" on a Sam440ep[1]
> (PPC440EP SoC [2]) but it remains stuck in this point:
> CPP='gcc -m32 -E -x c-header' /home/999/new/work/src/build32/elf/ld.so.1
> --library-path
> /home/999/new/work/src/build32:/home/999/new/work/src/build32/math:/home/999/new/work/src/build32/elf:/home/999/new/work/src/build32/dlfcn:/home/999/new/work/src/build32/nss:/home/999/new/work/src/build32/nis:/home/999/new/work/src/build32/rt:/home/999/new/work/src/build32/resolv:/home/999/new/work/src/build32/crypt:/home/999/new/work/src/build32/nptl
> /home/999/new/work/src/build32/sunrpc/rpcgen -Y ../scripts -c
> rpcsvc/bootparam_prot.x -o
> /home/999/new/work/src/build32/sunrpc/xbootparam_prot.T
This is the first use of the newly build dynamic loader.
A failure here means that the dynamic loader has not been correctly compiled.
You should debug this to figure out what is going wrong in the loader.
> Here my config.log: http://cruxppc.org/~acrux/config.log
> Instead i successfully built glibc-2.13 without "--with-cpu=440 --with-fp" .
> I'm using an updated CRUX PPC 2.7 (32bit): gcc-4.5.3, binutils-2.21.1,
> glibc-2.12.2
Have you tested your compiler? What were the test results?
Did you test binutils? What were the test results?
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2012-01-19 19:35 ` Carlos O'Donell
@ 2012-01-20 14:24 ` acrux
2012-01-20 15:52 ` Ryan S. Arnold
0 siblings, 1 reply; 19+ messages in thread
From: acrux @ 2012-01-20 14:24 UTC (permalink / raw)
To: Carlos O'Donell; +Cc: libc-ports
On Thu, 19 Jan 2012 14:34:50 -0500
"Carlos O'Donell" <carlos@systemhalted.org> wrote:
> On Wed, Jan 18, 2012 at 3:30 PM, acrux@cruxppc.org
> <acrux@linuxmail.org> wrote:
> >
> > just tried to build glibc-2.13 "--with-cpu=440 --with-fp" on a
> > Sam440ep[1] (PPC440EP SoC [2]) but it remains stuck in this point:
> > CPP='gcc -m32 -E -x c-header'
> > /home/999/new/work/src/build32/elf/ld.so.1
> > --library-path
> > /home/999/new/work/src/build32:/home/999/new/work/src/build32/math:/home/999/new/work/src/build32/elf:/home/999/new/work/src/build32/dlfcn:/home/999/new/work/src/build32/nss:/home/999/new/work/src/build32/nis:/home/999/new/work/src/build32/rt:/home/999/new/work/src/build32/resolv:/home/999/new/work/src/build32/crypt:/home/999/new/work/src/build32/nptl
> > /home/999/new/work/src/build32/sunrpc/rpcgen -Y ../scripts -c
> > rpcsvc/bootparam_prot.x -o
> > /home/999/new/work/src/build32/sunrpc/xbootparam_prot.T
>
> This is the first use of the newly build dynamic loader.
>
> A failure here means that the dynamic loader has not been correctly
> compiled.
>
> You should debug this to figure out what is going wrong in the loader.
>
i know, but i've no resource and i guess not enough skill to debug and
fix it.
As i received a borda with a 440EP SoC, just only for fun, i
tested a build "--with-cpu=440 --with-fp".
I'd like to know if somobody really tested these features.
> > Here my config.log: http://cruxppc.org/~acrux/config.log
> > Instead i successfully built glibc-2.13 without "--with-cpu=440
> > --with-fp" . I'm using an updated CRUX PPC 2.7 (32bit): gcc-4.5.3,
> > binutils-2.21.1, glibc-2.12.2
>
> Have you tested your compiler? What were the test results?
>
> Did you test binutils? What were the test results?
>
they finished their testsuites with only the well know failures for
their own releases. Thus i can say they are good.
best,
--nico
--
GNU/Linux on Power Architecture
CRUX PPC - http://cruxppc.org/
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2012-01-20 14:24 ` acrux
@ 2012-01-20 15:52 ` Ryan S. Arnold
2012-01-20 18:03 ` Carlos O'Donell
2012-01-23 0:41 ` acrux
0 siblings, 2 replies; 19+ messages in thread
From: Ryan S. Arnold @ 2012-01-20 15:52 UTC (permalink / raw)
To: acrux; +Cc: Carlos O'Donell, libc-ports
On Fri, Jan 20, 2012 at 8:24 AM, acrux <acrux_it@libero.it> wrote:
> On Thu, 19 Jan 2012 14:34:50 -0500
> "Carlos O'Donell" <carlos@systemhalted.org> wrote:
>
>> On Wed, Jan 18, 2012 at 3:30 PM, acrux@cruxppc.org
>> <acrux@linuxmail.org> wrote:
>> >
>> > just tried to build glibc-2.13 "--with-cpu=440 --with-fp" on a
>> > Sam440ep[1] (PPC440EP SoC [2]) but it remains stuck in this point:
>> > CPP='gcc -m32 -E -x c-header'
>> > /home/999/new/work/src/build32/elf/ld.so.1
>> > --library-path
>> > /home/999/new/work/src/build32:/home/999/new/work/src/build32/math:/home/999/new/work/src/build32/elf:/home/999/new/work/src/build32/dlfcn:/home/999/new/work/src/build32/nss:/home/999/new/work/src/build32/nis:/home/999/new/work/src/build32/rt:/home/999/new/work/src/build32/resolv:/home/999/new/work/src/build32/crypt:/home/999/new/work/src/build32/nptl
>> > /home/999/new/work/src/build32/sunrpc/rpcgen -Y ../scripts -c
>> > rpcsvc/bootparam_prot.x -o
>> > /home/999/new/work/src/build32/sunrpc/xbootparam_prot.T
>>
>> This is the first use of the newly build dynamic loader.
>>
>> A failure here means that the dynamic loader has not been correctly
>> compiled.
>>
>> You should debug this to figure out what is going wrong in the loader.
>>
>
> i know, but i've no resource and i guess not enough skill to debug and
> fix it.
>
> As i received a borda with a 440EP SoC, just only for fun, i
> tested a build "--with-cpu=440 --with-fp".
> I'd like to know if somobody really tested these features.
>
>> > Here my config.log: http://cruxppc.org/~acrux/config.log
>> > Instead i successfully built glibc-2.13 without "--with-cpu=440
>> > --with-fp" . I'm using an updated CRUX PPC 2.7 (32bit): gcc-4.5.3,
>> > binutils-2.21.1, glibc-2.12.2
>>
>> Have you tested your compiler? What were the test results?
>>
>> Did you test binutils? What were the test results?
>>
>
> they finished their testsuites with only the well know failures for
> their own releases. Thus i can say they are good.
Integration can often expose problems, but I suspect something else is
going on here...
Does the machine you're building this toolchain on understand the
440fp instruction set? If not then the loader is likely encountering
a sigill.
The solution to this situation is to enable cross compiling prior to
running configure. This will prevent the build from attempting to run
any of the code.
echo "cross-compiling=yes" >> configparms
If that's not the problem I would suggest trying to build using 440 without fp.
If that works then I suspect that the problem is related to the string
routine optimizations that one of my guys put in for the 476
processor. It was requested that we provide it to the entire 4xx
series since the instructions used (allegedly) weren't unique to the
476.
It'd be interesting to run a debugger against the loader at this point
and identify whether you're encountering a sigill or a sigsegv.
Ryan S. Arnold
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2012-01-20 15:52 ` Ryan S. Arnold
@ 2012-01-20 18:03 ` Carlos O'Donell
2012-01-23 0:41 ` acrux
1 sibling, 0 replies; 19+ messages in thread
From: Carlos O'Donell @ 2012-01-20 18:03 UTC (permalink / raw)
To: Ryan S. Arnold; +Cc: acrux, libc-ports
On Fri, Jan 20, 2012 at 10:52 AM, Ryan S. Arnold <ryan.arnold@gmail.com> wrote:
> Does the machine you're building this toolchain on understand the
> 440fp instruction set? If not then the loader is likely encountering
> a sigill.
There are some known problems in this area which Mentor Graphics ESD
has been fixing and submitting upstream.
They mainly have to do with the graphics and string instructions that
aren't uniformly supported and gcc doesn't know when not to use some
of these instructions.
If your goal is to succeed at compiling things for your target then I
would start here:
http://www.mentor.com/embedded-software/sourcery-tools/sourcery-codebench/platforms/power-gnulinux
or just command-line tools for free, click under "Power Architecture
Processors"->"Download the GNU/Linux release"
http://www.mentor.com/embedded-software/sourcery-tools/sourcery-codebench/editions/lite-edition/
They are all based on the open-source tools but with enhancements to
fix problems like the one you ran into.
I'm part of the team that produces these toolchains so I'm biased :-)
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2012-01-20 15:52 ` Ryan S. Arnold
2012-01-20 18:03 ` Carlos O'Donell
@ 2012-01-23 0:41 ` acrux
2012-01-23 15:48 ` Ryan S. Arnold
1 sibling, 1 reply; 19+ messages in thread
From: acrux @ 2012-01-23 0:41 UTC (permalink / raw)
To: Ryan S. Arnold; +Cc: Carlos O'Donell, libc-ports
On Fri, 20 Jan 2012 09:52:22 -0600
"Ryan S. Arnold" <ryan.arnold@gmail.com> wrote:
_omissis__
>
> Integration can often expose problems, but I suspect something else is
> going on here...
>
> Does the machine you're building this toolchain on understand the
> 440fp instruction set? If not then the loader is likely encountering
> a sigill.
>
> The solution to this situation is to enable cross compiling prior to
> running configure. This will prevent the build from attempting to run
> any of the code.
>
> echo "cross-compiling=yes" >> configparms
>
> If that's not the problem I would suggest trying to build using 440 without fp.
>
hi,
also without fp it makes no difference and i still have the same problem.
Just to be sure about the compiler[1]... i simply removed those seven files (the optimized routines) and i successfully built it "--with-cpu=440 --with-fp". Indeed i've always been able to build everything, even, with very aggressive cflags[2] without any kind of issues.
> If that works then I suspect that the problem is related to the string
> routine optimizations that one of my guys put in for the 476
> processor. It was requested that we provide it to the entire 4xx
> series since the instructions used (allegedly) weren't unique to the
> 476.
>
> It'd be interesting to run a debugger against the loader at this point
> and identify whether you're encountering a sigill or a sigsegv.
>
btw, i guess a SIGSEGV because it simply stuck there and cpu goes idle.
If you need i can provide to you an ssh access on this little board [3].
best,
--nico
[1]
binutils-2.21.1, gcc-4.5.3, glibc-2.12.2
gmp-5.0.2-2, mpfr-3.1.0-p3, mpc-0.9
ppl-0.11.2, cloog-ppl-0.15.11
[2]
"-O3 -mcpu=440fp -mmulhw -mdlmzb -pipe -fsigned-char -mpowerpc-gfxopt -fpeel-loops -ftracer -fgraphite-identity -floop-parallelize-all -ftree-loop-linear -ftree-loop-distribution -funroll-loops -floop-interchange -floop-strip-mine -floop-block"
[3]
root@sam4x0:~# uname -a
Linux sam4x0 3.1.10 #1 PREEMPT Fri Jan 20 21:30:46 CET 2012 ppc 440EP Rev. C GNU/Linux
root@sam4x0:~# lscpu
Architecture: ppc
Byte Order: Big Endian
CPU(s): 1
On-line CPU(s) list: 0
Model: acube,sam440ep
BogoMIPS: 1333.33
L1d cache: 32K
L1i cache: 32K
root@sam4x0:~# cat /proc/cpuinfo
processor : 0
cpu : 440EP Rev. C
clock : 666.666660MHz
revision : 24.212 (pvr 4222 18d4)
bogomips : 1333.33
timebase : 666666660
platform : Sam440ep
model : acube,sam440ep
Memory : 1023 MB
root@sam4x0:~# lspci
00:00.0 Bridge: IBM Device 027f
00:0a.0 PCI bridge: Pericom Semiconductor PCI to PCI Bridge (rev 02)
00:0c.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI RV280 [Radeon 9200 PRO] (rev 01)
00:0c.1 Display controller: Advanced Micro Devices [AMD] nee ATI RV280 [Radeon 9200 PRO] (Secondary) (rev 01)
00:0e.0 Mass storage controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
01:04.0 Multimedia audio controller: Cirrus Logic Crystal CS4281 PCI Audio (rev 01)
01:05.0 USB controller: NEC Corporation USB (rev 43)
01:05.1 USB controller: NEC Corporation USB (rev 43)
01:05.2 USB controller: NEC Corporation USB 2.0 (rev 04)
--
acrux <acrux@cruxppc.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2012-01-23 0:41 ` acrux
@ 2012-01-23 15:48 ` Ryan S. Arnold
2012-01-24 16:47 ` acrux
0 siblings, 1 reply; 19+ messages in thread
From: Ryan S. Arnold @ 2012-01-23 15:48 UTC (permalink / raw)
To: acrux; +Cc: Carlos O'Donell, libc-ports
On Sun, Jan 22, 2012 at 6:42 PM, acrux <acrux_it@libero.it> wrote:
>> If that works then I suspect that the problem is related to the string
>> routine optimizations that one of my guys put in for the 476
>> processor. It was requested that we provide it to the entire 4xx
>> series since the instructions used (allegedly) weren't unique to the
>> 476.
>>
>> It'd be interesting to run a debugger against the loader at this point
>> and identify whether you're encountering a sigill or a sigsegv.
>>
>
> btw, i guess a SIGSEGV because it simply stuck there and cpu goes idle.
> If you need i can provide to you an ssh access on this little board [3].
Hi Nico, I think you should debug the loader and get a backtrace.
You'll use the instructions here:
http://sourceware.org/glibc/wiki/Debugging/Loader_Debugging#Debugging_With_An_Alternate_Loader
I've made it easy for you. Here's two scripts with your build
directory already embedded. What you need to do is invoke GDB using
the new loader if you can (so that you don't get library mismatching)
and then tell GDB (with the .gdb script) to debug the loader.
Here's the .gdb script you'll use:
rpcgen.gdb:
------------------------------------
set environment gcc -m32 C -E -x c-header
break _dl_main_dispatch
run --library-path
/home/999/new/work/src/build32/:\
/home/999/new/work/src/build32/nptl:\
/home/999/new/work/src/build32/math:\
/home/999/new/work/src/build32/elf:\
/home/999/new/work/src/build32/dlfcn:\
/home/999/new/work/src/build32/nss:\
/home/999/new/work/src/build32/nis:\
/home/999/new/work/src/build32/rt:\
/home/999/new/work/src/build32/resolv:\
/home/999/new/work/src/build32/crypt:\
/home/999/new/work/src/build32/nptl:\
/home/999/new/work/src/build32/nptl_db \
/home/999/new/work/src/build32/sunrpc/rpcgen -Y
/home/999/new/work/src/scripts -c
/home/999/new/work/src/build32/sunrpc/rpcsvc/bootparam_prot.x -o
/home/999/new/work/src/build32/sunrpc/xbootparam_prot.T
Here's the shell script which will invoke GDB:
debug_rpcgen.sh:
-------------------------------------
#!/bin/bash
ulimit -c unlimited
GLIBC="/home/999/new/work/src/build32/"
CPP='gcc -m32 -E -x c-header' \
${GLIBC}/elf/ld.so.1 --library-path \
${GLIBC}:\
${GLIBC}/math:\
${GLIBC}/elf:\
${GLIBC}/dlfcn:\
${GLIBC}/nss:\
${GLIBC}/nis:\
${GLIBC}/rt:\
${GLIBC}/resolv:\
${GLIBC}/crypt:\
${GLIBC}/nptl:\
${GLIBC}/nptl_db: \
/usr/bin/gdb -x rpcgen.gdb -d home/999/new/work/src/build32/elf/ld.so.1
So try running debug_rpcgen.sh first.
If it works the loader should be breaking in the loader on
_dl_main_dispatch. You can simply (gdb) continue at this point and
the loader should crash whereby gdb will trap and show you where it
crashed and why (segfault or sigill).
If debug_rpcgen.sh crashes immediately without GDB coming up it means
that the loader itself is crashing in the string routines (which is
the most likely scenario). If that is the case you should try running
the debugger with the system loader instead. You may get some library
mismatch warnings but do the following:
CPP='gcc -m32 -E -x -c-header' /usr/bin/gdb -x rpcgen.gdb -d
home/999/new/work/src/build32/elf/ld.so.1
If it successfully traps in _dl_main_dispatch do what I mentioned
above to see where the loader is crashing.
Let me know where/why it is crashing.
Ryan S. Arnold
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2012-01-23 15:48 ` Ryan S. Arnold
@ 2012-01-24 16:47 ` acrux
2012-01-24 17:20 ` Ryan S. Arnold
0 siblings, 1 reply; 19+ messages in thread
From: acrux @ 2012-01-24 16:47 UTC (permalink / raw)
To: Ryan S. Arnold; +Cc: libc-ports
On Mon, 23 Jan 2012 09:48:19 -0600
"Ryan S. Arnold" <ryan.arnold@gmail.com> wrote:
> On Sun, Jan 22, 2012 at 6:42 PM, acrux <acrux_it@libero.it> wrote:
> >> If that works then I suspect that the problem is related to the string
> >> routine optimizations that one of my guys put in for the 476
> >> processor. It was requested that we provide it to the entire 4xx
> >> series since the instructions used (allegedly) weren't unique to the
> >> 476.
> >>
> >> It'd be interesting to run a debugger against the loader at this point
> >> and identify whether you're encountering a sigill or a sigsegv.
> >>
> >
> > btw, i guess a SIGSEGV because it simply stuck there and cpu goes idle.
> > If you need i can provide to you an ssh access on this little board [3].
>
> Hi Nico, I think you should debug the loader and get a backtrace.
> You'll use the instructions here:
>
> http://sourceware.org/glibc/wiki/Debugging/Loader_Debugging#Debugging_With_An_Alternate_Loader
>
> I've made it easy for you. Here's two scripts with your build
> directory already embedded. What you need to do is invoke GDB using
> the new loader if you can (so that you don't get library mismatching)
> and then tell GDB (with the .gdb script) to debug the loader.
>
> Here's the .gdb script you'll use:
>
> rpcgen.gdb:
> ------------------------------------
> set environment gcc -m32 C -E -x c-header
> break _dl_main_dispatch
> run --library-path
> /home/999/new/work/src/build32/:\
> /home/999/new/work/src/build32/nptl:\
> /home/999/new/work/src/build32/math:\
> /home/999/new/work/src/build32/elf:\
> /home/999/new/work/src/build32/dlfcn:\
> /home/999/new/work/src/build32/nss:\
> /home/999/new/work/src/build32/nis:\
> /home/999/new/work/src/build32/rt:\
> /home/999/new/work/src/build32/resolv:\
> /home/999/new/work/src/build32/crypt:\
> /home/999/new/work/src/build32/nptl:\
> /home/999/new/work/src/build32/nptl_db \
> /home/999/new/work/src/build32/sunrpc/rpcgen -Y
> /home/999/new/work/src/scripts -c
> /home/999/new/work/src/build32/sunrpc/rpcsvc/bootparam_prot.x -o
> /home/999/new/work/src/build32/sunrpc/xbootparam_prot.T
>
>
> Here's the shell script which will invoke GDB:
>
> debug_rpcgen.sh:
> -------------------------------------
> #!/bin/bash
>
> ulimit -c unlimited
> GLIBC="/home/999/new/work/src/build32/"
>
> CPP='gcc -m32 -E -x c-header' \
> ${GLIBC}/elf/ld.so.1 --library-path \
> ${GLIBC}:\
> ${GLIBC}/math:\
> ${GLIBC}/elf:\
> ${GLIBC}/dlfcn:\
> ${GLIBC}/nss:\
> ${GLIBC}/nis:\
> ${GLIBC}/rt:\
> ${GLIBC}/resolv:\
> ${GLIBC}/crypt:\
> ${GLIBC}/nptl:\
> ${GLIBC}/nptl_db: \
> /usr/bin/gdb -x rpcgen.gdb -d home/999/new/work/src/build32/elf/ld.so.1
>
> So try running debug_rpcgen.sh first.
>
hi Ryan,
thanks. I just did some minor fix to your scripts to have the correct path for my built
# cat > rpcgen.gdb << "EOF"
set environment gcc -m32 C -E -x c-header
break _dl_main_dispatch
run --library-path \
/home/999/new/work/src/build32/:\
/home/999/new/work/src/build32/nptl:\
/home/999/new/work/src/build32/math:\
/home/999/new/work/src/build32/elf:\
/home/999/new/work/src/build32/dlfcn:\
/home/999/new/work/src/build32/nss:\
/home/999/new/work/src/build32/nis:\
/home/999/new/work/src/build32/rt:\
/home/999/new/work/src/build32/resolv:\
/home/999/new/work/src/build32/crypt:\
/home/999/new/work/src/build32/nptl:\
/home/999/new/work/src/build32/nptl_db \
/home/999/new/work/src/build32/sunrpc/rpcgen -Y \
/home/999/new/work/src/glibc-2.13/scripts -c \
/home/999/new/work/src/glibc-2.13/sunrpc/rpcsvc/bootparam_prot.x -o \
/home/999/new/work/src/build32/sunrpc/xbootparam_prot.T
EOF
# cat > debug_rpcgen.sh << "EOF"
#!/bin/bash
ulimit -c unlimited
GLIBC="/home/999/new/work/src/build32/"
CPP='gcc -m32 -E -x c-header' \
${GLIBC}/elf/ld.so.1 --library-path \
${GLIBC}:\
${GLIBC}/math:\
${GLIBC}/elf:\
${GLIBC}/dlfcn:\
${GLIBC}/nss:\
${GLIBC}/nis:\
${GLIBC}/rt:\
${GLIBC}/resolv:\
${GLIBC}/crypt:\
${GLIBC}/nptl:\
${GLIBC}/nptl_db \
/usr/bin/gdb -x rpcgen.gdb /home/999/new/work/src/build32/elf/ld.so.1
EOF
> If it works the loader should be breaking in the loader on
> _dl_main_dispatch. You can simply (gdb) continue at this point and
> the loader should crash whereby gdb will trap and show you where it
> crashed and why (segfault or sigill).
>
> If debug_rpcgen.sh crashes immediately without GDB coming up it means
> that the loader itself is crashing in the string routines (which is
it's right, it simply stuck.
> the most likely scenario). If that is the case you should try running
> the debugger with the system loader instead. You may get some library
> mismatch warnings but do the following:
>
> CPP='gcc -m32 -E -x -c-header' /usr/bin/gdb -x rpcgen.gdb -d
> home/999/new/work/src/build32/elf/ld.so.1
>
# CPP='gcc -m32 -E -x -c-header' /usr/bin/gdb -x rpcgen.gdb /home/999/new/work/src/build32/elf/ld.so.1
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "powerpc-unknown-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/999/new/work/src/build32/elf/ld.so.1...(no debugging symbols found)...done.
Breakpoint 1 at 0x163c0
Breakpoint 1, 0x204b73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
(gdb) info proc mapping
process 2575
cmdline = '/home/999/new/work/src/build32/elf/ld.so.1'
cwd = '/home/999/ryan'
exe = '/home/999/new/work/src/build32/elf/ld.so'
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x100000 0x102000 0x2000 0 [vdso]
0xfe82000 0xffd8000 0x156000 0 /home/999/new/work/src/build32/libc.so
0xffd8000 0xffe8000 0x10000 0x156000 /home/999/new/work/src/build32/libc.so
0xffe8000 0xffea000 0x2000 0x156000 /home/999/new/work/src/build32/libc.so
0xffea000 0xffed000 0x3000 0x158000 /home/999/new/work/src/build32/libc.so
0xffed000 0xfff0000 0x3000 0
0x10000000 0x10014000 0x14000 0 /home/999/new/work/src/build32/sunrpc/rpcgen
0x10023000 0x10024000 0x1000 0x13000 /home/999/new/work/src/build32/sunrpc/rpcgen
0x10024000 0x10025000 0x1000 0x14000 /home/999/new/work/src/build32/sunrpc/rpcgen
0x204a1000 0x204bf000 0x1e000 0 /home/999/new/work/src/build32/elf/ld.so
0x204ce000 0x204cf000 0x1000 0x1d000 /home/999/new/work/src/build32/elf/ld.so
0x204cf000 0x204d1000 0x2000 0x1e000 /home/999/new/work/src/build32/elf/ld.so
0x48000000 0x48002000 0x2000 0
0xbffdf000 0xc0000000 0x21000 0 [stack]
(gdb) bt
#0 0x204b73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
#1 0x00000000 in ?? ()
(gdb) continue
Continuing.
well... now it stuck... and i must do a ctrl-c
^C
Program received signal SIGINT, Interrupt.
0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
(gdb) bt
#0 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
#1 0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
#2 0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
#3 0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
#4 0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
#5 0x00000000 in ?? ()
(gdb) quit
A debugging session is active.
Inferior 1 [process 2575] will be killed.
Quit anyway? (y or n) y
> If it successfully traps in _dl_main_dispatch do what I mentioned
> above to see where the loader is crashing.
>
> Let me know where/why it is crashing.
>
that's all folks!
best,
--nico
--
acrux <acrux@cruxppc.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2012-01-24 16:47 ` acrux
@ 2012-01-24 17:20 ` Ryan S. Arnold
2012-01-24 17:41 ` acrux
0 siblings, 1 reply; 19+ messages in thread
From: Ryan S. Arnold @ 2012-01-24 17:20 UTC (permalink / raw)
To: acrux; +Cc: libc-ports
On Tue, Jan 24, 2012 at 10:48 AM, acrux <acrux_it@libero.it> wrote:
> Program received signal SIGINT, Interrupt.
> 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> (gdb) bt
> #0 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> #1 0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
> #2 0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
> #3 0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
> #4 0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
> #5 0x00000000 in ?? ()
Wow, that is not what I expected at all...
I can't imagine that there are other threads at this point but ....
(gdb) info threads
And if there are, please dump the thread backtrace.
Ryan
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2012-01-24 17:20 ` Ryan S. Arnold
@ 2012-01-24 17:41 ` acrux
2012-01-24 17:59 ` Ryan S. Arnold
0 siblings, 1 reply; 19+ messages in thread
From: acrux @ 2012-01-24 17:41 UTC (permalink / raw)
To: Ryan S. Arnold; +Cc: libc-ports
On Tue, 24 Jan 2012 11:19:54 -0600
"Ryan S. Arnold" <ryan.arnold@gmail.com> wrote:
> On Tue, Jan 24, 2012 at 10:48 AM, acrux <acrux_it@libero.it> wrote:
> > Program received signal SIGINT, Interrupt.
> > 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> > (gdb) bt
> > #0 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> > #1 0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
> > #2 0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
> > #3 0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
> > #4 0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
> > #5 0x00000000 in ?? ()
>
> Wow, that is not what I expected at all...
>
> I can't imagine that there are other threads at this point but ....
> (gdb) info threads
>
> And if there are, please dump the thread backtrace.
>
root@sam4x0:/home/999/ryan# CPP='gcc -m32 -E -x -c-header' /usr/bin/gdb -x rpcgen.gdb /home/999/new/work/src/build32/elf/ld.so.1
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "powerpc-unknown-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/999/new/work/src/build32/elf/ld.so.1...(no debugging symbols found)...done.
Breakpoint 1 at 0x163c0
Breakpoint 1, 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
(gdb) info threads
Id Target Id Frame
* 1 process 2609 "ld.so.1" 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
(gdb) bt
#0 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
#1 0x00000000 in ?? ()
(gdb) continue
Continuing.
^C
Program received signal SIGINT, Interrupt.
0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
(gdb) bt
#0 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
#1 0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
#2 0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
#3 0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
#4 0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
#5 0x00000000 in ?? ()
(gdb) info threads
Id Target Id Frame
* 1 process 2609 "ld.so.1" 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
(gdb) thread apply all bt full
Thread 1 (process 2609):
#0 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
No symbol table info available.
#1 0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
No symbol table info available.
#2 0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
No symbol table info available.
#3 0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
No symbol table info available.
#4 0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
No symbol table info available.
#5 0x00000000 in ?? ()
No symbol table info available.
(gdb) q
A debugging session is active.
Inferior 1 [process 2609] will be killed.
Quit anyway? (y or n) y
--nico
--
acrux <acrux_it@libero.it>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2012-01-24 17:41 ` acrux
@ 2012-01-24 17:59 ` Ryan S. Arnold
2012-02-18 2:06 ` acrux
0 siblings, 1 reply; 19+ messages in thread
From: Ryan S. Arnold @ 2012-01-24 17:59 UTC (permalink / raw)
To: acrux; +Cc: libc-ports
On Tue, Jan 24, 2012 at 11:43 AM, acrux <acrux_it@libero.it> wrote:
> On Tue, 24 Jan 2012 11:19:54 -0600
> "Ryan S. Arnold" <ryan.arnold@gmail.com> wrote:
>
>> On Tue, Jan 24, 2012 at 10:48 AM, acrux <acrux_it@libero.it> wrote:
>> > Program received signal SIGINT, Interrupt.
>> > 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
>> > (gdb) bt
>> > #0 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
>> > #1 0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
>> > #2 0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
>> > #3 0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
>> > #4 0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
>> > #5 0x00000000 in ?? ()
>>
>> Wow, that is not what I expected at all...
>>
>> I can't imagine that there are other threads at this point but ....
>> (gdb) info threads
>>
>> And if there are, please dump the thread backtrace.
>>
>
>
> root@sam4x0:/home/999/ryan# CPP='gcc -m32 -E -x -c-header' /usr/bin/gdb -x rpcgen.gdb /home/999/new/work/src/build32/elf/ld.so.1
> GNU gdb (GDB) 7.3.1
> Copyright (C) 2011 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "powerpc-unknown-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /home/999/new/work/src/build32/elf/ld.so.1...(no debugging symbols found)...done.
> Breakpoint 1 at 0x163c0
>
> Breakpoint 1, 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
> (gdb) info threads
> Id Target Id Frame
> * 1 process 2609 "ld.so.1" 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
> (gdb) bt
> #0 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
> #1 0x00000000 in ?? ()
> (gdb) continue
> Continuing.
> ^C
> Program received signal SIGINT, Interrupt.
> 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> (gdb) bt
> #0 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> #1 0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
> #2 0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
> #3 0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
> #4 0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
> #5 0x00000000 in ?? ()
> (gdb) info threads
> Id Target Id Frame
> * 1 process 2609 "ld.so.1" 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> (gdb) thread apply all bt full
I wonder if one of the string routine calls in the loader overwrote
its bounds and ended up writing over that lock, hence why the wait is
hanging.
Can you do an (gdb) info frame and try to figure out what the value of
the futex is when it's blocking?
Ryan
Ryan
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] powerpc: 405/440/464/476 support and optimizations
2012-01-24 17:59 ` Ryan S. Arnold
@ 2012-02-18 2:06 ` acrux
0 siblings, 0 replies; 19+ messages in thread
From: acrux @ 2012-02-18 2:06 UTC (permalink / raw)
To: Ryan S. Arnold; +Cc: libc-ports
On Tue, 24 Jan 2012 11:59:31 -0600
"Ryan S. Arnold" <ryan.arnold@gmail.com> wrote:
> On Tue, Jan 24, 2012 at 11:43 AM, acrux <acrux_it@libero.it> wrote:
> > On Tue, 24 Jan 2012 11:19:54 -0600
> > "Ryan S. Arnold" <ryan.arnold@gmail.com> wrote:
> >
> >> On Tue, Jan 24, 2012 at 10:48 AM, acrux <acrux_it@libero.it> wrote:
> >> > Program received signal SIGINT, Interrupt.
> >> > 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> >> > (gdb) bt
> >> > #0 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> >> > #1 0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
> >> > #2 0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
> >> > #3 0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
> >> > #4 0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
> >> > #5 0x00000000 in ?? ()
> >>
> >> Wow, that is not what I expected at all...
> >>
> >> I can't imagine that there are other threads at this point but ....
> >> (gdb) info threads
> >>
> >> And if there are, please dump the thread backtrace.
> >>
> >
> >
> > root@sam4x0:/home/999/ryan# CPP='gcc -m32 -E -x -c-header' /usr/bin/gdb -x rpcgen.gdb /home/999/new/work/src/build32/elf/ld.so.1
> > GNU gdb (GDB) 7.3.1
> > Copyright (C) 2011 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> > and "show warranty" for details.
> > This GDB was configured as "powerpc-unknown-linux-gnu".
> > For bug reporting instructions, please see:
> > <http://www.gnu.org/software/gdb/bugs/>...
> > Reading symbols from /home/999/new/work/src/build32/elf/ld.so.1...(no debugging symbols found)...done.
> > Breakpoint 1 at 0x163c0
> >
> > Breakpoint 1, 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
> > (gdb) info threads
> > Id Target Id Frame
> > * 1 process 2609 "ld.so.1" 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
> > (gdb) bt
> > #0 0x206f73c0 in _dl_main_dispatch () from /home/999/new/work/src/build32/elf/ld.so.1
> > #1 0x00000000 in ?? ()
> > (gdb) continue
> > Continuing.
> > ^C
> > Program received signal SIGINT, Interrupt.
> > 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> > (gdb) bt
> > #0 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> > #1 0x0febb2e4 in __new_exitfn () from /home/999/new/work/src/build32/libc.so.6
> > #2 0x0febb338 in __internal_atexit () from /home/999/new/work/src/build32/libc.so.6
> > #3 0x0fea174c in generic_start_main.clone.0 () from /home/999/new/work/src/build32/libc.so.6
> > #4 0x0fea1970 in __libc_start_main () from /home/999/new/work/src/build32/libc.so.6
> > #5 0x00000000 in ?? ()
> > (gdb) info threads
> > Id Target Id Frame
> > * 1 process 2609 "ld.so.1" 0x0ff70c80 in __lll_lock_wait_private () from /home/999/new/work/src/build32/libc.so.6
> > (gdb) thread apply all bt full
>
> I wonder if one of the string routine calls in the loader overwrote
> its bounds and ended up writing over that lock, hence why the wait is
> hanging.
>
> Can you do an (gdb) info frame and try to figure out what the value of
> the futex is when it's blocking?
>
hi Ryan,
did you performed any further test to understand the problem on 440fp cores?
thanks,
--nico
--
acrux <acrux@cruxppc.org>
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2012-02-18 2:06 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-02 17:34 [PATCH] powerpc: 405/440/464/476 support and optimizations Luis Machado
2010-09-03 14:45 ` Ryan Arnold
2010-09-03 15:00 ` Luis Machado
2010-10-04 18:54 ` Luis Machado
2010-12-13 20:26 ` Ryan Arnold
2011-01-18 13:16 ` Ryan Arnold
2011-01-25 21:32 ` Joseph S. Myers
2012-01-18 20:31 ` acrux@cruxppc.org
2012-01-19 19:35 ` Carlos O'Donell
2012-01-20 14:24 ` acrux
2012-01-20 15:52 ` Ryan S. Arnold
2012-01-20 18:03 ` Carlos O'Donell
2012-01-23 0:41 ` acrux
2012-01-23 15:48 ` Ryan S. Arnold
2012-01-24 16:47 ` acrux
2012-01-24 17:20 ` Ryan S. Arnold
2012-01-24 17:41 ` acrux
2012-01-24 17:59 ` Ryan S. Arnold
2012-02-18 2:06 ` acrux
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).