From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (qmail 16486 invoked by alias); 2 Oct 2002 04:16:02 -0000
Mailing-List: contact gcc-prs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-prs-owner@gcc.gnu.org
Received: (qmail 16436 invoked by uid 71); 2 Oct 2002 04:16:01 -0000
Date: Tue, 01 Oct 2002 21:16:00 -0000
Message-ID: <20021002041601.16411.qmail@sources.redhat.com>
To: nobody@gcc.gnu.org
Cc: gcc-prs@gcc.gnu.org
From: "Aaron Williams"
Subject: Re: target/8004: All C++ binaries crash in __register_frame_info_bases on Sparc Solaris 2.7
Reply-To: "Aaron Williams"
X-SW-Source: 2002-10/txt/msg00032.txt.bz2

The following reply was made to PR target/8004; it has been noted by GNATS.

From: "Aaron Williams"
To: davem@gcc.gnu.org, gcc-bugs@gcc.gnu.org, gcc-prs@gcc.gnu.org, nobody@gcc.gnu.org, gcc-gnats@gcc.gnu.org
Subject: Re: target/8004: All C++ binaries crash in __register_frame_info_bases on Sparc Solaris 2.7
Date: Tue, 01 Oct 2002 21:13:03 -0700

I should have all the required patches installed. I have Sun's patch
cluster as of 9/11 installed. I believe this may be due to a bug in
ld.so. I am attaching a copy of an email I received from someone else
who appears to have the same problem.

My current workaround is to use Sun's /usr/ccs/bin/ld instead of the
one from binutils 2.13.

I am having other stability problems with gcc 3.2 on Solaris and will
likely go back to 2.95.3. Konqueror in KDE 3.0.3 and qt-3.0.5 compiled
with gcc 3.2 is unstable, for example.

I was hoping 3.2 would fix a problem where I see static destructors
being called in a shared library when the shared library is no longer
present (causing a crash in the exit handler). This too, unfortunately,
sounds like it might be a Solaris bug.

-Aaron

Email follows:

Dear Aaron Williams,

>> After searching the web regarding a problem I am having with GCC 3.2 on
>> Solaris I came across your bug report at:
>>
>> http://www.geocrawler.com/lists/3/GNU/361/0/9566991/
>>
>> I am experiencing exactly the same problem but with Solaris 2.7. I was
>> wondering if you were successful in resolving this problem and if so how you
>> did it?
>
> One of my colleagues, Christian Ehrhardt, analyzed this problem further, and he believes that it is a bug in ld.so.1. Here is his report:

>> The dynamic runtime linker fails to relocate valid shared libraries
>> generated by recent versions of GNU-ld. /usr/local/bin/ld is from
>> the GNU binutils-2.13 package:
>>
>> turing$ /usr/local/bin/ld -v
>> GNU ld version 2.13
>>
>> How to reproduce:
>>
>> Script started on Fri Sep 20 19:46:43 2002
>> turing$ cat t2.c
>> struct object {
>>     int i;
>>     int j;
>>     int k;
>>     int l;
>> };
>>
>>
>>
>> int func ()
>> {
>>     static struct object x;
>>     struct object * p;
>>     p = &x;
>>     p->i = 3;
>>     return 0;
>> }
>>
>> turing$ cat t3.c
>> extern int func();
>>
>> int main ()
>> {
>>     func();
>>     return 0;
>> }
>> turing$ cat Makefile.sun
>> .PHONY: clean
>> all: a.out
>> t2.o: t2.c
>>     CC -c -KPIC t2.c
>> libt2.so: t2.o
>>     /usr/local/bin/ld -G t2.o -olibt2.so
>> t3.o: t3.c
>>     CC -c t3.c
>> a.out: libt2.so t3.o
>>     CC -lt2 t3.o -L. -R.
>> clean:
>>     rm -f *.so *.o a.out
>>
>> turing$ cat Makefile
>> .PHONY: clean
>> all: a.out
>> t2.o: t2.c
>>     gcc -c -fPIC t2.c
>> libt2.so: t2.o
>>     /usr/local/bin/ld -nostdlib -shared -olibt2.so t2.o
>> a.out: libt2.so t3.c
>>     gcc -nostdlib t3.c libt2.so -L. -R.
>> clean:
>>     rm -f *.so *.o a.out core
>>
>> turing$ make -f Makefile.sun clean
>> rm -f *.so *.o a.out
>> turing$ make -f Makefile.sun
>> CC -c -KPIC t2.c
>> /usr/local/bin/ld -G t2.o -olibt2.so
>> CC -c t3.c
>> CC -lt2 t3.o -L. -R.
>> turing$ a.out
>> Segmentation Fault (core dumped)
>> turing$ exit
>>
>> script done on Fri Sep 20 19:47:32 2002
>>
>> Note that I compiled everything with /opt/SUNWspro/bin/CC to
>> rule out bugs in gcc. This problem can be reproduced using
>> the second Makefile and gcc with an even smaller executable.
>>
>>
>> Analyzing the core shows the following:
>> turing$ pmap core | grep libt2.so
>> FF370000      8K read/exec              libt2.so
>> FF380000      8K read/write/exec        libt2.so
>>
>> Script started on Fri Sep 20 19:53:10 2002
>> turing$ gdb a.out core
>> GNU gdb 5.0
>> [ ... ]
>> #0  0xff370318 in __1cEfunc6F_i_ ()
>>    from /home/thales/ehrhardt/ld.so.1-bug/./libt2.so
>> (gdb) disass
>> Dump of assembler code for function __1cEfunc6F_i_:
>> 0xff3702e0 <__1cEfunc6F_i_>:     save  %sp, -112, %sp
>> 0xff3702e4 <__1cEfunc6F_i_+4>:   call  0xff3702ec <__1cEfunc6F_i_+12>
>> 0xff3702e8 <__1cEfunc6F_i_+8>:   sethi  %hi(0), %o1
>> 0xff3702ec <__1cEfunc6F_i_+12>:  mov  %o1, %o1        ! 0x0
>> 0xff3702f0 <__1cEfunc6F_i_+16>:  add  %o7, %o1, %o1
>> 0xff3702f4 <__1cEfunc6F_i_+20>:  st  %o1, [ %fp + -12 ]
>> 0xff3702f8 <__1cEfunc6F_i_+24>:  sethi  %hi(0x10000), %o0
>> 0xff3702fc <__1cEfunc6F_i_+28>:  or  %o0, 0xc4, %o0   ! 0x100c4
>> 0xff370300 <__1cEfunc6F_i_+32>:  add  %o1, %o0, %l7
>> 0xff370304 <__1cEfunc6F_i_+36>:  sethi  %hi(0), %g1
>> 0xff370308 <__1cEfunc6F_i_+40>:  or  %g1, 4, %g1      ! 0x4
>> 0xff37030c <__1cEfunc6F_i_+44>:  ld  [ %l7 + %g1 ], %o0
>> 0xff370310 <__1cEfunc6F_i_+48>:  st  %o0, [ %fp + -8 ]
>> 0xff370314 <__1cEfunc6F_i_+52>:  mov  3, %o1
>> 0xff370318 <__1cEfunc6F_i_+56>:  st  %o1, [ %o0 ]
>> 0xff37031c <__1cEfunc6F_i_+60>:  clr  [ %fp + -4 ]
>> 0xff370320 <__1cEfunc6F_i_+64>:  mov  %g0, %i0
>> 0xff370324 <__1cEfunc6F_i_+68>:  ret
>> 0xff370328 <__1cEfunc6F_i_+72>:  restore
>> 0xff37032c <__1cEfunc6F_i_+76>:  mov  %g0, %i0
>> 0xff370330 <__1cEfunc6F_i_+80>:  ret
>> 0xff370334 <__1cEfunc6F_i_+84>:  restore
>> ---Type <return> to continue, or q <return> to quit---
>> End of assembler dump.
>> (gdb) bt
>> #0  0xff370318 in __1cEfunc6F_i_ ()
>>    from /home/thales/ehrhardt/ld.so.1-bug/./libt2.so
>> #1  0x10884 in main ()
>> (gdb) info reg o0
>> o0             0xff370000       -13172736
>> (gdb) info reg o1
>> o1             0x3      3
>> (gdb) info reg l7
>> l7             0xff3803a8       -13106264
>> (gdb) info reg g1
>> g1             0x4      4
>> (gdb) turing$ exit
>>
>> script done on Fri Sep 20 19:54:46 2002
>>
>> Looking back at function func from t2.c shows:
>> int func ()
>> {
>>     static struct object x;
>>     struct object * p;
>>     p = &x;
>>     p->i = 3;       <====== crash is here.
>>     return 0;
>> }
>>
>> The value of the pointer p is obviously in register o0, i.e. it is
>> 0xff370000. This is precisely the BASE address where the shared library
>> libt2.so has been mapped to. Register l7 contains the base address of
>> the .got section (the global offset table of this library). The
>> questionable address is loaded from offset 4 into the global offset table.
>>
>> Looking at the contents of the global offset table in the shared
>> library shows the following:
>>
>> turing$ elfdump -G libt2.so
>>
>> Global Offset Table: 2 entries
>>   ndx     addr      value    reloc              addend   symbol
>> [00000]  000103a8  00010338  R_SPARC_NONE       00000000
>> [00001]  000103ac  000103b0  R_SPARC_RELATIVE   00000000
>> turing$
>>
>> Note that we have indeed
>> %l7(0xff3803a8) = Offset of .got(0x000103a8) + library base address(0xFF370000)
>>
>> The Solaris Linker and Libraries Guide (freshly downloaded from
>> docs.sun.com) has this explanation about R_SPARC_RELATIVE:
>>
>> |Some relocation types have semantics beyond simple calculation:
>> |[ ... ]
>> |R_SPARC_RELATIVE
>> |    Created by the link-editor for dynamic objects. Its offset member
>> |    gives the location within a shared object that contains a value
>> |    representing a relative address. The runtime linker computes the
>> |    corresponding virtual address by adding the virtual address at which
>> |    the shared object is loaded to the relative address.
>> |    Relocation entries for this type must specify 0 for the symbol table index.
>>
>> This means that the value at offset 0x4 in the global offset
>> table should be
>>     library base address + value in .got
>>     0xFF370000           + 0x000103B0    = 0xFF3803B0
>> after relocation. However, looking at the value of register o0 we
>> see that the .got section obviously contains the value 0xFF370000
>> instead.
>>
>> Checking the source code of the /usr/lib/ld.so.1 from Solaris 7 (the
>> latest that we currently have access to) I found the following
>> concerning R_SPARC_RELATIVE relocations.
>>
>> os_net/src_ws/usr/src/cmd/sgs/rtld/sparc/sparc_elf.c function elf_reloc:
>> | if ((rtype == R_SPARC_RELATIVE) &&
>> |     !(FLAGS(lmp) & FLG_RT_FIXED) && !dbg_mask) {
>> |     if (relacount) {
>> |         relbgn = elf_reloc_relacount(relbgn, relacount,
>> |             relsiz, basebgn);
>> |
>> |         relacount = 0;
>> |     } else
>> |         relbgn = elf_reloc_relative(relbgn, relend,
>> |             relsiz, basebgn, etext, emap);
>> |     if (relbgn >= relend)
>> |         break;
>> |     rtype = ELF_R_TYPE(((Rel *)relbgn)->r_info);
>> | }
>>
>> i.e. there are two functions that may be called to perform an
>> R_SPARC_RELATIVE relocation, elf_reloc_relacount or elf_reloc_relative.
>>
>> However, these functions do fundamentally different things to resolve
>> these relocations:
>>
>> elf_reloc_relative (in file common_sparc.c) does the following:
>>
>> | /*
>> |  * Perform the actual relocation.
>> |  */
>> | *((ulong_t *) roffset) +=
>> |     basebgn + (long)(((Rel *)relbgn)->r_addend);
>>
>> whereas elf_reloc_relacount (in file common_sparc.c) does this:
>>
>> | /*
>> |  * Perform the actual relocation.
>> |  */
>> | *((ulong_t *) roffset) =
>> |     basebgn + (long)(((Rel *)relbgn)->r_addend);
>>
>> Note the assignment (``='') instead of the addition (``+='').
>> I highly suspect that changing this will fix the problem.
>
> Regards, Andreas Borchert.

--
Andreas Borchert, Universitaet Ulm, SAI, Helmholtzstr.
18, 89069 Ulm, Germany
E-Mail: borchert@mathematik.uni-ulm.de
WWW: http://www.mathematik.uni-ulm.de/sai/borchert/
PGP: http://www.mathematik.uni-ulm.de/sai/borchert/pgp.html

davem@gcc.gnu.org wrote:

>Synopsis: All C++ binaries crash in __register_frame_info_bases on Sparc Solaris 2.7
>
>State-Changed-From-To: open->feedback
>State-Changed-By: davem
>State-Changed-When: Tue Oct 1 20:59:26 2002
>State-Changed-Why:
>    Do you have all the fixes installed which are mentioned
>    in:
>
>    http://gcc.gnu.org/install/specific.html#sparc-sun-solaris2.7
>
>    These are necessary to get gcc working on 2.7
>
>http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=8004
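
The relocation arithmetic in Christian Ehrhardt's report can be checked with a small C sketch. It is only an illustration: the helper names relocate_add, relocate_assign and print_relocation_example are hypothetical and do not come from the Solaris sources, and 32-bit addresses are assumed. The constants are the ones quoted in the report (library base from pmap, GOT entry value and addend from elfdump).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Values quoted in the report. */
#define LIB_BASE  0xFF370000u /* load address of libt2.so (pmap) */
#define GOT_VALUE 0x000103B0u /* link-time contents of GOT entry 1 (elfdump) */
#define R_ADDEND  0x00000000u /* addend of the R_SPARC_RELATIVE entry */

/* Hypothetical helper mirroring the "+=" form shown for elf_reloc_relative:
 * the value already stored at the relocated location is kept. */
static uint32_t relocate_add(uint32_t stored, uint32_t base, uint32_t addend)
{
    return stored + base + addend;
}

/* Hypothetical helper mirroring the "=" form shown for elf_reloc_relacount:
 * the stored value is overwritten, so only base + addend survives. */
static uint32_t relocate_assign(uint32_t stored, uint32_t base, uint32_t addend)
{
    (void)stored; /* ignored on purpose: this is the suspected problem */
    return base + addend;
}

/* Print both results and check them against the figures in the report. */
static void print_relocation_example(void)
{
    uint32_t added = relocate_add(GOT_VALUE, LIB_BASE, R_ADDEND);
    uint32_t assigned = relocate_assign(GOT_VALUE, LIB_BASE, R_ADDEND);

    assert(added == 0xFF3803B0u);    /* the address the report expects */
    assert(assigned == 0xFF370000u); /* the value seen in register o0 */

    printf("+= form: 0x%08X\n", (unsigned)added);
    printf("=  form: 0x%08X\n", (unsigned)assigned);
}
```

With a zero addend, as GNU ld 2.13 emitted here, the "=" form yields the bare library base address, which matches the faulting pointer 0xff370000 seen in gdb.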