public inbox for binutils@sourceware.org
* ld unpredictable lookup failure building shared library
@ 2012-04-21  0:47 James K. Lowden
  2012-04-21  1:04 ` Ian Lance Taylor
  0 siblings, 1 reply; 9+ messages in thread
From: James K. Lowden @ 2012-04-21  0:47 UTC (permalink / raw)
  To: binutils

[-- Attachment #1: Type: text/plain, Size: 3082 bytes --]

I'm attempting to build clang on x86_64.  ld fails to look up a symbol
from std::string, recommending -fPIC, but file(1) reports the object
file is relocatable.  Upgrading to 2.22 has no effect.  Am I missing an
option, or have I found a bug?  It's not a common problem,
but I'm not the only one to see it.

$ /usr/pkg/bin/gnu-ld --version | grep ^GNU
GNU ld (GNU Binutils)2.22

$ /usr/pkg/bin/gnu-ld @linker.options
/usr/pkg/bin/gnu-ld: /usr/pkgsrc/wip/clang/work/llvm/Release/lib/libLLVMCodeGen.a
(ShrinkWrapping.o): 
relocation R_X86_64_PC32 against undefined symbol
	`_ZNSs4_Rep10_M_disposeERKSaIcE' 
can not be used when making a shared object; recompile with -fPIC 
/usr/pkg/bin/gnu-ld: final link failed: Bad value

The "linker.options" file is attached.  

Both the reported file and the GNU library that houses that symbol are
relocatable:

$ find ../../ -name ShrinkWrapping.o | xargs file
../../lib/CodeGen/Release/ShrinkWrapping.o: ELF 64-bit LSB relocatable,
x86-64, version 1 (SYSV), not stripped

The symbol is not defined in clang.  It's part of standard string
(wrapped for your viewing pleasure)

$ c++filt _ZNSs4_Rep10_M_disposeERKSaIcE
std::basic_string<char, 
		std::char_traits<char>, 
		std::allocator<char> >::_Rep::
	_M_dispose(std::allocator<char> const&)

nm(1) reports it's defined as a "weak  symbol that has not been
specifically tagged as a weak object symbol" (that's what the "W"
means):

$ find /usr/lib -name \*.a | xargs nm -o \
	| grep _ZNSs4_Rep10_M_disposeERKSaIcE \
	| grep -v i386 

/usr/lib/libstdc++_p.a:string-inst.po:0000000000000000 
	W _ZNSs4_Rep10_M_disposeERKSaIcE 
/usr/lib/libstdc++.a:string-inst.o:0000000000000000 
	W _ZNSs4_Rep10_M_disposeERKSaIcE 
/usr/lib/libstdc++_pic.a:string-inst.so:0000000000000000 
	W _ZNSs4_Rep10_M_disposeERKSaIcE

That object file, string-inst.o, is relocatable:  

$ ar x /usr/lib/libstdc++.a string-inst.o && file string-inst.o 
string-inst.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV),
not stripped  

$ nm string-inst.o | grep _ZNSs4_Rep10_M_disposeERKSaIcE
0000000000000000 W _ZNSs4_Rep10_M_disposeERKSaIcE

I don't really understand weak symbols, but I'm prepared to believe
this symbol, because it's based on a template, isn't really in the
library.  But if not, the compiler should have generated it, and it's
not defined among the .o files.  At least, nm never gives it a "T".  

It doesn't always happen.  The other clang file that uses that symbol,
GCOVProfiling.o, is successfully incorporated into a library,
libLLVMInstrumentation.a:

$ find ../.. -name \*.a 
	| while read F; do \
		ar t $F | grep GCOVProfiling $F && echo $F; \
	  done 
ar: ../../test/Archive/MacOSX.a: Malformed archive
Binary file ../../Release/lib/libLLVMInstrumentation.a matches
../../Release/lib/libLLVMInstrumentation.a
$ ar t ../../Release/lib/libLLVMInstrumentation.a
AddressSanitizer.o
EdgeProfiling.o
FunctionBlackList.o
GCOVProfiling.o
Instrumentation.o
OptimalEdgeProfiling.o
PathProfiling.o
ProfilingUtils.o
ThreadSanitizer.o

Thanks for any help.  I'm happy to try suggestions.  

--jkl

[-- Attachment #2: linker.options --]
[-- Type: application/octet-stream, Size: 1019 bytes --]

	-R '$ORIGIN' 
	-L/usr/pkgsrc/wip/clang/work/llvm/Release/lib 
	-L/usr/pkgsrc/wip/clang/work/llvm/Release/lib 
	-L/usr/lib 
	-R/usr/lib 
	-R/usr/pkg/lib 
	-shared 
	-o /usr/pkgsrc/wip/clang/work/llvm/Release/lib/libLTO.so 
	   /usr/pkgsrc/wip/clang/work/llvm/tools/lto/Release/LTOCodeGenerator.o 
	   /usr/pkgsrc/wip/clang/work/llvm/tools/lto/Release/LTOModule.o 
	   /usr/pkgsrc/wip/clang/work/llvm/tools/lto/Release/lto.o 
	-lLLVMMCDisassembler 
	-lLLVMBitWriter 
	-lLLVMLinker 
	-lLLVMArchive 
	-lLLVMBitReader 
	-lLLVMipo 
	-lLLVMVectorize 
	-lLLVMX86AsmParser 
	-lLLVMX86Disassembler 
	-lLLVMX86CodeGen 
	-lLLVMSelectionDAG 
	-lLLVMAsmPrinter 
	-lLLVMMCParser 
	-lLLVMCodeGen 
	-lLLVMScalarOpts 
	-lLLVMInstCombine 
	-lLLVMTransformUtils 
	-lLLVMipa 
	-lLLVMAnalysis 
	-lLLVMX86Desc 
	-lLLVMX86Info 
	-lLLVMTarget 
	-lLLVMX86AsmPrinter 
	-lLLVMMC 
	-lLLVMObject 
	-lLLVMX86Utils 
	-lLLVMCore 
	-lLLVMSupport 
	-lpthread 
	-lm 
	--version-script /usr/pkgsrc/wip/clang/work/llvm/tools/lto/Release/lto.exports.map 
	


* Re: ld unpredictable lookup failure building shared library
  2012-04-21  0:47 ld unpredictable lookup failure building shared library James K. Lowden
@ 2012-04-21  1:04 ` Ian Lance Taylor
  2012-04-21 18:30   ` James K. Lowden
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Lance Taylor @ 2012-04-21  1:04 UTC (permalink / raw)
  To: James K. Lowden; +Cc: binutils

"James K. Lowden" <jklowden@schemamania.org> writes:

> I'm attempting to build clang on x86_64.  ld fails to look up a symbol
> from std::string, recommending -fPIC, but file(1) reports the object
> file is relocatable.

Relocatable is not the same as -fPIC.  Relocatable just means relocatable
at link time.  The -fPIC option constructs an object file that is
relocatable at runtime.  Have you tried actually using the -fPIC option
when compiling the file?

Ian


* Re: ld unpredictable lookup failure building shared library
  2012-04-21  1:04 ` Ian Lance Taylor
@ 2012-04-21 18:30   ` James K. Lowden
  2012-04-21 19:01     ` Ian Lance Taylor
  0 siblings, 1 reply; 9+ messages in thread
From: James K. Lowden @ 2012-04-21 18:30 UTC (permalink / raw)
  To: binutils

On Fri, 20 Apr 2012 17:54:39 -0700
Ian Lance Taylor <iant@google.com> wrote:

> "James K. Lowden" <jklowden@schemamania.org> writes:
> 
> > I'm attempting to build clang on x86_64.  ld fails to look up a
> > symbol from std::string, recommending -fPIC, but file(1) reports
> > the object file is relocatable.
> 
> Relocatable is not the same as -fPIC.  Relocatable just means
> relocatable at link time.  The -fPIC option constructs an object file
> that is relocatable at runtime.  

Good to know, thanks.  How to verify a .o file was compiled with
-fPIC?  The linker knows, but readelf and objdump don't seem to say.
For a .so, perhaps that's what "DYNAMIC" means?  

$ objdump -x /usr/lib/libstdc++.so | sed '/DYN/q'

/usr/lib/libstdc++.so:     file format elf64-x86-64
/usr/lib/libstdc++.so
architecture: i386:x86-64, flags 0x00000150:
HAS_SYMS, DYNAMIC, D_PAGED

> Have you tried actually using the -fPIC option when compiling the
> file?

Yes.  The clang makefiles use -fPIC for everything afaict.  I have
verified -fPIC appears on the command-line for the clang object file
under discussion, ShrinkWrapping.o.  The command is:

c++ \
-I/[src]/llvm/include \
-I/[src]/llvm/lib/CodeGen \
-DNDEBUG \
-D_GNU_SOURCE \
-D__STDC_CONSTANT_MACROS \
-D__STDC_FORMAT_MACROS \
-D__STDC_LIMIT_MACROS \
-O1 \
-fomit-frame-pointer \
-fvisibility-inlines-hidden \
-fno-exceptions \
-fno-rtti \
-fPIC \
-Woverloaded-virtual \
-Wcast-qual \
-pedantic \
-Wno-long-long \
-Wall \
-W \
-Wno-unused-parameter \
-Wwrite-strings \
-c \
-MMD \
-MP \
-MF "/[src]/llvm/lib/CodeGen/Release/ShrinkWrapping.d.tmp" \
-MT "/[src]/llvm/lib/CodeGen/Release/ShrinkWrapping.o" \
-MT "/[src]/llvm/lib/CodeGen/Release/ShrinkWrapping.d" \
ShrinkWrapping.cpp \
-o /[src]/llvm/lib/CodeGen/Release/ShrinkWrapping.o 

I haven't researched how string-inst.o in libstdc++ was compiled, but
another clang reference to the same std::string symbol links fine.  (It
creates a different clang .so.)

I satisfied myself where the symbol is defined, btw:  

$ nm -o libstdc++*[ao] \
	| grep _ZNSs4_Rep10_M_disposeERKSaIcE \
	| grep -v i386 
libstdc++.a:string-inst.o:0000000000000000 W
	_ZNSs4_Rep10_M_disposeERKSaIcE 
libstdc++.so:000000000006f830 W	<== here
	_ZNSs4_Rep10_M_disposeERKSaIcE 
libstdc++_p.a:string-inst.po:0000000000000000 W 
	_ZNSs4_Rep10_M_disposeERKSaIcE
libstdc++_pic.a:string-inst.so:0000000000000000 W
	_ZNSs4_Rep10_M_disposeERKSaIcE

Thanks for your interest.  I'm happy to provide any further information
needed to get to the bottom of this.  

--jkl


* Re: ld unpredictable lookup failure building shared library
  2012-04-21 18:30   ` James K. Lowden
@ 2012-04-21 19:01     ` Ian Lance Taylor
  2012-04-22  2:17       ` James K. Lowden
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Lance Taylor @ 2012-04-21 19:01 UTC (permalink / raw)
  To: James K. Lowden; +Cc: binutils

"James K. Lowden" <jklowden@schemamania.org> writes:

> Good to know, thanks.  How to verify a .o file was compiled with
> -fPIC?

There is no simple way to tell, unfortunately.  It's a matter of the
relocations.  Some relocations can occur in a -fPIC object, some can
not.

> For a .so, perhaps that's what "DYNAMIC" means?  

No.


From your original message, the linker said this:

> $ /usr/pkg/bin/gnu-ld @linker.options
> /usr/pkg/bin/gnu-ld: /usr/pkgsrc/wip/clang/work/llvm/Release/lib/libLLVMCodeGen.a (ShrinkWrapping.o):
> relocation R_X86_64_PC32 against undefined symbol
>        `_ZNSs4_Rep10_M_disposeERKSaIcE'
> can not be used when making a shared object; recompile with -fPIC

I assume that this is the GNU linker and that you introduced the line
breaks.  The linker is quite correct: a R_X86_64_PC32 relocation against
an undefined symbol can not be used in a shared object.  Such a
relocation should never be generated by the compiler when using -fPIC.
So something is wrong.  The most likely problem is that the symbol is
not defined in any object file included in the link.  I know that you
showed that another instance of libstdc++.so has a weak definition for
the symbol, but you should see whether the symbol is defined in the
specific link that is causing this error.

Ian


* Re: ld unpredictable lookup failure building shared library
  2012-04-21 19:01     ` Ian Lance Taylor
@ 2012-04-22  2:17       ` James K. Lowden
  2012-04-22 18:33         ` Ian Lance Taylor
  0 siblings, 1 reply; 9+ messages in thread
From: James K. Lowden @ 2012-04-22  2:17 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: binutils

On Sat, 21 Apr 2012 11:29:37 -0700
Ian Lance Taylor <iant@google.com> wrote:
> "James K. Lowden" <jklowden@schemamania.org> writes:
> 
> > How to verify a .o file was compiled with -fPIC?
> 
> There is no simple way to tell, unfortunately.  It's a matter of the
> relocations.  Some relocations can occur in a -fPIC object, some can
> not.

Thank you.  At least I understand the limit of what can be shown here.  

> From your original message, the linker said this:
> 
> > $ /usr/pkg/bin/gnu-ld @linker.options
> > /usr/pkg/bin/gnu-ld: /usr/pkgsrc/wip/clang/work/llvm/Release/lib/libLLVMCodeGen.a
> > (ShrinkWrapping.o): relocation R_X86_64_PC32 against undefined
> > symbol `_ZNSs4_Rep10_M_disposeERKSaIcE'
> > can not be used when making a shared object; recompile with -fPIC
> 
> I assume that this is the GNU linker and that you introduced the line
> breaks.  

Yes. 

> The linker is quite correct: a R_X86_64_PC32 relocation
> against an undefined symbol can not be used in a shared object.  Such
> a relocation should never be generated by the compiler when using
> -fPIC. 

I appreciate your help, Ian.  I hope to find a way to get you the
information you need to isolate the cause of the problem. 

This is not the only R_X86_64_PC32 in the file; it's just the only one
the linker doesn't like:

$ readelf -a ../../lib/CodeGen/Release/ShrinkWrapping.o \
	| sed -Ene '/R_X86_64_PC32/p' | wc -l
     762

Regarding this particular symbol, though, objdump finds it and readelf
does not:

$ objdump -x ../../lib/CodeGen/Release/ShrinkWrapping.o \
	| grep _ZNSs4_Rep10_M_disposeERKSaIcE
0000000000000000         *UND* 0000000000000000
	_ZNSs4_Rep10_M_disposeERKSaIcE 
000000000000004e R_X86_64_PC32
	_ZNSs4_Rep10_M_disposeERKSaIcE+0xfffffffffffffffc

$ readelf -rw  ../../lib/CodeGen/Release/ShrinkWrapping.o \
	| grep -c '_ZNSs4_Rep10_M_disposeERKSaIcE' 
0

I suspect that's evidence of an error on the part of the compiler.  

I found a message from you from a year ago regarding -fPIC asking about
PLT and GOT:

$ readelf -r ../../lib/CodeGen/Release/ShrinkWrapping.o \
	| awk '/PLT|GOT/ {print $3}' | sort | uniq -c 
104 R_X86_64_GOTPCREL
 259 R_X86_64_PLT32

> So something is wrong.  The most likely problem is that the
> symbol is not defined in any object file included in the link.  

That statement confuses me.  Either the compiler emitted incorrect code
("should never be generated") or the code is OK but the linker line is
wrong ("not defined in any object file included in the link").  Unless
the compiler was supposed to generate code but mistakenly opted to punt
over to the linker by emitting R_X86_64_PC32 instead?  Perhaps because
it's a template we're talking about?  

We know where the symbol is defined i.e., where the object code is that
we want the linker to find

$ nm /usr/lib/libstdc++.so | grep _ZNSs4_Rep10_M_disposeERKSaIcE
000000000006f830 W _ZNSs4_Rep10_M_disposeERKSaIcE

and we know where it's referenced, because that's in the linker message.
What do you mean by "not defined in any object file"?  Does the library
not count as "an object file"?  Does the symbol in ShrinkWrapping.o not
count as a "definition"?   (I am not in the least trying to be
flippant.  I hope that's clear.)  

> I know that you showed that another instance of libstdc++.so has a
> weak definition for the symbol, but you should see whether the symbol
> is defined in the specific link that is causing this error.

I feel like I'm half a step behind you.  To me, libstdc++.so doesn't
have an "instance".  It's a file on the path and the linker uses it.
What is this instance of which you speak?  

I also don't understand "symbol is defined in the specific link".  I
guess you mean "among the things mentioned on the linker command line"
but I'm afraid I've used all the tools I know.  I scanned the tree
of .o files in the project directory for any mention of that symbol.  I
found three, and they all look like the one in ShrinkWrapping.o (as far
as nm shows).  

Thanks again for your time and interest.  I'm crossing my fingers,
hoping there's enough floundering around and guesses here to help you
lead me in the right direction.  

--jkl


* Re: ld unpredictable lookup failure building shared library
  2012-04-22  2:17       ` James K. Lowden
@ 2012-04-22 18:33         ` Ian Lance Taylor
  2012-04-22 19:21           ` James K. Lowden
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Lance Taylor @ 2012-04-22 18:33 UTC (permalink / raw)
  To: James K. Lowden; +Cc: binutils

"James K. Lowden" <jklowden@schemamania.org> writes:

> On Sat, 21 Apr 2012 11:29:37 -0700
> Ian Lance Taylor <iant@google.com> wrote:
>> "James K. Lowden" <jklowden@schemamania.org> writes:
>> 
>> > How to verify a .o file was compiled with -fPIC?
>> 
>> There is no simple way to tell, unfortunately.  It's a matter of the
>> relocations.  Some relocations can occur in a -fPIC object, some can
>> not.
>
> Thank you.  At least I understand the limit of what can be shown here.  
>
>> From your original message, the linker said this:
>> 
>> > $ /usr/pkg/bin/gnu-ld @linker.options
>> > /usr/pkg/bin/gnu-ld: /usr/pkgsrc/wip/clang/work/llvm/Release/lib/libLLVMCodeGen.a
>> > (ShrinkWrapping.o): relocation R_X86_64_PC32 against undefined
>> > symbol `_ZNSs4_Rep10_M_disposeERKSaIcE'
>> > can not be used when making a shared object; recompile with -fPIC
>> 
>> I assume that this is the GNU linker and that you introduced the line
>> breaks.  
>
> Yes. 
>
>> The linker is quite correct: a R_X86_64_PC32 relocation
>> against an undefined symbol can not be used in a shared object.  Such
>> a relocation should never be generated by the compiler when using
>> -fPIC. 
>
> I appreciate your help, Ian.  I hope to find a way to get you the
> information you need to isolate the cause of the problem. 
>
> This is not the only R_X86_64_PC32 in the file; it's just the only one
> the linker doesn't like:

As the error message says, it's a problem to have a R_X86_64_PC32
relocation against an undefined symbol.  It's not a problem to have such
a relocation against a defined symbol.


> Regarding this particular symbol, though, objdump finds it and readelf
> does not:
>
> $ objdump -x ../../lib/CodeGen/Release/ShrinkWrapping.o \
> 	| grep _ZNSs4_Rep10_M_disposeERKSaIcE
> 0000000000000000         *UND* 0000000000000000
> 	_ZNSs4_Rep10_M_disposeERKSaIcE 
> 000000000000004e R_X86_64_PC32
> 	_ZNSs4_Rep10_M_disposeERKSaIcE+0xfffffffffffffffc
>
> $ readelf -rw  ../../lib/CodeGen/Release/ShrinkWrapping.o \
> 	| grep -c '_ZNSs4_Rep10_M_disposeERKSaIcE' 
> 0

The readelf program is (in my opinion) broken in that you have to use
the --wide option to get reliable results.

>> So something is wrong.  The most likely problem is that the
>> symbol is not defined in any object file included in the link.  
>
> That statement confuses me.  Either the compiler emitted incorrect code
> ("should never be generated") or the code is OK but the linker line is
> wrong ("not defined in any object file included in the link").  Unless
> the compiler was supposed to generate code but mistakenly opted to punt
> over to the linker by emitting R_X86_64_PC32 instead?  Perhaps because
> it's a template we're talking about?  

The compiler generates single object files.  Having a R_X86_64_PC32
relocation in a single object file is neither right nor wrong.  When
using -fPIC it is normal to have a R_X86_64_PC32 relocation for a local
symbol or for a symbol with non-default visibility.  The linker sees all
the object files at the same time.  When the linker is generating a
shared library, there must be a definition for every symbol referenced by
an R_X86_64_PC32 relocation.  If there is no definition, the linker will
reject the program, because the dynamic linker will be unable to resolve
the relocation at runtime.  Some dynamic relocations are OK at runtime,
but R_X86_64_PC32 is not, because the symbol definition is likely to be
out of range from the reference.


> We know where the symbol is defined i.e., where the object code is that
> we want the linker to find
>
> $ nm /usr/lib/libstdc++.so | grep _ZNSs4_Rep10_M_disposeERKSaIcE
> 000000000006f830 W _ZNSs4_Rep10_M_disposeERKSaIcE
>
> and we know where it's referenced, because that's in the linker message.
> What do you mean by "not defined in any object file"?  Does the library
> not count as "an object file"?  Does the symbol in ShrinkWrapping.o not
> count as a "definition"?   (I am not in the least trying to be
> flippant.  I hope that's clear.)  

I don't know whether you are linking libstdc++.so itself or not.  If you
are, then for some reason in your link the symbol is not defined.  If
you are not, then the definition in libstdc++.so does not count.  For an
R_X86_64_PC32 relocation to be resolvable, the symbol must be defined in
one of the object files passed to the linker, not in a shared library
included in the link.
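
The rule above can be approximated mechanically for a single object
file.  The following Python sketch is only an illustration, not anything
from binutils: the helper names undefined_symbols and suspicious_relocs
are hypothetical, the sample text is modeled on the readelf -sW and
readelf -rW output for ShrinkWrapping.o with wrapped lines joined, and
the "hypothetical_callee" entries are invented to show a PLT32 case.  It
flags R_X86_64_PC32 relocations whose symbol is UND in that object's own
symbol table.

```python
# Sample `readelf -sW` text; entry 308 is invented for illustration.
SYMTAB = """\
Num:    Value          Size Type   Bind   Vis      Ndx Name
307: 0000000000000000     0 NOTYPE GLOBAL DEFAULT  UND _ZNSs4_Rep10_M_disposeERKSaIcE
308: 0000000000000000     0 NOTYPE GLOBAL DEFAULT  UND hypothetical_callee
"""

# Sample `readelf -rW` text; the PLT32 line is invented for illustration.
RELOCS = """\
000000000000004e  0000013300000002 R_X86_64_PC32   0000000000000000 _ZNSs4_Rep10_M_disposeERKSaIcE + fffffffffffffffc
0000000000000060  0000013400000004 R_X86_64_PLT32  0000000000000000 hypothetical_callee + fffffffffffffffc
"""


def undefined_symbols(symtab_text):
    """Return the names whose section index (Ndx column) is UND."""
    undefined = set()
    for line in symtab_text.splitlines():
        fields = line.split()
        # Columns: Num: Value Size Type Bind Vis Ndx Name
        if len(fields) == 8 and fields[0].endswith(":") and fields[6] == "UND":
            undefined.add(fields[7])
    return undefined


def suspicious_relocs(reloc_text, undefined):
    """Return (offset, symbol) for each PC32 relocation against an UND symbol."""
    hits = []
    for line in reloc_text.splitlines():
        fields = line.split()
        # Columns: Offset Info Type SymValue SymName [+ Addend]
        if (len(fields) >= 5 and fields[2] == "R_X86_64_PC32"
                and fields[4] in undefined):
            hits.append((int(fields[0], 16), fields[4]))
    return hits


for offset, name in suspicious_relocs(RELOCS, undefined_symbols(SYMTAB)):
    print(hex(offset), name)
```

A hit is only a hint.  The linker makes the real decision, because
another object file in the same link may still supply the definition,
and a reference with non-default visibility is a separate case.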

Hope that makes more sense.

Ian


* Re: ld unpredictable lookup failure building shared library
  2012-04-22 18:33         ` Ian Lance Taylor
@ 2012-04-22 19:21           ` James K. Lowden
  2012-04-23  1:58             ` Ian Lance Taylor
  0 siblings, 1 reply; 9+ messages in thread
From: James K. Lowden @ 2012-04-22 19:21 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: binutils

On Sat, 21 Apr 2012 22:39:34 -0700
Ian Lance Taylor <iant@google.com> wrote:
> "James K. Lowden" <jklowden@schemamania.org> writes:
>> > $ /usr/pkg/bin/gnu-ld @linker.options
>> > /usr/pkg/bin/gnu-ld: /usr/pkgsrc/wip/clang/work/llvm/Release/lib/libLLVMCodeGen.a
>> > (ShrinkWrapping.o): relocation R_X86_64_PC32 against undefined
>> > symbol `_ZNSs4_Rep10_M_disposeERKSaIcE'
>> > can not be used when making a shared object; recompile with -fPIC
> 
> As the error message says, it's a problem to have a R_X86_64_PC32
> relocation against an undefined symbol.  It's not a problem to have
> such a relocation against a defined symbol.

Hi Ian, 

I think I get it now.  In my own words:

1.  With -fPIC, the compiler may generate R_X86_64_PC32.  It is
"normal" to have an R_X86_64_PC32 against *defined* symbols, but not
strictly necessary.  

2.  Such defined symbols must  be local or have non-default
visibility.  (As a C programmer, I think that means such symbols have
no external linkage i.e. they are declared static.  But that
contradicts #1.)

3.  The object file in question was compiled with -fPIC and furthermore
contains artifacts of being compiled with -fPIC.  

4.  When the linker is generating a shared library, every R_X86_64_PC32
relocation (directive? instruction?) must refer to a symbol defined in
an object file passed to the linker, not in any shared library.  

5.  The linker reports finding a symbol with an R_X86_64_PC32
relocation for an undefined symbol.  That appears to be correct,
insofar as I have been unable to find a definition for that symbol
among the object files mentioned on the linker command line using
objdump and readelf.  

6.  Ergo, the build is in error.

If I could generate one tiny .o with a definition for
_ZNSs4_Rep10_M_disposeERKSaIcE, the link would work. But that should
not be necessary because the compiler should never have emitted a file
lacking that definition in the first place.  (Although the symbol can
be defined in another .o, the compiler has no right to assume there
will be another .o.  It's possible to build a .so using just one .cpp
file.)

Or the compiler is in error?  Much rests on the meaning of "normal"
above.  It is unclear to me why the compiler would sometimes not define
a symbol that cannot be drawn from a library.  

> > $ readelf -rw  ../../lib/CodeGen/Release/ShrinkWrapping.o \
> > 	| grep -c '_ZNSs4_Rep10_M_disposeERKSaIcE' 
> > 0
> 
> The readelf program is (in my opinion) broken in that you have to use
> the --wide option to get reliable results.

That's what -rw was meant to do but didn't.  Here it is corrected
(wrapped for email):

$ readelf -rW  ../../lib/CodeGen/Release/ShrinkWrapping.o \
	| grep '_ZNSs4_Rep10_M_disposeERKSaIcE' 
000000000000004e  0000013300000002
R_X86_64_PC32          0000000000000000 _ZNSs4_Rep10_M_disposeERKSaIcE
+ fffffffffffffffc

Apart from truncating symbols by default, the readelf program has
another serious flaw: the documentation is long on inputs and silent on
the meaning of the outputs.  

I'm guessing that the zeros for the "value" in column 3 indicate the
symbol appears at offset 0x4E but is not defined.  That of course
is consistent with the linker message.  

> I don't know whether you are linking libstdc++.so itself or not.  

No.  Just to clarify, the -o on the command line indicated a clang
shared object.  It references libstdc++ implicitly.  

> Hope that makes more sense.

Yes, I'm in the process of building gcc 4.7 now to see if that changes
anything.  I would like to understand whether or not the compiler can
legitimately choose not to define the symbol, and if so how it
decides.  That would also tell me (I hope) how to generate a tiny file
defining the symbol to satisfy the linker.  

> the symbol definition is likely to be out of range from the reference.

http://www.technovelty.org/code/c/amd64-pic.html

From that post I gather that R_X86_64_PC32 is 32-bits, permitting the
symbol to be defined at an address up to 2 GB distant from the
reference.  The linker can control this; it knows all the static
offsets and can bomb out if the whole .so cannot be constructed in that
space.  On a 64-bit architecture, no such guarantee can be made by the
runtime linker.  It could "try" but it would sometimes fail, and fail
catastrophically.  Discretion being the better part of valor, it
instead insists on larger (?) relocations that can assuredly be
resolved at runtime.  
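
To make the range argument concrete, here is a small Python sketch.  It
assumes the psABI calculation for R_X86_64_PC32, S + A - P (symbol
address plus addend minus the address of the relocated field), and
checks whether the result fits in a signed 32-bit field.  The function
names and all addresses are made up for illustration.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def pc32_value(symbol_addr, addend, place):
    """Compute S + A - P, the value stored for an R_X86_64_PC32 relocation."""
    return symbol_addr + addend - place


def fits_pc32(symbol_addr, addend, place):
    """True if the computed value fits in a signed 32-bit field."""
    return INT32_MIN <= pc32_value(symbol_addr, addend, place) <= INT32_MAX


# Reference and definition inside the same shared object (made-up offsets):
# the linker lays both out, so the distance is small and known at link time.
print(fits_pc32(0x1000, -4, 0x4e))

# Definition in a different shared object loaded far away at runtime
# (made-up load addresses): the distance exceeds 2 GB.
print(fits_pc32(0x7f8000000000 + 0x6f830, -4, 0x7f0000000000 + 0x4e))
```

The first check passes and the second fails, which is the situation the
static linker is refusing to leave for the runtime linker.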

I would be interested in your comments on my surmise.  I'm sure others
would, too.  

Thank you again for your help.  It has been quite an education.  

Regards, 

--jkl


* Re: ld unpredictable lookup failure building shared library
  2012-04-22 19:21           ` James K. Lowden
@ 2012-04-23  1:58             ` Ian Lance Taylor
  2012-04-23  6:54               ` James K. Lowden
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Lance Taylor @ 2012-04-23  1:58 UTC (permalink / raw)
  To: James K. Lowden; +Cc: binutils

"James K. Lowden" <jklowden@schemamania.org> writes:

> 1.  With -fPIC, the compiler may generate R_X86_64_PC32.  It is
> "normal" to have an R_X86_64_PC32 against *defined* symbols, but not
> strictly necessary.  
>
> 2.  Such defined symbols must  be local or have non-default
> visibility.  (As a C programmer, I think that means such symbols have
> no external linkage i.e. they are declared static.  But that
> contradicts #1.)
>
> 3.  The object file in question was compiled with -fPIC and furthermore
> contains artifacts of being compiled with -fPIC.  
>
> 4.  When the linker is generating a shared library, every R_X86_64_PC32
> relocation (directive? instruction?) must refer to a symbol defined in
> an object file passed to the linker, not in any shared library.  
>
> 5.  The linker reports finding a symbol with an R_X86_64_PC32
> relocation for an undefined symbol.  That appears to be correct,
> insofar as I have been unable to find a definition for that symbol
> among the object files mentioned on the linker command line using
> objdump and readelf.  
>
> 6.  Ergo, the build is in error.
>
> If I could generate one tiny .o with a definition for
> _ZNSs4_Rep10_M_disposeERKSaIcE, the link would work. But that should
> not be necessary because the compiler should never have emitted a file
> lacking that definition in the first place.  (Although the symbol can
> be defined in another .o, the compiler has no right to assume there
> will be another .o.  It's possible to build a .so using just one .cpp
> file.)
>
> Or the compiler is in error?  Much rests on the meaning of "normal"
> above.  It is unclear to me why the compiler would sometimes not define
> a symbol that cannot be drawn from a library.  

That all looks basically right.  I would say that the compiler is in
error here.  The compiler is certainly permitted to refer to a symbol
drawn from a library.  But when compiling with -fPIC, any such reference
must not use a R_X86_64_PC32 relocation (it should use a R_X86_64_PLT32
relocation instead).  So the error is that the compiler emitted a PC32
relocation when it should have emitted a PLT32 relocation.


>> > $ readelf -rw  ../../lib/CodeGen/Release/ShrinkWrapping.o \
>> > 	| grep -c '_ZNSs4_Rep10_M_disposeERKSaIcE' 
>> > 0
>> 
>> The readelf program is (in my opinion) broken in that you have to use
>> the --wide option to get reliable results.
>
> That's what -rw was meant to do but didn't.  Here it is corrected
> (wrapped for email):
>
> $ readelf -rW  ../../lib/CodeGen/Release/ShrinkWrapping.o \
> 	| grep '_ZNSs4_Rep10_M_disposeERKSaIcE' 
> 000000000000004e  0000013300000002
> R_X86_64_PC32          0000000000000000 _ZNSs4_Rep10_M_disposeERKSaIcE
> + fffffffffffffffc
>
> Apart from truncating symbols by default, the readelf program has
> another serious flaw: the documentation is long on inputs and silent on
> the meaning of the outputs.  
>
> I'm guessing that the zeros for the "value" in column 3 indicate the
> symbol appears at offset 0x4E but is not defined.  That of course
> is consistent with the linker message.  

That's not quite right.  The 0x4e is the address of the relocation.  The
readelf -r option just dumps the relocation information, and the
relocation refers to the symbol table.  To see the entry in the symbol
table, you need to use the readelf -s option.

Ian


* Re: ld unpredictable lookup failure building shared library
  2012-04-23  1:58             ` Ian Lance Taylor
@ 2012-04-23  6:54               ` James K. Lowden
  0 siblings, 0 replies; 9+ messages in thread
From: James K. Lowden @ 2012-04-23  6:54 UTC (permalink / raw)
  To: Ian Lance Taylor; +Cc: binutils

On Sun, 22 Apr 2012 12:21:08 -0700
Ian Lance Taylor <iant@google.com> wrote:
> "James K. Lowden" <jklowden@schemamania.org> writes:
> >
> > It is unclear to me why the compiler would sometimes not
> > define a symbol that cannot be drawn from a library.  
> 
> I would say that the compiler is in
> error here.  The compiler is certainly permitted to refer to a symbol
> drawn from a library.  But when compiling with -fPIC, any such
> reference must not use a R_X86_64_PC32 relocation (it should use a
> R_X86_64_PLT32 relocation instead).  So the error is that the compiler
> emitted a PC32 relocation when it should have emitted a PLT32
> relocation.

Excellent, thanks, that's clear.  I am now hopeful that gcc 4.7 will
not exhibit the same behavior.  

I at last understand the domain from which the term R_X86_64_PC32 is
taken (elf) and found what appears to be the authoritative
documentation:  

http://refspecs.linuxbase.org/elf/
http://refspecs.linuxbase.org/elf/x86_64-abi-0.95.pdf (cf. table 4.10)

I remember when "medium model" meant two 64 KB segments, one for code
and one for data.  Plus ça change.  

> > $ readelf -rW  ../../lib/CodeGen/Release/ShrinkWrapping.o \
> > 	| grep '_ZNSs4_Rep10_M_disposeERKSaIcE' 
> > 000000000000004e  0000013300000002
> > R_X86_64_PC32          0000000000000000
> > _ZNSs4_Rep10_M_disposeERKSaIcE
> > + fffffffffffffffc
> 
> The 0x4e is the address of the relocation.
> The readelf -r option just dumps the relocation information, and the
> relocation refers to the symbol table.  To see the entry in the symbol
> table, you need to use the readelf -s option.

For the record, readelf reports the symbol as zero:

$ readelf -sW  ../../lib/CodeGen/Release/ShrinkWrapping.o \
	| sed -Ene '/Num:|_ZNSs4_Rep10_M_disposeERKSaIcE/p' 
Num:    Value          Size Type   Bind   Vis      Ndx Name 
307: 0000000000000000     0 NOTYPE GLOBAL DEFAULT  UND
_ZNSs4_Rep10_M_disposeERKSaIcE

which I assume means "undefined" given what else I've learned here.  
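
The link between the two readelf listings can be checked by hand.  As a
sketch in Python, assuming the standard ELF64 layout in which the high
32 bits of r_info hold the symbol table index and the low 32 bits hold
the relocation type (the function names are mine), the "Info" word
from readelf -rW decodes to symbol 307, the entry shown above:

```python
R_X86_64_PC32 = 2  # relocation type number from the x86-64 psABI table


def decode_r_info(r_info):
    """Split an ELF64 r_info word into (symbol index, relocation type)."""
    return r_info >> 32, r_info & 0xFFFFFFFF


def to_signed64(value):
    """Interpret a 64-bit unsigned value as two's-complement signed."""
    return value - (1 << 64) if value & (1 << 63) else value


# Values copied from the readelf -rW line for the failing relocation.
sym_index, rel_type = decode_r_info(0x0000013300000002)
addend = to_signed64(0xFFFFFFFFFFFFFFFC)

print(sym_index, rel_type, addend)  # prints: 307 2 -4
```

So the relocation at 0x4e is of type 2 (R_X86_64_PC32), refers to
symbol table entry 307, and carries an addend of -4.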

Thanks again for taking the time to diagnose and explain what the
message meant in terms I can understand.  I've been programming a long
time and I've seen compiler bugs before, but always on the input side.
C++ is a big language, and the creative programmer eventually comes up
with a valid construct that doesn't parse.  This is the first time I've
seen invalid object code exit a compiler.  

Regards, 

--jkl
