public inbox for binutils@sourceware.org
* Behavior change, PLT entries for R_X86_64_PLT32 relocations with undefined symbol
@ 2019-03-19  2:07 Martin McClure
  2019-03-19  6:30 ` H.J. Lu
  0 siblings, 1 reply; 7+ messages in thread
From: Martin McClure @ 2019-03-19  2:07 UTC (permalink / raw)
  To: binutils

tl;dr: The linker produced PLT entries for undefined symbols with 
R_X86_64_PLT32 relocations up through Ubuntu 16.04; but produces calls 
to empty PLT entries in Ubuntu 18.04. Has this usage ever been legal, 
and if so how can it be made to work now?

(very simplified) reproduction case, x86_64:

----

library.c

int answer() {
     return 42;
}

----

executable.c

#include <dlfcn.h>
#include <stdio.h>

int answer();

int main()
{
     void *lib = dlopen("./library.so", RTLD_LAZY | RTLD_GLOBAL);
     if (!lib) {
         printf("dlopen failed: %s\n", dlerror());
         return 1;
     }
     printf("The answer is %d\n", answer());
     return 0;
}

----


Compile and link:

gcc -fpic library.c -c -o library.o
gcc -shared -o library.so library.o

gcc -c -fPIC executable.c -o executable.o
gcc executable.o -lc -ldl -Wl,--unresolved-symbols=ignore-all -o executable

---

This produces a functional executable in Ubuntu versions up to and 
including 16.04 (gcc 5.4.0, ld 2.26.1) but fails in Ubuntu 18.04 (gcc 
7.3.0, ld 2.30). Another difference is that Ubuntu 18.04 configures with 
--enable-default-pie. Full configuration info below for reference.

In the success case, ld produces a PLT entry for the symbol "answer" and 
the call to answer() goes to that entry.
In the failure case, the call to answer() goes to an offset in the PLT, 
but that offset in the PLT is empty (zeroes).

THE QUESTION: Is it supported usage to expect the linker to produce a 
PLT entry from a R_X86_64_PLT32 relocation, even if the symbol is 
undefined? And if so, is the failure I'm seeing a matter of incorrect 
invocation, or a bug?

Why do I want this to work? The code base I work with has been using 
this for quite a few years (long before I got involved). There are two 
shared libraries that each implement the same few hundred functions, but 
implement them differently (one locally, the other via RPC). By having 
the executable dlopen() the version it wants to use, it can choose the 
implementation at runtime rather than at link time.

Thanks for considering this question. Let me know if I've been unclear 
or omitted any useful information.

Regards,
-Martin




Configuration for Ubuntu 16.04:

Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 
5.4.0-6ubuntu1~16.04.11' 
--with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs 
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ 
--prefix=/usr --program-suffix=-5 --enable-shared 
--enable-linker-build-id --libexecdir=/usr/lib 
--without-included-gettext --enable-threads=posix --libdir=/usr/lib 
--enable-nls --with-sysroot=/ --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-libstdcxx-time=yes 
--with-default-libstdcxx-abi=new --enable-gnu-unique-object 
--disable-vtable-verify --enable-libmpx --enable-plugin 
--with-system-zlib --disable-browser-plugin --enable-java-awt=gtk 
--enable-gtk-cairo 
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre 
--enable-java-home 
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 
--with-arch-directory=amd64 
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc 
--enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic 
--enable-checking=release --build=x86_64-linux-gnu 
--host=x86_64-linux-gnu --target=x86_64-linux-gnu

Configuration for Ubuntu 18.04:

Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 
7.3.0-27ubuntu1~18.04' 
--with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs 
--enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ 
--prefix=/usr --with-gcc-major-version-only --program-suffix=-7 
--program-prefix=x86_64-linux-gnu- --enable-shared 
--enable-linker-build-id --libexecdir=/usr/lib 
--without-included-gettext --enable-threads=posix --libdir=/usr/lib 
--enable-nls --with-sysroot=/ --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-libstdcxx-time=yes 
--with-default-libstdcxx-abi=new --enable-gnu-unique-object 
--disable-vtable-verify --enable-libmpx --enable-plugin 
--enable-default-pie --with-system-zlib --with-target-system-zlib 
--enable-objc-gc=auto --enable-multiarch --disable-werror 
--with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 
--enable-multilib --with-tune=generic 
--enable-offload-targets=nvptx-none --without-cuda-driver 
--enable-checking=release --build=x86_64-linux-gnu 
--host=x86_64-linux-gnu --target=x86_64-linux-gnu


* Re: Behavior change, PLT entries for R_X86_64_PLT32 relocations with undefined symbol
  2019-03-19  2:07 Behavior change, PLT entries for R_X86_64_PLT32 relocations with undefined symbol Martin McClure
@ 2019-03-19  6:30 ` H.J. Lu
  2019-03-19 21:44   ` Martin McClure
  2019-03-25 19:28   ` Michael Matz
  0 siblings, 2 replies; 7+ messages in thread
From: H.J. Lu @ 2019-03-19  6:30 UTC (permalink / raw)
  To: Martin McClure; +Cc: Binutils

On Tue, Mar 19, 2019 at 10:08 AM Martin McClure
<martin.mcclure@gemtalksystems.com> wrote:
>
> tl;dr: The linker produced PLT entries for undefined symbols with
> R_X86_64_PLT32 relocations up through Ubuntu 16.04; but produces calls
> to empty PLT entries in Ubuntu 18.04. Has this usage ever been legal,
> and if so how can it be made to work now?
>
> (very simplified) reproduction case, x86_64:
>
> ----
>
> library.c
>
> int answer() {
>      return 42;
> }
>
> ----
>
> executable.c
>
> #include <dlfcn.h>
>
> int answer();
>
> int main()
> {
>      void *lib = dlopen("./library.so", RTLD_LAZY | RTLD_GLOBAL);
>      if (!lib) {
>          printf("dlopen failed");
>      }
>      printf("The answer is %d\n", answer());
>      return 0;
> }
>
> ----
>
>
> Compile and link:
>
> gcc -fpic library.c -c -o library.o
> gcc -shared -o library.so library.o
>
> gcc -c -fPIC executable.c -o executable.o
> gcc executable.o -lc -ldl -Wl,--unresolved-symbols=ignore-all -o executable
>
> ---
>
> This produces a functional executable in Ubuntu versions up to and
> including 16.04 (gcc 5.4.0, ld 2.26.1) but fails in Ubuntu 18.04 (gcc
> 7.3.0, ld 2.30). Another difference is that Ubuntu 18.04 configures with
> --enable-default-pie. Full configuration info below for reference.
>
> In the success case, ld produces a PLT entry for the symbol "answer" and
> the call to answer() goes to that entry.
> In the failure case, the call to answer() goes to an offset in the PLT,
> but that offset in the PLT is empty (zeroes).
>
> THE QUESTION: Is it supported usage to expect the linker to produce a
> PLT entry from a R_X86_64_PLT32 relocation, even if the symbol is
> undefined? And if so, is the failure I'm seeing a matter of incorrect
> invocation, or a bug?
>
> Why do I want this to work? The code base I work with has been using
> this for quite a few years (long before I got involved). There are two
> shared libraries that each implement the same few hundred functions, but
> implement them differently (one locally, the other via RPC). By having
> the executable dlopen() the version it wants to use, it can choose the
> implementation at runtime rather than at link time.

Since answer is undefined, its behavior is undefined.

-- 
H.J.


* Re: Behavior change, PLT entries for R_X86_64_PLT32 relocations with undefined symbol
  2019-03-19  6:30 ` H.J. Lu
@ 2019-03-19 21:44   ` Martin McClure
  2019-03-25 19:28   ` Michael Matz
  1 sibling, 0 replies; 7+ messages in thread
From: Martin McClure @ 2019-03-19 21:44 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Binutils

On 3/18/19 11:30 PM, H.J. Lu wrote:
> On Tue, Mar 19, 2019 at 10:08 AM Martin McClure
> <martin.mcclure@gemtalksystems.com> wrote:
>> tl;dr: The linker produced PLT entries for undefined symbols with
>> R_X86_64_PLT32 relocations up through Ubuntu 16.04; but produces calls
>> to empty PLT entries in Ubuntu 18.04. Has this usage ever been legal,
>> and if so how can it be made to work now?
>>
[...]
> Since answer is undefined, its behavior is undefined.
>
Thank you, that is useful information.

Regards,

-Martin


* Re: Behavior change, PLT entries for R_X86_64_PLT32 relocations with undefined symbol
  2019-03-19  6:30 ` H.J. Lu
  2019-03-19 21:44   ` Martin McClure
@ 2019-03-25 19:28   ` Michael Matz
  2019-03-25 19:33     ` Martin McClure
  1 sibling, 1 reply; 7+ messages in thread
From: Michael Matz @ 2019-03-25 19:28 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Martin McClure, Binutils

Hi,

On Tue, 19 Mar 2019, H.J. Lu wrote:

> > library.c
> >
> > int answer() {
> >      return 42;
> > }
> >
> > ----
> >
> > executable.c
> >
> > #include <dlfcn.h>
> >
> > int answer();
> >
> > int main()
> > {
> >      void *lib = dlopen("./library.so", RTLD_LAZY | RTLD_GLOBAL);
> >      if (!lib) {
> >          printf("dlopen failed");
> >      }
> >      printf("The answer is %d\n", answer());
> >      return 0;
> > }
> 
> Since answer is undefined, its behavior is undefined.

You're making this sound more clear-cut than it is, and I disagree with 
it.  Clearly, at runtime, the symbol 'answer' is resolvable just fine: the 
loaded library contains a global definition.  So with lazy resolution it'd 
work.  Note that the user explicitly requested the acceptance of 
unresolved symbols with --unresolved-symbols=ignore-all (and did not 
request non-lazy loading), so I would fully expect that unresolved global 
symbols will be made dynamic symbols even for executables.

Martin: there is a workaround for you for now: declare the functions in 
question as weak in the objects making use of them:

------
int answer() __attribute__((weak));
...
     printf("The answer is %d\n", answer());
...
------

You might or might not have to use the '-z dynamic-undefined-weak' link 
editor option for this to work.
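
Applied to the reproduction case, the weak declaration might look like the 
sketch below.  The wrapper name run_with() is made up for illustration and 
is not part of the original program; it also bails out when dlopen() 
fails, because calling an unresolved weak function would jump to address 0.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Weak reference: the static link succeeds even though no object on the
   link line defines answer(). */
int answer(void) __attribute__((weak));

/* Hypothetical wrapper around the thread's main(): load the library at
   `path`, then call answer() through the weak reference.
   Returns 0 on success, 1 if the library could not be loaded. */
int run_with(const char *path)
{
    assert(path != NULL);

    void *lib = dlopen(path, RTLD_LAZY | RTLD_GLOBAL);
    if (lib == NULL) {
        /* Do not fall through: answer() is still unresolved here. */
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    printf("The answer is %d\n", answer());
    return 0;
}
```

A main() would simply return run_with("./library.so").  On glibc older 
than 2.34 the link line still needs -ldl.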


Ciao,
Michael.


* Re: Behavior change, PLT entries for R_X86_64_PLT32 relocations with undefined symbol
  2019-03-25 19:28   ` Michael Matz
@ 2019-03-25 19:33     ` Martin McClure
  2019-03-25 21:14       ` Martin McClure
  0 siblings, 1 reply; 7+ messages in thread
From: Martin McClure @ 2019-03-25 19:33 UTC (permalink / raw)
  To: Michael Matz, H.J. Lu; +Cc: Binutils

On 3/25/19 12:28 PM, Michael Matz wrote:
> Hi,
> 
> On Tue, 19 Mar 2019, H.J. Lu wrote:
> 
>>> library.c
>>>
>>> int answer() {
>>>       return 42;
>>> }
>>>
>>> ----
>>>
>>> executable.c
>>>
>>> #include <dlfcn.h>
>>>
>>> int answer();
>>>
>>> int main()
>>> {
>>>       void *lib = dlopen("./library.so", RTLD_LAZY | RTLD_GLOBAL);
>>>       if (!lib) {
>>>           printf("dlopen failed");
>>>       }
>>>       printf("The answer is %d\n", answer());
>>>       return 0;
>>> }
>>
>> Since answer is undefined, its behavior is undefined.
> 
> You're making this sound more clear-cut than it is, and I disagree with
> it.  Clearly, at runtime, the symbol 'answer' is resolvable just fine: the
> loaded library contains a global definition.  So with lazy resolution it'd
> work.  Note that the user explicitly requested the acceptance of
> unresolved symbols with --unresolved-symbols=ignore-all (and did not
> request non-lazy loading), so I would fully expect that unresolved global
> symbols will be made dynamic symbols even for executables.
> 
> Martin: there is a workaround for you for now: declare the functions in
> question as weak in the objects making use of them:
> 
> ------
> int answer() __attribute__((weak));
> ...
>       printf("The answer is %d\n", answer());
> ...
> ------
> 
> You might or might not have to use the '-z dynamic-undefined-weak' link
> editor option for this to work.
> 

Thanks, Michael. I believe I tried making the symbol weak, and it failed 
to work, but I may have done it wrong, and I probably did not use '-z 
dynamic-undefined-weak', so I'll try again.

Regards,
-Martin


* Re: Behavior change, PLT entries for R_X86_64_PLT32 relocations with undefined symbol
  2019-03-25 19:33     ` Martin McClure
@ 2019-03-25 21:14       ` Martin McClure
  2019-03-26 15:31         ` Michael Matz
  0 siblings, 1 reply; 7+ messages in thread
From: Martin McClure @ 2019-03-25 21:14 UTC (permalink / raw)
  To: Michael Matz, H.J. Lu; +Cc: Binutils

On 3/25/19 12:32 PM, Martin McClure wrote:
> On 3/25/19 12:28 PM, Michael Matz wrote:
>> Hi,
>>
>> On Tue, 19 Mar 2019, H.J. Lu wrote:
>>
>>>> library.c
>>>>
>>>> int answer() {
>>>>       return 42;
>>>> }
>>>>
>>>> ----
>>>>
>>>> executable.c
>>>>
>>>> #include <dlfcn.h>
>>>>
>>>> int answer();
>>>>
>>>> int main()
>>>> {
>>>>       void *lib = dlopen("./library.so", RTLD_LAZY | RTLD_GLOBAL);
>>>>       if (!lib) {
>>>>           printf("dlopen failed");
>>>>       }
>>>>       printf("The answer is %d\n", answer());
>>>>       return 0;
>>>> }
>>>
>>> Since answer is undefined, its behavior is undefined.
>>
>> You're making this sound more clear-cut than it is, and I disagree with
>> it.  Clearly, at runtime, the symbol 'answer' is resolvable just fine: the
>> loaded library contains a global definition.  So with lazy resolution it'd
>> work.  Note that the user explicitly requested the acceptance of
>> unresolved symbols with --unresolved-symbols=ignore-all (and did not
>> request non-lazy loading), so I would fully expect that unresolved global
>> symbols will be made dynamic symbols even for executables.
>>
>> Martin: there is a workaround for you for now: declare the functions in
>> question as weak in the objects making use of them:
>>
>> ------
>> int answer() __attribute__((weak));
>> ...
>>       printf("The answer is %d\n", answer());
>> ...
>> ------
>>
>> You might or might not have to use the '-z dynamic-undefined-weak' link
>> editor option for this to work.
>>
> 
> Thanks, Michael. I believe I tried making the symbol weak, and it failed 
> to work, but I may have done it wrong, and I probably did not use '-z 
> dynamic-undefined-weak', so I'll try again.

Using the weak symbol worked nicely -- thanks again! The '-z 
dynamic-undefined-weak' option was not required. I like this better than 
our previous use of --unresolved-symbols=ignore-all since that did not 
detect misspelled function names until runtime.

Is there any drawback to using weak symbols? I don't expect any strong 
symbols with these names to be defined anywhere outside of the 
dynamically loaded library.

Is there any better way to implement this kind of pattern? I looked for 
an option that, like -l, links against a dynamic library but, unlike -l, 
does not add a DT_NEEDED entry for it, so that the specific library could 
be chosen at runtime. However, if there is such an option I did not 
find it.

Regards,
-Martin


* Re: Behavior change, PLT entries for R_X86_64_PLT32 relocations with undefined symbol
  2019-03-25 21:14       ` Martin McClure
@ 2019-03-26 15:31         ` Michael Matz
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Matz @ 2019-03-26 15:31 UTC (permalink / raw)
  To: Martin McClure; +Cc: H.J. Lu, Binutils

Hi,

On Mon, 25 Mar 2019, Martin McClure wrote:

> Using the weak symbol worked nicely -- thanks again! The '-z 
> dynamic-undefined-weak' option was not required. I like this better than 
> our previous use of --unresolved-symbols=ignore-all since that did not 
> detect misspelled function names until runtime.

With weak symbols you won't have such detection either.  Weak undefined 
symbols will simply turn out to be NULL and crash the same way.

> Is there any drawback to using weak symbols? I don't expect any strong 
> symbols with these names to be defined anywhere outside of the 
> dynamically loaded library.

Then weak symbols won't have any drawback in your scenario.  Weak 
references aren't much different from undefined (in the current DSO) 
global symbols; weak definitions would be, but you don't have those.

> Is there any better way to implement this kind of pattern? I looked for 
> an option that, like -l, links against a dynamic library but, unlike -l, 
> does not add a DT_NEEDED entry for it, so that the specific library could 
> be chosen at runtime. However, if there is such an option I did not 
> find it.

There is one way: --dynamic-list.  You can force certain symbols to be 
dynamic, which luckily still includes undefined symbols:

% cat symbols.list
{
answer;
};
% gcc executable.o -lc -ldl -Wl,--unresolved-symbols=ignore-all \
  -Wl,--dynamic-list,symbols.list  -o executable

(remove the weak again to see it working for real).

But you still need the --unresolved-symbols=ignore-all to not get an 
error, so you still get no nice checking for misspellings.
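
For completeness, one pattern that avoids undefined symbols at link time 
altogether is to resolve each function explicitly with dlsym() into a 
table of function pointers.  This is not something proposed elsewhere in 
this thread, only a minimal sketch; the names struct api and load_api() 
are invented for illustration.  With it, a misspelled symbol name is 
reported when the library is loaded rather than at the first call, and 
neither --unresolved-symbols=ignore-all nor weak declarations are needed.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical table of entry points; the real code base would list its
   few hundred functions here.  The thread's example has only answer(). */
struct api {
    int (*answer)(void);
};

/* Load the library at `path` and bind every entry of the table.
   Returns 0 on success, -1 if the library or a symbol is missing. */
int load_api(const char *path, struct api *api)
{
    assert(path != NULL && api != NULL);

    void *lib = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (lib == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    /* Idiom from the POSIX dlsym() rationale for storing the returned
       void * into a function pointer. */
    *(void **)&api->answer = dlsym(lib, "answer");
    if (api->answer == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(lib);
        return -1;
    }
    return 0;
}
```

Callers then write api.answer() instead of answer(), so the compiler 
checks every call site against the table.  The cost is one indirection 
per call and a table that has to be kept in sync with the libraries.  On 
glibc older than 2.34, link with -ldl.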


Ciao,
Michael.
