* Functions and Global Variables defined in Libraries
From: Frederick Virchanza Gotham @ 2023-03-07 12:13 UTC
To: gcc-help

Let's say we have a program that links with a library that exports a
global variable and a function. So the library looks like this:

    int lib_global_variable = 0;
    void LibFunc(void) { }

The main program has the following declarations:

    extern int lib_global_variable;
    extern void LibFunc(void);

The program links fine and runs fine if we give the linker
"-L. -lname_of_library".

If we use the program "nm" on the main executable and grep for
"lib_global_variable" and "LibFunc", we see that both are listed as
undefined symbols:

    U lib_global_variable
    U _Z7LibFuncv

If we use 'readelf' on the main executable and grep for the same two
symbols, we see:

    000000003fc8  000700000006 R_X86_64_GLOB_DAT 0000000000000000 lib_global_variable + 0
    000000004038  000900000007 R_X86_64_JUMP_SLO 0000000000000000 _Z7LibFuncv + 0
         7: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND lib_global_variable
         9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z7LibFuncv
        39: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND lib_global_variable
        41: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND _Z7LibFuncv

I've been doing some testing and tinkering, and I've found that the
strategy of using 'dlopen' at runtime to load a library works fine so
long as the undefined symbol is listed under R_X86_64_JUMP_SLOT. It
doesn't work if the symbol is listed under R_X86_64_GLOB_DAT.

Typically all undefined functions get listed under R_X86_64_JUMP_SLOT,
and all global variables get listed under R_X86_64_GLOB_DAT. However,
it is possible to get functions listed under R_X86_64_GLOB_DAT, and my
strategy of using 'dlopen' doesn't work in that case.

It seems that GNU g++ by default puts the undefined function under
R_X86_64_JUMP_SLOT. However, if you use the address of the function at
all, for example:

    cout << (std::uintptr_t)(void*)LibFunc << endl;

then the function gets moved to R_X86_64_GLOB_DAT, and my strategy no
longer works, as 'dlopen' doesn't resolve the unresolved symbol.

So I'd like to ask two questions:

(1) Is the R_X86_64_JUMP_SLOT category just for functions, or can we
put global variables in there too? Is it possible to get 'dlopen' to
resolve global variables?

(2) Is there any way to stop the GNU linker from putting an undefined
function in R_X86_64_GLOB_DAT?

* Re: Functions and Global Variables defined in Libraries
From: Xi Ruoyao @ 2023-03-07 13:05 UTC
To: cauldwell.thomas, gcc-help

Neither R_X86_64_JUMP_SLOT nor R_X86_64_GLOB_DAT is emitted by GCC, so
the question is off-topic. I'll explain them briefly though.

On Tue, 2023-03-07 at 12:13 +0000, Frederick Virchanza Gotham via
Gcc-help wrote:
> (1) Is the R_X86_64_JUMP_SLOT category just for functions, or can we
> put global variables in there too? Is it possible to get 'dlopen' to
> resolve global variables?

No.

> (2) Is there any way to stop the GNU linker from putting an undefined
> function in R_X86_64_GLOB_DAT?

No, unless you avoid extracting its address.

You need to understand how R_X86_64_JUMP_SLOT works. When a program or
library is loaded, the dynamic linker does nothing to resolve it. When
you call a function foo in a shared library, the call is implemented by
first calling a function named foo@plt. foo@plt is in the main
executable. It loads the address of foo from the GOT and jumps to that
address. As R_X86_64_JUMP_SLOT is not handled by the dynamic linker at
load time, on the first call the GOT does not contain the address of
foo. Instead it contains the address of a "resolver function". The
resolver calculates the real address of foo, fills it into the GOT,
then jumps to that address. In subsequent calls foo@plt jumps directly
to foo, as the GOT already contains the address of foo.

This obviously won't work for a global variable because you can't call
it.

This won't work for a function pointer either: the value of the
function pointer must be the address of foo itself, not foo@plt.
Otherwise the result of &foo in the shared library and in the main
executable would be different, violating the C or C++ standard. Then we
must use R_X86_64_JUMP_SLOT which is handled by the dynamic linker.

--
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

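The lazy-binding sequence described above can be imitated in plain C. The
following is only a sketch under stated assumptions: `real_foo`,
`foo_got_slot`, `foo_resolver`, `foo_plt` and `lazy_binding_demo` are
hypothetical names invented for illustration, and an ordinary function
pointer stands in for the GOT entry. It is not how the dynamic linker is
actually implemented; it only shows the "resolve on first call, patch the
slot, jump directly afterwards" pattern.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a function that would live in a shared
   library. */
static int real_foo(int x) { return x + 1; }

static int foo_resolver(int x);

/* Stand-in for the GOT entry used by foo@plt: it starts out pointing
   at the resolver, not at real_foo. */
static int (*foo_got_slot)(int) = foo_resolver;

/* Counts resolver invocations so the laziness can be observed. */
static int resolver_calls = 0;

/* Stand-in for the resolver: look up the real address, patch the
   slot, then forward the call that triggered the resolution. */
static int foo_resolver(int x)
{
    resolver_calls++;
    foo_got_slot = real_foo;
    return real_foo(x);
}

/* Stand-in for foo@plt: always jumps through the slot. */
static int foo_plt(int x) { return foo_got_slot(x); }

/* Calls the stub twice and returns how often the resolver ran. */
int lazy_binding_demo(void)
{
    int first = foo_plt(1);   /* goes through the resolver */
    int second = foo_plt(2);  /* goes straight to real_foo */
    assert(first == 2 && second == 3);
    printf("resolver ran %d time(s)\n", resolver_calls);
    return resolver_calls;
}
```

Calling `lazy_binding_demo()` from a `main` runs the resolver only on the
first of the two calls. The sketch also hints at why this scheme cannot
serve a variable: there is no call that could trigger the resolver.
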
* Re: Functions and Global Variables defined in Libraries
From: Alexander Monakov @ 2023-03-07 13:25 UTC
To: Xi Ruoyao; +Cc: cauldwell.thomas, gcc-help

On Tue, 7 Mar 2023, Xi Ruoyao via Gcc-help wrote:

> This won't work for a function pointer either: the value of the
> function pointer must be the address of foo itself, not foo@plt. Or the
> result of &foo in the shared library and the main executable will be
> different, violating the C or C++ standard. Then we must use
> R_X86_64_JUMP_SLOT which is handled by the dynamic linker.

(you probably meant GLOB_DAT in the last statement)

This paragraph is inaccurate: traditional non-PIC, non-PIE codegen uses
direct symbol references. So, when you have a direct reference to foo
in non-PIC main executable, the reference is resolved to its PLT slot,
and the address of that PLT slot becomes the canonical address of 'foo'
for the whole program.

When the main executable is PIE, it may or may not have a PLT slot for
'foo', and if it doesn't, the canonical address of 'foo' is its actual
implementation.

Alexander

* Re: Functions and Global Variables defined in Libraries
From: Frederick Virchanza Gotham @ 2023-03-08 23:08 UTC
To: gcc-help

On Tue, Mar 7, 2023 at 1:05 PM Xi Ruoyao wrote:
>
> > (2) Is there any way to stop the GNU linker from putting an undefined
> > function in R_X86_64_GLOB_DAT?
>
> No, unless you avoid extracting its address.

I've been doing some major tinkering today. The entry point of an
executable is '_start'. So, first I wrote a new entry point in x64
assembler that could differentiate between GUI mode and console mode
depending upon the value of 'argc':

    ; This file contains x86_64 assembler for NASM, also known as x64.
    ; This file contains two functions:
    ;     static void print8bytes(uint64_t eight_chars,uint64_t new_line);
    ;     extern void pre_start(int argc);

    section .text

    print8bytes:
        ; This is a function that returns void
        ; Two parameters:
        ;     r9: The 8-byte string to print
        ;     r8: If true, prints a trailing new line

        ; save all the register values we're going to use
        push rax
        push rsi
        push rdi
        push rdx

        ;zero out the registers we are going to need
        xor rax, rax
        xor rsi, rsi
        xor rdi, rdi
        xor rdx, rdx

        ;write(int fd, char *msg, unsigned int len)
        mov al, 1
        add di, 1
        mov rsi, r9
        push rsi
        mov rsi, rsp
        mov dl, 8 ; Print 8 bytes at a time
        syscall
        pop rsi

        cmp r8, 1 ; check if r8 is true or false
        jl no_new_line

        ;zero out the registers we are going to need
        xor rax, rax
        xor rsi, rsi
        xor rdi, rdi
        xor rdx, rdx

        ;write(int fd, char *msg, unsigned int len)
        mov al, 1
        add di, 1
        mov rsi, 0x000000000000000a ; new line
        push rsi
        mov rsi, rsp
        mov dl, 1 ; Print just one byte
        syscall
        pop rsi

    no_new_line: ; just a jump label - not a function name
        pop rdx
        pop rdi
        pop rsi
        pop rax
        ret

    global pre_start:function
    pre_start:
        ; The 'argc' argument to 'main' is on the top of the stack so
        ; we will use the frame pointer 'rbp' to keep track of it.
        push rbp
        mov rbp, rsp
        push r9 ; save because we'll use it - pop it back later
        push r8 ; save because we'll use it - pop it back later
        mov r8, 0 ; false = don't put trailing new line
        mov r9, 0x3d3d3d3d3d3d3d3d ; "========"
        call print8bytes
        call print8bytes
        call print8bytes
        mov r9, 0x6174735f65727020 ; " pre_sta"
        call print8bytes
        cmp qword[rbp+8], 2 ; check if argc < 2
        jl $+2+10+2 ; if argc < 2 then we want GUI mode
        mov r9, 0x646d63202d207472 ; "rt - cmd"
        jmp $+2+10 ; skip the next 10-byte instruction
        mov r9, 0x495547202d207472 ; "rt - GUI"
        call print8bytes
        mov r9, 0x3d3d3d3d3d3d3d3d ; "========"
        call print8bytes
        call print8bytes
        mov r8, 1 ; true = put trailing new line
        call print8bytes
        pop r8
        pop r9
        mov rsp, rbp
        pop rbp
        extern _start
        jmp _start

If you see the last line there, I jump straight into _start. So then I
build my program with a new entry point as follows:

    g++ -o prog prog.cpp object_file_from_assembler.o -e pre_start

When I run it at the command line, the first thing I get is:

    ======================== pre_start - GUI========================

and then it continues execution as normal. No problems.

So then the next thing I did was I used 'patchelf' to remove the
NEEDED for the graphical user interface library:

    patchelf --remove-needed libgtk-3.so.0 ./prog

And then I tried to run it again, but this time around I got back:

    ./ssh: symbol lookup error: ./ssh: undefined symbol: gtk_true

This means that the program falls over ***before*** the entry point is
reached. So the part of the Linux operating system that loads
executable files is not even going into the entry point for my program;
it's falling over before then. I need to stop this happening somehow.
Perhaps I can put dummy values in the GOT so that the loader doesn't
think they're null?

On Tue, Mar 7, 2023 at 1:25 PM Alexander Monakov <amonakov@ispras.ru> wrote:
>
>
> On Tue, 7 Mar 2023, Xi Ruoyao via Gcc-help wrote:
>
> > This won't work for a function pointer either: the value of the
> > function pointer must be the address of foo itself, not foo@plt. Or the
> > result of &foo in the shared library and the main executable will be
> > different, violating the C or C++ standard. Then we must use
> > R_X86_64_JUMP_SLOT which is handled by the dynamic linker.
>
> (you probably meant GLOB_DAT in the last statement)
>
> This paragraph is inaccurate: traditional non-PIC, non-PIE codegen uses
> direct symbol references. So, when you have a direct reference to foo
> in non-PIC main executable, the reference is resolved to its PLT slot,
> and the address of that PLT slot becomes the canonical address of 'foo'
> for the whole program.
>
> When the main executable is PIE, it may or may not have a PLT slot for
> 'foo', and if it doesn't, the canonical address of 'foo' is its actual
> implementation.
>
> Alexander

* Re: Functions and Global Variables defined in Libraries
From: Xi Ruoyao @ 2023-03-09 2:47 UTC
To: cauldwell.thomas, gcc-help

On Wed, 2023-03-08 at 23:08 +0000, Frederick Virchanza Gotham via
Gcc-help wrote:
> When I run it at the command line, the first thing I get is:
>
>     ======================== pre_start - GUI========================
>
> and then it continues execution as normal. No problems.
>
> So then the next thing I did was I used 'patchelf' to remove the
> NEEDED for the graphical user interface library:
>
>     patchelf --remove-needed libgtk-3.so.0 ./prog
>
> And then I tried to run it again, but this time around I got back:
>
>     ./ssh: symbol lookup error: ./ssh: undefined symbol: gtk_true
>
> This means that the program falls over ***before*** the entry point is reached.

It does not happen to me:

    $ cat t.c
    #include <gtk/gtk.h>

    int main()
    {
        volatile int flag = 0;
        if (flag) {
            volatile int r = gtk_true ();
        }
    }
    $ cc t.c -I /usr/include/gtk-3.0 -I /usr/include/glib-2.0 -I /usr/lib/glib-2.0/include -I/usr/include/pango-1.0 -I/usr/include/harfbuzz -I/usr/include/cairo -I/usr/include/gdk-pixbuf-2.0 -I/usr/include/atk-1.0
    $ patchelf --remove-needed libgtk-3.so.0 ./a.out
    $ objdump -d | grep 'call.*gtk'
        117f: e8 ac fe ff ff    call 1030 <gtk_true@plt>
    $ objdump -T | grep gtk
    0000000000000000      DF *UND*  0000000000000000  Base        gtk_true
    $ { readelf -d a.out | grep gtk; } || echo "nothing"
    nothing
    $ ./a.out && echo "fine"
    fine

I guess it does not work for you because your distro has enabled
BIND_NOW (-Wl,-z,now) by default.

And anyway your question has nothing related to GCC. Try to find a more
proper channel to discuss it.

--
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

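Whether an executable was linked with BIND_NOW can also be checked from
inside the program. The following is a minimal sketch, assuming Linux with
glibc and a dynamically linked executable; `linked_with_bind_now` and
`report_binding_mode` are hypothetical helper names. It walks the program's
own dynamic section through the linker-provided `_DYNAMIC` symbol and looks
for the tags and flags that request eager binding.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stdio.h>

/* The linker defines _DYNAMIC as the start of the .dynamic section of
   a dynamically linked program.  glibc's <link.h> already declares
   it; the declaration is repeated here for clarity. */
extern ElfW(Dyn) _DYNAMIC[];

/* Returns 1 if the dynamic section of the running executable asks for
   eager binding (as -Wl,-z,now would), 0 otherwise. */
int linked_with_bind_now(void)
{
    for (const ElfW(Dyn) *d = _DYNAMIC; d->d_tag != DT_NULL; d++) {
        if (d->d_tag == DT_BIND_NOW)
            return 1;
        if (d->d_tag == DT_FLAGS && (d->d_un.d_val & DF_BIND_NOW))
            return 1;
        if (d->d_tag == DT_FLAGS_1 && (d->d_un.d_val & DF_1_NOW))
            return 1;
    }
    return 0;
}

/* Prints the result in readable form and returns it. */
int report_binding_mode(void)
{
    int now = linked_with_bind_now();
    assert(now == 0 || now == 1);
    printf("binding requested at link time: %s\n", now ? "now" : "lazy");
    return now;
}
```

The same entries are visible from outside the program with
`readelf -d`. Note that this only covers what was requested at link time;
the dynamic linker can also be told to bind eagerly at run time through the
`LD_BIND_NOW` environment variable.
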
* Re: Functions and Global Variables defined in Libraries
From: Frederick Virchanza Gotham @ 2023-03-09 8:01 UTC
To: gcc-help

On Thu, Mar 9, 2023 at 2:48 AM Xi Ruoyao <xry111@xry111.site> wrote:
>
> I guess it does not work for you because your distro has enabled
> BIND_NOW (-Wl,-z,now) by default.

This can be worked around, as the GNU linker allows us to build the
executable with -Wl,-z,lazy.

> int main()
> {
>     volatile int flag = 0;
>     if (flag) {
>         volatile int r = gtk_true ();
>     }
> }

Please add one more line to 'main':

    void *volatile p = (void*)gtk_true;

and test it again.

> And anyway your question has nothing related to GCC. Try to find a more
> proper channel to discuss it.

It is related to the GNU compiler suite, specifically the linker 'ld'
and how it generates the tables (GLOB_DAT, JUMP_SLOT, GOT, PLT).

* Re: Functions and Global Variables defined in Libraries
From: Xi Ruoyao @ 2023-03-09 8:05 UTC
To: cauldwell.thomas, gcc-help

On Thu, 2023-03-09 at 08:01 +0000, Frederick Virchanza Gotham via
Gcc-help wrote:
> Please add one more line to 'main':
>
>     void *volatile p = (void*)gtk_true;
>
> and test it again.

As I've explained, you can't do that.

> > And anyway your question has nothing related to GCC. Try to find a
> > more proper channel to discuss it.
>
> It is related to the GNU compiler suite, specifically the linker 'ld'
> and how it generates the tables (GLOB_DAT, JUMP_SLOT, GOT, PLT).

The GNU linker is not a part of GCC.

--
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

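One possible way to follow the earlier advice to "avoid extracting its
address" is to hand out the address of a small wrapper defined in the main
program, so that the library function is only ever called directly. The
sketch below is an assumption-laden illustration, not something proposed in
the thread: `getpid` is used purely as a stand-in for a library function
such as gtk_true, and `getpid_thunk`, `get_callback` and `thunk_demo` are
hypothetical names.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

typedef pid_t (*pid_fn)(void);

/* Wrapper defined in the main program.  It only *calls* getpid, so
   this file never takes the address of the library function. */
static pid_t getpid_thunk(void)
{
    return getpid();
}

/* Hands out the wrapper's address in place of &getpid. */
pid_fn get_callback(void)
{
    return getpid_thunk;
}

/* Calls through the pointer and checks that the result matches a
   direct call; returns 1 on agreement. */
int thunk_demo(void)
{
    pid_fn fn = get_callback();
    pid_t via_pointer = fn();
    pid_t direct = getpid();
    assert(via_pointer == direct);
    return via_pointer == direct;
}
```

The trade-off is that the pointer is the wrapper's address, not the
library function's, so it will not compare equal to the function's address
taken elsewhere. Which relocation the direct call ends up with still
depends on the toolchain and flags (for example `-fno-plt`), so the result
should be checked with readelf rather than assumed.
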
* Re: Functions and Global Variables defined in Libraries
From: Frederick Virchanza Gotham @ 2023-03-09 9:27 UTC
To: Xi Ruoyao; +Cc: gcc-help

On Thu, Mar 9, 2023 at 8:06 AM Xi Ruoyao <xry111@xry111.site> wrote:
>
>
> GNU linker is not a part of GCC.

I thought the name 'gcc' was used to lump everything in together, gcc
+ g++ + ld + cc1 and a few others.

I went looking for a forum / mailing list specifically for 'ld' but I
couldn't find one. Is there one?

* Re: Functions and Global Variables defined in Libraries
From: Andrew Haley @ 2023-03-09 9:53 UTC
To: gcc-help

On 3/9/23 09:27, Frederick Virchanza Gotham via Gcc-help wrote:
> I went looking for a forum / mailing list specifically for 'ld' but I
> couldn't find one. Is there one?

https://sourceware.org/mailman/listinfo/binutils

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671