public inbox for binutils@sourceware.org
* Why GROUP(...) rather than INPUT(...) is used here?
@ 2023-11-17  9:47 rednoah
  2023-11-20 15:23 ` Nick Clifton
  0 siblings, 1 reply; 3+ messages in thread
From: rednoah @ 2023-11-17  9:47 UTC (permalink / raw)
  To: binutils

Hi all:
Adding -v to the gcc command line when building a simple m.c shows that
-lgcc_s is passed to the linker:
$ gcc -v m.c -Wl,-t
....
 /usr/lib/gcc/x86_64-linux-gnu/9/collect2 -plugin /usr/lib/gcc/x86_64-linux-gnu/9/liblto_plugin.so
 -plugin-opt=/usr/lib/gcc/x86_64-linux-gnu/9/lto-wrapper -plugin-opt=-fresolution=/tmp/cccTnaHb.res
 -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc
 -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --build-id --eh-frame-hdr -m elf_x86_64
 --hash-style=gnu --as-needed -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie -z now -z relro
 /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/Scrt1.o
 /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crti.o
 /usr/lib/gcc/x86_64-linux-gnu/9/crtbeginS.o -L/usr/lib/gcc/x86_64-linux-gnu/9
 -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/9/../../../../lib
 -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/9/../../..
 /tmp/ccYxryTe.o -t -lgcc --push-state --as-needed -lgcc_s --pop-state -lc -lgcc --push-state --as-needed -lgcc_s
 --pop-state /usr/lib/gcc/x86_64-linux-gnu/9/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crtn.o
/usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/Scrt1.o
/usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crti.o
/usr/lib/gcc/x86_64-linux-gnu/9/crtbeginS.o
/tmp/ccYxryTe.o
/usr/lib/gcc/x86_64-linux-gnu/9/libgcc.a <= libgcc.a is processed for the 1st time
/usr/lib/gcc/x86_64-linux-gnu/9/libgcc_s.so
/usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libgcc_s.so.1
/usr/lib/gcc/x86_64-linux-gnu/9/libgcc.a <= libgcc.a is processed for the 2nd time
/usr/lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/libc.so
...
The "-Wl,-t" output shows that libgcc.a is processed twice, as marked above.

On Ubuntu 20.04, libgcc_s.so is actually a linker script:
$ cat  /usr/lib/gcc/x86_64-linux-gnu/9/libgcc_s.so
/* GNU ld script
   Use the shared library, but some functions are only in
   the static library.  */
GROUP ( libgcc_s.so.1 -lgcc )

The "-lgcc" here resolves to libgcc.a. "libgcc_s.so.1" is an ELF
shared object, so there is only one archive file, "libgcc.a", in GROUP(...).
GROUP(...) is equivalent to "--start-group ... --end-group", whose purpose is
to enclose two or more archive files so that they are searched repeatedly for
undefined symbols. There seems to be no need to use it when it contains only
one archive file. Maybe INPUT(...) is a better choice here?

To verify this I wrote some toy .c files, and there seems to be no problem:
m.c:
int _start(){
    return b1();
}
b0.c:
int b0() {
    return 0;
}
b1.c:
int b1() {
    return b0();
}
$ gcc -c -fpic *.c
$ ar rcs b.a b0.o b1.o <= b0.o is placed before b1.o
$ gcc -nostdlib -Wl,-t,-Map=1.map,-verbose m.o b.a
......
/usr/bin/ld: mode elf_x86_64
attempt to open m.o succeeded
m.o
attempt to open b.a succeeded
b.a
(b.a)b1.o
(b.a)b0.o

In archive b.a, b0.o comes before b1.o, and b1.o references b0, which is
defined in b0.o. Although b.a is processed only once, there is no error. I
guess this is achieved through the archive index, which lists all the
symbols defined by each .o in b.a.
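
A minimal sketch of that guess, written as a toy model in C and not taken
from ld's actual code: each archive member defines one symbol and may
reference one other, and the archive is rescanned until a full pass loads
nothing new. The names struct member, struct symset and search_archive are
hypothetical, invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 16

/* Toy archive member: defines one symbol, optionally references another. */
struct member {
    const char *name;     /* member name, e.g. "b1.o" */
    const char *defines;  /* symbol the member defines */
    const char *needs;    /* symbol it references, or NULL */
    bool loaded;          /* set once the member is pulled into the link */
};

/* Small fixed-size set of symbol names. */
struct symset {
    const char *names[MAX_SYMS];
    size_t count;
};

static bool symset_has(const struct symset *s, const char *sym)
{
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->names[i], sym) == 0)
            return true;
    return false;
}

static void symset_add(struct symset *s, const char *sym)
{
    if (symset_has(s, sym))
        return;
    assert(s->count < MAX_SYMS);
    s->names[s->count++] = sym;
}

/*
 * Search one archive.  A member is loaded when it defines a symbol that is
 * referenced but not yet defined.  With rescan == true the scan repeats
 * until a whole pass loads nothing new; with rescan == false it stops
 * after a single pass.  Returns the number of members loaded.
 */
size_t search_archive(struct member *ar, size_t n,
                      struct symset *referenced,
                      struct symset *defined, bool rescan)
{
    size_t loaded = 0;
    bool progress;

    do {
        progress = false;
        for (size_t i = 0; i < n; i++) {
            struct member *m = &ar[i];
            if (m->loaded || !symset_has(referenced, m->defines)
                || symset_has(defined, m->defines))
                continue;
            m->loaded = true;
            symset_add(defined, m->defines);
            if (m->needs != NULL)
                symset_add(referenced, m->needs);
            loaded++;
            progress = true;
        }
    } while (rescan && progress);

    return loaded;
}
```

With b.a modelled as { b0.o, b1.o } and b1 already referenced by m.o,
search_archive(..., true) loads b1.o on the first pass and b0.o on the
second, the same order as in the -t trace above. With rescan set to false
only b1.o is loaded and b0 stays undefined.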

The following emulates the aforementioned gcc behavior: m.o and b.a
are enclosed in --start-group/--end-group:
$ gcc -nostdlib -Wl,-t,-Map=1.map,-verbose '-Wl,-(' m.o b.a '-Wl,-)'
...
/usr/bin/ld: mode elf_x86_64
attempt to open m.o succeeded
m.o
attempt to open b.a succeeded
b.a
(b.a)b1.o
(b.a)b0.o
b.a

There is no error and b.a is processed twice, just as libgcc.a is
processed twice.
Could anyone help explain why the libgcc_s.so script uses GROUP(...)
for only one archive: GROUP ( libgcc_s.so.1 -lgcc )?
Can it be replaced with INPUT(...)?

Thanks


* Re: Why GROUP(...) rather than INPUT(...) is used here?
  2023-11-17  9:47 Why GROUP(...) rather than INPUT(...) is used here? rednoah
@ 2023-11-20 15:23 ` Nick Clifton
  2023-11-22  0:00   ` rednoah
  0 siblings, 1 reply; 3+ messages in thread
From: Nick Clifton @ 2023-11-20 15:23 UTC (permalink / raw)
  To: rednoah, binutils

Hi rednoah,

> GROUP ( libgcc_s.so.1 -lgcc )
> 
> The "-lgcc" here resolves to libgcc.a. "libgcc_s.so.1" is an ELF
> shared object, so there is only one archive file, "libgcc.a", in GROUP(...).
> GROUP(...) is equivalent to "--start-group ... --end-group", whose purpose is
> to enclose two or more archive files so that they are searched repeatedly for
> undefined symbols. There seems to be no need to use it when it contains only
> one archive file. Maybe INPUT(...) is a better choice here?

Well... there is a theoretical difference.  The thing is, the GROUP
construct means that if there is code in libgcc.a that needs functions
provided by libgcc_s.so.1 then the shared library will be included even
if no other code needs libgcc_s.so.1.

Here is a rather contrived example:

   $ cat main.c

   extern void atexit (int);
   int __dso_handle;
   int main (void)
   {
      atexit (0);
      return 0;
   }

   $ cat /usr/lib64/libc.so

   /* GNU ld script
      Use the shared library, but some functions are only in
      the static library, so try that secondarily.  */
   OUTPUT_FORMAT(elf64-x86-64)
   GROUP ( /lib64/libc.so.6 /usr/lib64/libc_nonshared.a  AS_NEEDED ( /lib64/ld-linux-x86-64.so.2 ) )

I am running this test on my Fedora 38 box, so the libraries are
slightly different from the ones you are using, but the test does
show my first point, which is that main.c does not call any
functions in the shared C library (libc.so.6) but does call
a function in the static C library (libc_nonshared.a).

So, if I compile and then link my program with the "libc.so" fake
C library, everything works:

   $ gcc -c -fPIC main.c
   $ ld -e 0 main.o -L/usr/lib64 --as-needed -lc

The GROUP command has caused the linker to deduce that the real
shared C library is needed:

   $ ldd a.out
   linux-vdso.so.1 (0x00007fff6665e000)
   libc.so.6 => /lib64/libc.so.6 (0x000014fb0b800000)
   /lib/ld64.so.1 => /lib64/ld-linux-x86-64.so.2 (0x000014fb0b9f9000)

But if I create an alternative version of libc.so that lists each
library in its own directive instead of in a single GROUP directive:

   $ cat libfred.so

   /* GNU ld script
      Use the shared library, but some functions are only in
      the static library, so try that secondarily.  */
   OUTPUT_FORMAT(elf64-x86-64)
   INPUT(/lib64/libc.so.6)
   GROUP(/usr/lib64/libc_nonshared.a)
   INPUT(/lib64/ld-linux-x86-64.so.2)

and then try to use it...

   $ ld -e 0 main.o -L/usr/lib64 --as-needed -L. -lfred
   ld: /usr/lib64/libc_nonshared.a(atexit.oS): in function `atexit':
   (.text+0xe): undefined reference to `__cxa_atexit'

...the link fails.

Of course this can be fixed by moving the INPUT(/lib64/libc.so.6)
to after the GROUP(/usr/lib64/libc_nonshared.a).  But what if there
is a function in libc.so.6 that needs code in libc_nonshared.a ?

This is all mostly theoretical of course, since in real life
these circumstances are very unlikely to occur.  But why take the
chance ?  Using GROUP() works just as well as INPUT(), does not
cost much more (since there is only one static library involved
and it is not that big), and it means that the glibc maintainers
do not have to worry about some future scenario where an unexpected
dependency between the libraries does occur.
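
The ordering problem described above can be sketched with a toy model in C.
This is a simplification under stated assumptions, not how ld is
implemented: each library exports one symbol and needs at most one, and a
library is used only if it defines the symbol that is unresolved when the
scan reaches it (standing in for both archive searching and --as-needed).
struct lib and resolve are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy library: one exported symbol, at most one symbol it needs itself. */
struct lib {
    const char *name;     /* e.g. "libc_nonshared.a" */
    const char *defines;  /* symbol the library provides */
    const char *needs;    /* symbol it references, or NULL */
    bool used;            /* set once the library is pulled into the link */
};

/* True if an already-used library defines sym. */
static bool is_defined(const struct lib *libs, size_t n, const char *sym)
{
    for (size_t i = 0; i < n; i++)
        if (libs[i].used && strcmp(libs[i].defines, sym) == 0)
            return true;
    return false;
}

/*
 * Resolve `wanted` against libs[0..n).  group == false scans the list once,
 * like consecutive INPUT() entries; group == true rescans until a full pass
 * makes no progress, like GROUP().  Returns the symbol that is still
 * unresolved, or NULL if everything was resolved.
 */
const char *resolve(struct lib *libs, size_t n, const char *wanted, bool group)
{
    bool progress;

    assert(libs != NULL || n == 0);

    do {
        progress = false;
        for (size_t i = 0; i < n && wanted != NULL; i++) {
            if (libs[i].used || strcmp(libs[i].defines, wanted) != 0)
                continue;
            libs[i].used = true;
            progress = true;
            /* The library's own reference becomes the next symbol to
               resolve, unless an already-used library defines it. */
            wanted = libs[i].needs;
            if (wanted != NULL && is_defined(libs, n, wanted))
                wanted = NULL;
        }
    } while (group && progress && wanted != NULL);

    return wanted;
}
```

With libs = { libc.so.6, libc_nonshared.a } and wanted = "atexit",
resolve(..., false) returns "__cxa_atexit", mirroring the failed libfred.so
link, while resolve(..., true) returns NULL because the rescan picks up
libc.so.6 on the second pass. Swapping the two entries also makes the
single-pass call succeed, which matches the reordering fix mentioned above.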

Cheers
   Nick

* Why GROUP(...) rather than INPUT(...) is used here?
  2023-11-20 15:23 ` Nick Clifton
@ 2023-11-22  0:00   ` rednoah
  0 siblings, 0 replies; 3+ messages in thread
From: rednoah @ 2023-11-22  0:00 UTC (permalink / raw)
  To: Nick Clifton; +Cc: binutils

Hi Nick:
I got it. Thanks very much for your detailed explanation.
regards