* [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Wilco Dijkstra @ 2022-05-24 13:46 UTC
To: Szabolcs Nagy, maskray; +Cc: 'GNU C Library'

Hi,

>> * Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
>>   no GOT-generating relocation in the first place.

We should change GCC's behaviour to match this - is this something that
applies to all targets?

>> * gold and lld reject copy relocation on a STV_PROTECTED symbol.
>> * Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
>>   GOT-generating relocation when accessing an default visibility
>>   external symbol which avoids copy relocation.

Would it be reasonable to add a way to override settings for binaries?
For example if all imported symbols are marked with the correct visibility,
PIE binaries could avoid using GOT for default visibility external symbols to
get better performance. And non-PIE binaries could force GOT accesses for
non-default visibility to avoid copy relocations and support protected visibility.

Cheers,
Wilco
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: maskray @ 2022-05-24 17:28 UTC
To: Wilco Dijkstra; +Cc: Szabolcs Nagy, 'GNU C Library'

On 2022-05-24, Wilco Dijkstra wrote:
>Hi,

Hi Wilco,

>>> * Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
>>>   no GOT-generating relocation in the first place.
>
>We should change GCC's behaviour to match this - is this something that
>applies to all targets?

I have a blog post about copy relocations and protected symbols:
https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected

"The GCC pessimization applies to all ports with
#define TARGET_BINDS_LOCAL_P default_binds_local_p_2."

The aarch64 pessimization was a side effect of a commit attempting to fix
another issue: commit cbddf64c0243816b45e6680754a251c603245dbc
(From-SVN: r222992)

>>> * gold and lld reject copy relocation on a STV_PROTECTED symbol.
>>> * Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
>>>   GOT-generating relocation when accessing an default visibility
>>>   external symbol which avoids copy relocation.
>
>Would it be reasonable to add a way to override settings for binaries?
>For example if all imported symbols are marked with the correct visibility,
>PIE binaries could avoid using GOT for default visibility external symbols to
>get better performance. And non-PIE binaries could force GOT accesses for
>non-default visibility to avoid copy relocations and support protected visibility.

In Clang, -fno-direct-access-external-data is such an option :)

---

Regarding function symbols, there is a bit which should be fixed:
"Protected function symbols and canonical PLT entries"
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: H.J. Lu @ 2022-05-24 21:58 UTC
To: Wilco Dijkstra; +Cc: Szabolcs Nagy, maskray, GNU C Library

On Tue, May 24, 2022 at 6:47 AM Wilco Dijkstra via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Hi,
>
> >> * Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
> >>   no GOT-generating relocation in the first place.
>
> We should change GCC's behaviour to match this - is this something that
> applies to all targets?
>
> >> * gold and lld reject copy relocation on a STV_PROTECTED symbol.
> >> * Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
> >>   GOT-generating relocation when accessing an default visibility
> >>   external symbol which avoids copy relocation.
>
> Would it be reasonable to add a way to override settings for binaries?
> For example if all imported symbols are marked with the correct visibility,
> PIE binaries could avoid using GOT for default visibility external symbols to
> get better performance. And non-PIE binaries could force GOT accesses for
> non-default visibility to avoid copy relocations and support protected visibility.

All imported symbols can be marked with the default visibility and all
exported symbols, in both executables and shared libraries, can be
marked with the protected visibility. These require code changes.

--
H.J.
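The kind of source change described above could look roughly like the
following. This is only a sketch under assumptions: the library, the macro
names MYLIB_BUILDING and MYLIB_API, and the symbols mylib_counter and
mylib_next are all hypothetical, and the file pretends to be the library's
own translation unit so that it compiles on its own.

```c
/* Hypothetical names throughout: MYLIB_BUILDING, MYLIB_API, mylib_counter
   and mylib_next are illustrative, not taken from any real library.  */
#define MYLIB_BUILDING 1 /* pretend this file is part of the library */

#if defined(MYLIB_BUILDING)
/* Building the library: exported symbols are protected, so references
   from inside the library are known to bind to its own definitions.  */
# define MYLIB_API __attribute__((visibility("protected")))
#else
/* Using the library: imported symbols keep the default visibility.  */
# define MYLIB_API __attribute__((visibility("default")))
#endif

/* What would normally sit in the public header.  */
MYLIB_API extern int mylib_counter;
MYLIB_API int mylib_next(void);

/* What would normally sit in the library's implementation file.  The
   definitions pick up the visibility of the declarations above.  */
int mylib_counter = 0;

int mylib_next(void)
{
    return ++mylib_counter;
}
```

A user of the library would include the same header without defining
MYLIB_BUILDING and therefore see default-visibility declarations.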
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Wilco Dijkstra @ 2022-05-25 17:13 UTC
To: H.J. Lu; +Cc: Szabolcs Nagy, maskray, GNU C Library

Hi H.J.,

> All imported symbols can be marked with the default visibility and all
> exported symbols, in both executables and shared libraries, can be
> marked with the protected visibility. These require code changes.

I meant doing this automatically using an option so most code requires no
source changes. If commonly used libraries mark their exported symbols,
most code (PIC, PIE and non-PIE) could be compiled using this option and
produce efficient code without copy relocations and only using GOT
indirections when needed (ie. accessing an exported symbol in another .so).
It would also imply -fno-semantic-interposition for non-exported symbols.
Currently there is no way to achieve this using options (eg. -fvisibility only
affects definitions), and LLVM and GCC disagree on many details.

Cheers,
Wilco
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Florian Weimer @ 2022-05-25 18:21 UTC
To: Wilco Dijkstra via Libc-alpha; +Cc: H.J. Lu, Wilco Dijkstra, Szabolcs Nagy

* Wilco Dijkstra via Libc-alpha:

> Hi H.J.,
>
>> All imported symbols can be marked with the default visibility and all
>> exported symbols, in both executables and shared libraries, can be
>> marked with the protected visibility. These require code changes.
>
> I meant doing this automatically using an option so most code requires no
> source changes. If commonly used libraries mark their exported symbols,
> most code (PIC, PIE and non-PIE) could be compiled using this option and
> produce efficient code without copy relocations and only using GOT
> indirections when needed (ie. accessing an exported symbol in another .so).
> It would also imply -fno-semantic-interposition for non-exported symbols.
> Currently there is no way to achieve this using options (eg. -fvisibility only
> affects definitions), and LLVM and GCC disagree on many details.

What about -flto (with a linker plugin)?

Thanks,
Florian
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: H.J. Lu @ 2022-05-25 20:44 UTC
To: Florian Weimer; +Cc: Wilco Dijkstra via Libc-alpha, Wilco Dijkstra, Szabolcs Nagy

On Wed, May 25, 2022 at 11:21 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Wilco Dijkstra via Libc-alpha:
>
> > Hi H.J.,
> >
> >> All imported symbols can be marked with the default visibility and all
> >> exported symbols, in both executables and shared libraries, can be
> >> marked with the protected visibility. These require code changes.
> >
> > I meant doing this automatically using an option so most code requires no
> > source changes. If commonly used libraries mark their exported symbols,

"export" describes what happens to a symbol at library build time. An
exported symbol can be imported by a library user. Without LTO, compilers
can't tell whether an external symbol will be imported or exported.

> > most code (PIC, PIE and non-PIE) could be compiled using this option and
> > produce efficient code without copy relocations and only using GOT
> > indirections when needed (ie. accessing an exported symbol in another .so).
> > It would also imply -fno-semantic-interposition for non-exported symbols.
> > Currently there is no way to achieve this using options (eg. -fvisibility only
> > affects definitions), and LLVM and GCC disagree on many details.
>
> What about -flto (with a linker plugin)?
>

LTO certainly works. An undefined symbol will be imported from a shared
library.

--
H.J.
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Wilco Dijkstra @ 2022-05-26 19:17 UTC
To: H.J. Lu, Florian Weimer; +Cc: Wilco Dijkstra via Libc-alpha, Szabolcs Nagy

Hi,

>> What about -flto (with a linker plugin)?
>
> LTO certainly works. An undefined symbol will be imported from a shared
> library.

LTO does indeed allow you to determine which symbols are imported, so the
compiler can use a GOT indirection without requiring explicit import
annotations.

However LTO does not solve all the code quality issues of -fPIC: there is
still no inlining, internal calls go indirectly via the PLT, and non-static
globals use a GOT indirection.

The goal here is to have a new option that generates efficient code by
default and just works for most projects.

Cheers,
Wilco
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Florian Weimer @ 2022-05-26 19:25 UTC
To: Wilco Dijkstra; +Cc: H.J. Lu, Wilco Dijkstra via Libc-alpha, Szabolcs Nagy

* Wilco Dijkstra:

> Hi,
>
>>> What about -flto (with a linker plugin)?
>>
>> LTO certainly works. An undefined symbol will be imported from a shared
>> library.
>
> LTO allows you to determine which symbols are imported indeed, so the
> compiler can use a GOT indirection without requiring explicit import
> annotations.
>
> However LTO does not solve all the code quality issues of FPIC - there
> is still no inlining, internal calls indirect via the PLT and
> non-static globals use a GOT indirection.

Could you share a small example that exhibits these shortcomings?

Thanks,
Florian
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Wilco Dijkstra @ 2022-05-26 20:03 UTC
To: Florian Weimer; +Cc: H.J. Lu, Wilco Dijkstra via Libc-alpha, Szabolcs Nagy

Hi Florian,

Sure, something basic like this shows the issues:

int x;
int f(void) { return ++x; }
int main(void) { return f(); }

compile with -O2 -fPIC -flto -shared:

00000000000004f0 <f@plt>:
 4f0:   b0000090        adrp    x16, 11000 <__cxa_finalize@GLIBC_2.17>
 4f4:   f9400611        ldr     x17, [x16, #8]
 4f8:   91002210        add     x16, x16, #0x8
 4fc:   d61f0220        br      x17

0000000000000510 <main>:
 510:   17fffff8        b       4f0 <f@plt>

0000000000000600 <f>:
 600:   90000081        adrp    x1, 10000 <__FRAME_END__+0xf8f8>
 604:   f947e821        ldr     x1, [x1, #4048]
 608:   b9400020        ldr     w0, [x1]
 60c:   11000400        add     w0, w0, #0x1
 610:   b9000020        str     w0, [x1]
 614:   d65f03c0        ret

So f() does not get inlined into main, it is redirected via PLT, and it uses a GOT
indirection to access the global. The underlying problem is that ELF assumes by
default that you want interposition/export for all symbols. In reality that is
almost never needed.

LLVM makes -fno-semantic-interposition the default,
which solves the PLT and inlining issue, but not the unnecessary GOT indirections.

Cheers,
Wilco
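For comparison, here is a minimal sketch of the same program with x and f
given hidden visibility by hand. Hidden symbols cannot be interposed or
exported, so under -O2 -fPIC the compiler is free to inline f and to address
x without going through the GOT. The entry point is renamed to entry (a
hypothetical name) so the fragment can be dropped into a test program; no
particular disassembly is claimed here.

```c
/* Hidden visibility: both symbols are local to the module being linked,
   so neither interposition nor export has to be assumed.  */
__attribute__((visibility("hidden"))) int x;

__attribute__((visibility("hidden"))) int f(void)
{
    return ++x;
}

/* Stands in for main() in the original example; "entry" is an
   illustrative name, not part of the thread.  */
int entry(void)
{
    return f();
}
```

This is the per-symbol, source-level version of what the proposed option
would do automatically for non-exported symbols.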
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: H.J. Lu @ 2022-05-26 21:27 UTC
To: Wilco Dijkstra; +Cc: Florian Weimer, Wilco Dijkstra via Libc-alpha, Szabolcs Nagy

On Thu, May 26, 2022 at 1:03 PM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>
> Hi Florian,
>
> Sure, something basic like this shows the issues:
>
> int x;
> int f(void) { return ++x; }
> int main(void) { return f(); }
>
> compile with -O2 -fPIC -flto -shared:

-fPIC is the issue. x can be marked as "export" with the protected
visibility. GCC 12 with -fvisibility=protected -flto
-mno-direct-extern-access works on x86-64:

0000000000001040 <main>:
    1040:   8b 05 d6 2f 00 00    mov    0x2fd6(%rip),%eax   # 401c <x>
    1046:   83 c0 01             add    $0x1,%eax
    1049:   89 05 cd 2f 00 00    mov    %eax,0x2fcd(%rip)   # 401c <x>
    104f:   c3                   ret

0000000000001110 <f>:
    1110:   8b 05 06 2f 00 00    mov    0x2f06(%rip),%eax   # 401c <x>
    1116:   83 c0 01             add    $0x1,%eax
    1119:   89 05 fd 2e 00 00    mov    %eax,0x2efd(%rip)   # 401c <x>
    111f:   c3                   ret

>
> 00000000000004f0 <f@plt>:
>  4f0:   b0000090        adrp    x16, 11000 <__cxa_finalize@GLIBC_2.17>
>  4f4:   f9400611        ldr     x17, [x16, #8]
>  4f8:   91002210        add     x16, x16, #0x8
>  4fc:   d61f0220        br      x17
>
> 0000000000000510 <main>:
>  510:   17fffff8        b       4f0 <f@plt>
>
> 0000000000000600 <f>:
>  600:   90000081        adrp    x1, 10000 <__FRAME_END__+0xf8f8>
>  604:   f947e821        ldr     x1, [x1, #4048]
>  608:   b9400020        ldr     w0, [x1]
>  60c:   11000400        add     w0, w0, #0x1
>  610:   b9000020        str     w0, [x1]
>  614:   d65f03c0        ret
>
> So f() does not get inlined into main, it is redirected via PLT, and it uses a GOT
> indirection to access the global. The underlying problem is that ELF assumes by
> default that you want interposition/export for all symbols. In reality that is
> almost never needed.
>
> LLVM makes -fno-semantic-interposition the default,
> which solves the PLT and inlining issue, but not the unnecessary GOT indirections.
>
> Cheers,
> Wilco

--
H.J.
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Florian Weimer @ 2022-05-27 12:43 UTC
To: Wilco Dijkstra; +Cc: H.J. Lu, Wilco Dijkstra via Libc-alpha, Szabolcs Nagy

* Wilco Dijkstra:

> Hi Florian,
>
> Sure, something basic like this shows the issues:
>
> int x;
> int f(void) { return ++x; }
> int main(void) { return f(); }
>
> compile with -O2 -fPIC -flto -shared:
>
> 00000000000004f0 <f@plt>:
>  4f0:   b0000090        adrp    x16, 11000 <__cxa_finalize@GLIBC_2.17>
>  4f4:   f9400611        ldr     x17, [x16, #8]
>  4f8:   91002210        add     x16, x16, #0x8
>  4fc:   d61f0220        br      x17
>
> 0000000000000510 <main>:
>  510:   17fffff8        b       4f0 <f@plt>
>
> 0000000000000600 <f>:
>  600:   90000081        adrp    x1, 10000 <__FRAME_END__+0xf8f8>
>  604:   f947e821        ldr     x1, [x1, #4048]
>  608:   b9400020        ldr     w0, [x1]
>  60c:   11000400        add     w0, w0, #0x1
>  610:   b9000020        str     w0, [x1]
>  614:   d65f03c0        ret

Can you link with a version script?

  { local: *; global: main; };

Exporting symbols inhibits some optimizations even if interposition is
assumed not to happen because the behavior of public entry points needs
to be preserved. With a version script, the set of entry points can be
greatly reduced, enabling further optimizations.

Thanks,
Florian
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: H.J. Lu @ 2022-05-31 2:03 UTC
To: Florian Weimer; +Cc: Wilco Dijkstra, Wilco Dijkstra via Libc-alpha, Szabolcs Nagy

On Fri, May 27, 2022 at 5:43 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Wilco Dijkstra:
>
> > Hi Florian,
> >
> > Sure, something basic like this shows the issues:
> >
> > int x;
> > int f(void) { return ++x; }
> > int main(void) { return f(); }
> >
> > compile with -O2 -fPIC -flto -shared:
> >
> > 00000000000004f0 <f@plt>:
> >  4f0:   b0000090        adrp    x16, 11000 <__cxa_finalize@GLIBC_2.17>
> >  4f4:   f9400611        ldr     x17, [x16, #8]
> >  4f8:   91002210        add     x16, x16, #0x8
> >  4fc:   d61f0220        br      x17
> >
> > 0000000000000510 <main>:
> >  510:   17fffff8        b       4f0 <f@plt>
> >
> > 0000000000000600 <f>:
> >  600:   90000081        adrp    x1, 10000 <__FRAME_END__+0xf8f8>
> >  604:   f947e821        ldr     x1, [x1, #4048]
> >  608:   b9400020        ldr     w0, [x1]
> >  60c:   11000400        add     w0, w0, #0x1
> >  610:   b9000020        str     w0, [x1]
> >  614:   d65f03c0        ret
>
> Can you link with a version script?
>
>   { local: *; global: main; };
>
> Exporting symbols inhibits some optimizations even if interposition is
> assumed not to happen because the behavior of public entry points needs
> to be preserved. With a version script, the set of entry points can be
> greatly reduced, enabling further optimizations.
>
> Thanks,
> Florian
>

Should -fno-semantic-interposition imply -fvisibility=protected
-mno-direct-extern-access?

--
H.J.
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Fangrui Song @ 2022-05-31 7:49 UTC
To: H.J. Lu; +Cc: Florian Weimer, Szabolcs Nagy, libc-alpha, Wilco Dijkstra

On 2022-05-30, H.J. Lu via Libc-alpha wrote:
>On Fri, May 27, 2022 at 5:43 AM Florian Weimer <fweimer@redhat.com> wrote:
>>
>> * Wilco Dijkstra:
>>
>> > Hi Florian,
>> >
>> > Sure, something basic like this shows the issues:
>> >
>> > int x;
>> > int f(void) { return ++x; }
>> > int main(void) { return f(); }
>> >
>> > compile with -O2 -fPIC -flto -shared:
>> >
>> > 00000000000004f0 <f@plt>:
>> >  4f0:   b0000090        adrp    x16, 11000 <__cxa_finalize@GLIBC_2.17>
>> >  4f4:   f9400611        ldr     x17, [x16, #8]
>> >  4f8:   91002210        add     x16, x16, #0x8
>> >  4fc:   d61f0220        br      x17
>> >
>> > 0000000000000510 <main>:
>> >  510:   17fffff8        b       4f0 <f@plt>
>> >
>> > 0000000000000600 <f>:
>> >  600:   90000081        adrp    x1, 10000 <__FRAME_END__+0xf8f8>
>> >  604:   f947e821        ldr     x1, [x1, #4048]
>> >  608:   b9400020        ldr     w0, [x1]
>> >  60c:   11000400        add     w0, w0, #0x1
>> >  610:   b9000020        str     w0, [x1]
>> >  614:   d65f03c0        ret
>>
>> Can you link with a version script?
>>
>>   { local: *; global: main; };
>>
>> Exporting symbols inhibits some optimizations even if interposition is
>> assumed not to happen because the behavior of public entry points needs
>> to be preserved. With a version script, the set of entry points can be
>> greatly reduced, enabling further optimizations.
>>
>> Thanks,
>> Florian
>>
>
>Should -fno-semantic-interposition imply -fvisibility=protected
>-mno-direct-extern-access?

Unfortunately, no. -fvisibility=protected imposes a stricter requirement
than -fno-semantic-interposition. See
https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected#protected-function-symbols-and-canonical-plt-entries

__attribute__((visibility("protected"))) void *foo() {
  return (void *)foo;
}

GNU ld does not support this on several architectures, including the common x86-32/x86-64/aarch64.

% gcc -m32 -fpic -shared -fuse-ld=bfd b.c
/usr/bin/ld.bfd: /tmp/cc3Ay0Gh.o: relocation R_X86_64_PC32 against protected symbol `foo' can not be used when making a shared object
/usr/bin/ld.bfd: final link failed: bad value
collect2: error: ld returned 1 exit status

% aarch64-linux-gnu-gcc -fpic -shared -fuse-ld=bfd b.c
/usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/bin/ld.bfd: /tmp/ccJf24eh.o: relocation R_AARCH64_ADR_PREL_PG_HI21 against symbol `foo' which may bind externally can not be used when making a shared object; recompile with -fPIC
/tmp/ccJf24eh.o: in function `foo':
b.c:(.text+0x0): dangerous relocation: unsupported relocation
collect2: error: ld returned 1 exit status
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Wilco Dijkstra @ 2022-05-31 9:42 UTC
To: Fangrui Song, H.J. Lu; +Cc: Florian Weimer, Szabolcs Nagy, libc-alpha

Hi Fangrui,

>>Should -fno-semantic-interposition imply -fvisibility=protected
>>-mno-direct-extern-access?
>
> Unfortunately, no. -fvisibility=protected imposes stricter requirement
> than -fno-semantic-interposition. See
> https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected#protected-function-symbols-and-canonical-plt-entries
>
> __attribute__((visibility("protected"))) void *foo() {
>   return (void *)foo;
> }
>
> GNU ld does not support this on several architectures, including the common x86-32/x86-64/aarch64.

So you're saying this should only give an error if the executable takes the
address of foo without a GOT indirection? In principle compilers could use a
GOT indirection when taking the address of function symbols to make this work
(either in the executable or for protected function symbols in shared objects).
Note in the executable case, the linker can relax the GOT relocations so it would
have almost zero overhead.

Wilco
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: H.J. Lu @ 2022-05-31 13:47 UTC
To: Fangrui Song; +Cc: Florian Weimer, Szabolcs Nagy, GNU C Library, Wilco Dijkstra

On Tue, May 31, 2022 at 12:49 AM Fangrui Song <maskray@google.com> wrote:
>
> On 2022-05-30, H.J. Lu via Libc-alpha wrote:
> >On Fri, May 27, 2022 at 5:43 AM Florian Weimer <fweimer@redhat.com> wrote:
> >>
> >> * Wilco Dijkstra:
> >>
> >> > Hi Florian,
> >> >
> >> > Sure, something basic like this shows the issues:
> >> >
> >> > int x;
> >> > int f(void) { return ++x; }
> >> > int main(void) { return f(); }
> >> >
> >> > compile with -O2 -fPIC -flto -shared:
> >> >
> >> > 00000000000004f0 <f@plt>:
> >> >  4f0:   b0000090        adrp    x16, 11000 <__cxa_finalize@GLIBC_2.17>
> >> >  4f4:   f9400611        ldr     x17, [x16, #8]
> >> >  4f8:   91002210        add     x16, x16, #0x8
> >> >  4fc:   d61f0220        br      x17
> >> >
> >> > 0000000000000510 <main>:
> >> >  510:   17fffff8        b       4f0 <f@plt>
> >> >
> >> > 0000000000000600 <f>:
> >> >  600:   90000081        adrp    x1, 10000 <__FRAME_END__+0xf8f8>
> >> >  604:   f947e821        ldr     x1, [x1, #4048]
> >> >  608:   b9400020        ldr     w0, [x1]
> >> >  60c:   11000400        add     w0, w0, #0x1
> >> >  610:   b9000020        str     w0, [x1]
> >> >  614:   d65f03c0        ret
> >>
> >> Can you link with a version script?
> >>
> >>   { local: *; global: main; };
> >>
> >> Exporting symbols inhibits some optimizations even if interposition is
> >> assumed not to happen because the behavior of public entry points needs
> >> to be preserved. With a version script, the set of entry points can be
> >> greatly reduced, enabling further optimizations.
> >>
> >> Thanks,
> >> Florian
> >>
> >
> >Should -fno-semantic-interposition imply -fvisibility=protected
> >-mno-direct-extern-access?
>
> Unfortunately, no. -fvisibility=protected imposes stricter requirement
> than -fno-semantic-interposition. See
> https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected#protected-function-symbols-and-canonical-plt-entries
>
> __attribute__((visibility("protected"))) void *foo() {
>   return (void *)foo;
> }
>
> GNU ld does not support this on several architectures, including the common x86-32/x86-64/aarch64.
>
> % gcc -m32 -fpic -shared -fuse-ld=bfd b.c
> /usr/bin/ld.bfd: /tmp/cc3Ay0Gh.o: relocation R_X86_64_PC32 against protected symbol `foo' can not be used when making a shared object
> /usr/bin/ld.bfd: final link failed: bad value
> collect2: error: ld returned 1 exit status

Adding -mno-direct-extern-access works with binutils 2.39 or the 2.38 branch.

> % aarch64-linux-gnu-gcc -fpic -shared -fuse-ld=bfd b.c
> /usr/lib/gcc-cross/aarch64-linux-gnu/11/../../../../aarch64-linux-gnu/bin/ld.bfd: /tmp/ccJf24eh.o: relocation R_AARCH64_ADR_PREL_PG_HI21 against symbol `foo' which may bind externally can not be used when making a shared object; recompile with -fPIC
> /tmp/ccJf24eh.o: in function `foo':
> b.c:(.text+0x0): dangerous relocation: unsupported relocation
> collect2: error: ld returned 1 exit status
>

--
H.J.
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Fangrui Song @ 2022-05-31 7:42 UTC
To: Wilco Dijkstra; +Cc: Florian Weimer, Szabolcs Nagy, Wilco Dijkstra via Libc-alpha

On 2022-05-26, Wilco Dijkstra via Libc-alpha wrote:
>Hi Florian,
>
>Sure, something basic like this shows the issues:
>
>int x;
>int f(void) { return ++x; }
>int main(void) { return f(); }
>
>compile with -O2 -fPIC -flto -shared:
>
>00000000000004f0 <f@plt>:
> 4f0:   b0000090        adrp    x16, 11000 <__cxa_finalize@GLIBC_2.17>
> 4f4:   f9400611        ldr     x17, [x16, #8]
> 4f8:   91002210        add     x16, x16, #0x8
> 4fc:   d61f0220        br      x17
>
>0000000000000510 <main>:
> 510:   17fffff8        b       4f0 <f@plt>
>
>0000000000000600 <f>:
> 600:   90000081        adrp    x1, 10000 <__FRAME_END__+0xf8f8>
> 604:   f947e821        ldr     x1, [x1, #4048]
> 608:   b9400020        ldr     w0, [x1]
> 60c:   11000400        add     w0, w0, #0x1
> 610:   b9000020        str     w0, [x1]
> 614:   d65f03c0        ret
>
>So f() does not get inlined into main, it is redirected via PLT, and it uses a GOT
>indirection to access the global. The underlying problem is that ELF assumes by
>default that you want interposition/export for all symbols. In reality that is
>almost never needed.
>
>LLVM makes -fno-semantic-interposition the default,
>which solves the PLT and inlining issue, but not the unnecessary GOT indirections.

LLVM/Clang's behavior is more complex; stating that
"-fno-semantic-interposition is the default" is incorrect. See
https://maskray.me/blog/2021-05-09-fno-semantic-interposition#clang--fno-semantic-interposition

https://fedoraproject.org/wiki/Changes/PythonNoSemanticInterpositionSpeedup#Detailed_Description
used to be incorrect, and I sent them the updates we can see today.
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: maskray @ 2022-05-25 20:10 UTC
To: Wilco Dijkstra; +Cc: H.J. Lu, Szabolcs Nagy, GNU C Library

On 2022-05-25, Wilco Dijkstra wrote:
>Hi H.J.,
>
>> All imported symbols can be marked with the default visibility and all
>> exported symbols, in both executables and shared libraries, can be
>> marked with the protected visibility. These require code changes.
>
>I meant doing this automatically using an option so most code requires no
>source changes. If commonly used libraries mark their exported symbols,
>most code (PIC, PIE and non-PIE) could be compiled using this option and
>produce efficient code without copy relocations and only using GOT
>indirections when needed (ie. accessing an exported symbol in another .so).
>It would also imply -fno-semantic-interposition for non-exported symbols.
>Currently there is no way to achieve this using options (eg. -fvisibility only
>affects definitions), and LLVM and GCC disagree on many details.
>
>Cheers,
>Wilco

Working out-of-the-box with -fpic and -fpie is trivial.
Working out-of-the-box with -fno-pic is not, as some code may rely on the
traditional codegen behavior that absolute resolutions are used.
I think the situation is worse on x86-32.

For protected data symbols, it's not a problem:

(a) Exported data symbols from shared objects are rare. They are
    discouraged, as exported data symbols are part of the ABI and make DSO
    upgrading difficult.
(b) Making exported data symbols protected is even rarer.
(c) In addition, many aarch64 OSes (Android, Chrome OS, FreeBSD, some
    musl+llvm based Linux distributions, etc) use lld, which does not allow
    copy relocation on protected data symbols. I haven't heard of them
    finding any breakage due to the lld linker error.
(d) -fno-pic is out of favor now with the adoption of default -fpie.
* [PATCH 0/3] Simplify ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA and revert aarch64/arm's extern protected data handling
From: Fangrui Song @ 2022-05-01 6:06 UTC
To: libc-alpha, Adhemerval Zanella, Szabolcs Nagy

Say both a.so and b.so define protected var and the executable copy
relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange
semantics: a.so accesses the copy in the executable while b.so accesses
its own. This behavior requires that (a) the compiler emits
GOT-generating relocations and (b) the linker produces GLOB_DAT instead of
RELATIVE.

Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT
will bind to the executable (normal behavior).

For aarch64/arm it makes sense to restore the original behavior and not
pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very
unlikely to be used by anyone.

* Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
  no GOT-generating relocation in the first place.
* gold and lld reject copy relocation on a STV_PROTECTED symbol.
* Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
  GOT-generating relocation when accessing a default visibility
  external symbol, which avoids copy relocation.

Fangrui Song (3):
  elf: Remove ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA check for
    non-DL_EXTERN_PROTECTED_DATA ports
  Revert "[AArch64][BZ #17711] Fix extern protected data handling"
  Revert "[ARM][BZ #17711] Fix extern protected data handling"

 elf/dl-lookup.c              | 46 ++++++++++++------------------------
 sysdeps/aarch64/dl-machine.h | 13 +++++-----
 sysdeps/aarch64/dl-sysdep.h  |  2 --
 sysdeps/arm/dl-machine.h     | 10 +++-----
 sysdeps/arm/dl-sysdep.h      |  2 --
 5 files changed, 24 insertions(+), 49 deletions(-)

--
2.36.0.464.gb9c8b46e94-goog
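The property at stake in the scenario above is that every module agrees on
the address of var. Below is a minimal single-file sketch of that invariant.
It is an illustration only: in the real case the definition and
lib_view_of_var would live in a shared object and exe_view_of_var in the
executable, and both function names are hypothetical. Built as one program
the two addresses trivially match; a copy relocation on a protected symbol
is what can make them differ once the code is split across modules.

```c
/* In a real build this definition would live in a shared object such as
   a.so; keeping everything in one file is a simplification.  */
__attribute__((visibility("protected"))) int var = 42;

/* The address as seen by code inside the defining module.  */
int *lib_view_of_var(void)
{
    return &var;
}

/* The address as seen through an ordinary extern declaration, which is
   how an executable would normally refer to the symbol.  */
int *exe_view_of_var(void)
{
    extern int var;
    return &var;
}
```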
* [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Fangrui Song @ 2022-05-01  6:06 UTC
To: libc-alpha, Adhemerval Zanella, Szabolcs Nagy

This reverts commit 0910702c4d2cf9e8302b35c9519548726e1ac489.

Say both a.so and b.so define protected var and the executable copy
relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange
semantics: a.so accesses the copy in the executable while b.so accesses
its own. This behavior requires that (a) the compiler emits
GOT-generating relocations (b) the linker produces GLOB_DAT instead of
RELATIVE.

Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT
will bind to the executable (normal behavior).

For aarch64 it makes sense to restore the original behavior and not
pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very
unlikely to be used by anyone.

* Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
  no GOT-generating relocation in the first place.
* gold and lld reject copy relocation on a STV_PROTECTED symbol.
* Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
  GOT-generating relocation when accessing a default visibility
  external symbol which avoids copy relocation.
---
 sysdeps/aarch64/dl-machine.h | 13 ++++++-------
 sysdeps/aarch64/dl-sysdep.h  |  2 --
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/sysdeps/aarch64/dl-machine.h b/sysdeps/aarch64/dl-machine.h
index b40050a981..530952a736 100644
--- a/sysdeps/aarch64/dl-machine.h
+++ b/sysdeps/aarch64/dl-machine.h
@@ -182,13 +182,12 @@ _dl_start_user: \n\
 ");
 
 #define elf_machine_type_class(type)                                   \
-  ((((type) == AARCH64_R(JUMP_SLOT)                                    \
-     || (type) == AARCH64_R(TLS_DTPMOD)                                \
-     || (type) == AARCH64_R(TLS_DTPREL)                                \
-     || (type) == AARCH64_R(TLS_TPREL)                                 \
-     || (type) == AARCH64_R(TLSDESC)) * ELF_RTYPE_CLASS_PLT)           \
-   | (((type) == AARCH64_R(COPY)) * ELF_RTYPE_CLASS_COPY)              \
-   | (((type) == AARCH64_R(GLOB_DAT)) * ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA))
+  ((((type) == R_AARCH64_JUMP_SLOT ||                                  \
+     (type) == R_AARCH64_TLS_DTPMOD ||                                 \
+     (type) == R_AARCH64_TLS_DTPREL ||                                 \
+     (type) == R_AARCH64_TLS_TPREL ||                                  \
+     (type) == R_AARCH64_TLSDESC) * ELF_RTYPE_CLASS_PLT)               \
+   | (((type) == R_AARCH64_COPY) * ELF_RTYPE_CLASS_COPY))
 
 #define ELF_MACHINE_JMP_SLOT AARCH64_R(JUMP_SLOT)
 
diff --git a/sysdeps/aarch64/dl-sysdep.h b/sysdeps/aarch64/dl-sysdep.h
index 667786671c..ac69f414f3 100644
--- a/sysdeps/aarch64/dl-sysdep.h
+++ b/sysdeps/aarch64/dl-sysdep.h
@@ -21,5 +21,3 @@
 /* _dl_argv cannot be attribute_relro, because _dl_start_user
    might write into it after _dl_start returns.  */
 #define DL_ARGV_NOT_RELRO 1
-
-#define DL_EXTERN_PROTECTED_DATA
-- 
2.36.0.464.gb9c8b46e94-goog

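[Editor's sketch, not part of the posted patch] The effect of the dl-machine.h hunk can be tried outside glibc with a small stand-alone model. The relocation numbers and class bits below are written out by hand and are assumed to match <elf.h> and glibc's ldsodefs.h on a port that defines DL_EXTERN_PROTECTED_DATA; the helper names type_class_before, type_class_after and check_only_glob_dat_differs are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Relocation numbers and class bits are spelled out here so the sketch
   builds without glibc's internal headers.  They are assumed to match
   <elf.h> and ldsodefs.h; treat them as illustrative.  */
enum
{
  R_AARCH64_COPY = 1024,
  R_AARCH64_GLOB_DAT = 1025,
  R_AARCH64_JUMP_SLOT = 1026,
  R_AARCH64_TLS_DTPMOD = 1028,
  R_AARCH64_TLS_DTPREL = 1029,
  R_AARCH64_TLS_TPREL = 1030,
  R_AARCH64_TLSDESC = 1031
};

#define ELF_RTYPE_CLASS_PLT 1
#define ELF_RTYPE_CLASS_COPY 2
/* Assumed value on ports that define DL_EXTERN_PROTECTED_DATA.  */
#define ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA 4

/* Hypothetical helper: classification as in the removed (-) lines.  */
static int
type_class_before (int type)
{
  return (((type == R_AARCH64_JUMP_SLOT
            || type == R_AARCH64_TLS_DTPMOD
            || type == R_AARCH64_TLS_DTPREL
            || type == R_AARCH64_TLS_TPREL
            || type == R_AARCH64_TLSDESC) * ELF_RTYPE_CLASS_PLT)
          | ((type == R_AARCH64_COPY) * ELF_RTYPE_CLASS_COPY)
          | ((type == R_AARCH64_GLOB_DAT)
             * ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA));
}

/* Hypothetical helper: classification as in the added (+) lines.  */
static int
type_class_after (int type)
{
  return (((type == R_AARCH64_JUMP_SLOT
            || type == R_AARCH64_TLS_DTPMOD
            || type == R_AARCH64_TLS_DTPREL
            || type == R_AARCH64_TLS_TPREL
            || type == R_AARCH64_TLSDESC) * ELF_RTYPE_CLASS_PLT)
          | ((type == R_AARCH64_COPY) * ELF_RTYPE_CLASS_COPY));
}

/* Walk the dynamic relocation range used above and check that
   GLOB_DAT is the only type whose class changes.  */
static void
check_only_glob_dat_differs (void)
{
  for (int type = R_AARCH64_COPY; type <= R_AARCH64_TLSDESC; ++type)
    {
      int before = type_class_before (type);
      int after = type_class_after (type);
      printf ("type %d: before=%d after=%d\n", type, before, after);
      if (type == R_AARCH64_GLOB_DAT)
        assert (before != after);
      else
        assert (before == after);
    }
}
```

Calling check_only_glob_dat_differs () from a main function prints one line per relocation type. Under the stated assumptions, only GLOB_DAT loses its special class, which is consistent with the commit message's statement that b.so's GLOB_DAT goes back to normal binding.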
* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Szabolcs Nagy @ 2022-05-23 20:10 UTC
To: Fangrui Song; +Cc: libc-alpha, Adhemerval Zanella

The 04/30/2022 23:06, Fangrui Song wrote:
> This reverts commit 0910702c4d2cf9e8302b35c9519548726e1ac489.
>
> Say both a.so and b.so define protected var and the executable copy
> relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange
> semantics: a.so accesses the copy in the executable while b.so accesses
> its own. This behavior requires that (a) the compiler emits
> GOT-generating relocations (b) the linker produces GLOB_DAT instead of
> RELATIVE.
>
> Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT
> will bind to the executable (normal behavior).
>
> For aarch64 it makes sense to restore the original behavior and not
> pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very
> unlikely to be used by anyone.
>
> * Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
>   no GOT-generating relocation in the first place.
> * gold and lld reject copy relocation on a STV_PROTECTED symbol.
> * Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
>   GOT-generating relocation when accessing a default visibility
>   external symbol which avoids copy relocation.

this looks fine. i guess bfd ld should warn/reject copy relocs too
since it won't work well with protected visibility.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

> --- a/sysdeps/aarch64/dl-sysdep.h
> +++ b/sysdeps/aarch64/dl-sysdep.h
> @@ -21,5 +21,3 @@
>  /* _dl_argv cannot be attribute_relro, because _dl_start_user
>     might write into it after _dl_start returns.  */
>  #define DL_ARGV_NOT_RELRO 1
> -
> -#define DL_EXTERN_PROTECTED_DATA

i think this file can be removed after rebase
(DL_ARGV_NOT_RELRO got removed)

* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Fangrui Song @ 2022-05-23 20:17 UTC
To: Szabolcs Nagy; +Cc: libc-alpha, Adhemerval Zanella

On 2022-05-23, Szabolcs Nagy wrote:
>The 04/30/2022 23:06, Fangrui Song wrote:
>> This reverts commit 0910702c4d2cf9e8302b35c9519548726e1ac489.
>>
>> Say both a.so and b.so define protected var and the executable copy
>> relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange
>> semantics: a.so accesses the copy in the executable while b.so accesses
>> its own. This behavior requires that (a) the compiler emits
>> GOT-generating relocations (b) the linker produces GLOB_DAT instead of
>> RELATIVE.
>>
>> Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT
>> will bind to the executable (normal behavior).
>>
>> For aarch64 it makes sense to restore the original behavior and not
>> pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very
>> unlikely to be used by anyone.
>>
>> * Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
>>   no GOT-generating relocation in the first place.
>> * gold and lld reject copy relocation on a STV_PROTECTED symbol.
>> * Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
>>   GOT-generating relocation when accessing a default visibility
>>   external symbol which avoids copy relocation.
>
>this looks fine. i guess bfd ld should warn/reject copy relocs too
>since it won't work well with protected visibility.
>
>Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

Thanks for the review. I'll check GNU ld's aarch64 port if I get time.

>> --- a/sysdeps/aarch64/dl-sysdep.h
>> +++ b/sysdeps/aarch64/dl-sysdep.h
>> @@ -21,5 +21,3 @@
>>  /* _dl_argv cannot be attribute_relro, because _dl_start_user
>>     might write into it after _dl_start returns.  */
>>  #define DL_ARGV_NOT_RELRO 1
>> -
>> -#define DL_EXTERN_PROTECTED_DATA
>
>i think this file can be removed after rebase
>(DL_ARGV_NOT_RELRO got removed)

Ack. Will remove this file.

* Re: [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling"
From: Fangrui Song @ 2022-05-24  5:13 UTC
To: Szabolcs Nagy; +Cc: libc-alpha, Adhemerval Zanella

On 2022-05-23, Fangrui Song wrote:
>On 2022-05-23, Szabolcs Nagy wrote:
>>The 04/30/2022 23:06, Fangrui Song wrote:
>>>This reverts commit 0910702c4d2cf9e8302b35c9519548726e1ac489.
>>>
>>>Say both a.so and b.so define protected var and the executable copy
>>>relocates var. ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA has strange
>>>semantics: a.so accesses the copy in the executable while b.so accesses
>>>its own. This behavior requires that (a) the compiler emits
>>>GOT-generating relocations (b) the linker produces GLOB_DAT instead of
>>>RELATIVE.
>>>
>>>Without the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA code, b.so's GLOB_DAT
>>>will bind to the executable (normal behavior).
>>>
>>>For aarch64 it makes sense to restore the original behavior and not
>>>pay the ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA cost. The behavior is very
>>>unlikely to be used by anyone.
>>>
>>>* Clang code generation treats STV_PROTECTED the same way as STV_HIDDEN:
>>>  no GOT-generating relocation in the first place.
>>>* gold and lld reject copy relocation on a STV_PROTECTED symbol.
>>>* Nowadays -fpie/-fpic modes are popular. GCC/Clang's codegen uses
>>>  GOT-generating relocation when accessing a default visibility
>>>  external symbol which avoids copy relocation.
>>
>>this looks fine. i guess bfd ld should warn/reject copy relocs too
>>since it won't work well with protected visibility.
>>
>>Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
>
>Thanks for the review. I'll check GNU ld's aarch64 port if I get time.

I sent a GNU ld patch for review:
https://sourceware.org/pipermail/binutils/2022-May/120970.html
("[PATCH] aarch64: Disallow copy relocations on protected data")

end of thread, other threads:[~2022-05-31 13:48 UTC | newest]

Thread overview: 21+ messages
2022-05-24 13:46 [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling" Wilco Dijkstra
2022-05-24 17:28 ` maskray
2022-05-24 21:58 ` H.J. Lu
2022-05-25 17:13 ` Wilco Dijkstra
2022-05-25 18:21 ` Florian Weimer
2022-05-25 20:44 ` H.J. Lu
2022-05-26 19:17 ` Wilco Dijkstra
2022-05-26 19:25 ` Florian Weimer
2022-05-26 20:03 ` Wilco Dijkstra
2022-05-26 21:27 ` H.J. Lu
2022-05-27 12:43 ` Florian Weimer
2022-05-31  2:03 ` H.J. Lu
2022-05-31  7:49 ` Fangrui Song
2022-05-31  9:42 ` Wilco Dijkstra
2022-05-31 13:47 ` H.J. Lu
2022-05-31  7:42 ` Fangrui Song
2022-05-25 20:10 ` maskray
-- strict thread matches above, loose matches on Subject: below --
2022-05-01  6:06 [PATCH 0/3] Simplify ELF_RTYPE_CLASS_EXTERN_PROTECTED_DATA and revert aarch64/arm's extern protected data handling Fangrui Song
2022-05-01  6:06 ` [PATCH 2/3] Revert "[AArch64][BZ #17711] Fix extern protected data handling" Fangrui Song
2022-05-23 20:10 ` Szabolcs Nagy
2022-05-23 20:17 ` Fangrui Song
2022-05-24  5:13 ` Fangrui Song