* aarch64 TLS optimizations?
From: Tom Horsley @ 2019-05-17 13:51 UTC
To: gcc

I'm trying (for reasons too complex to go into) to
locate the TLS offset of the tcache_shutting_down
variable from malloc in the ubuntu provided
glibc on aarch64 ubuntu 18.04.

Various "normal" TLS variables appear to operate
much like x86_64 with a GOT table entry where the
TLS offset of the variable gets stashed.

But in the ubuntu glibc there is no GOT entry for
that variable, and disassembly of the code shows
that it seems to "just know" the offset to use.

Is there some kind of magic TLS optimization that
can happen for certain variables on aarch64? I'm trying
to understand how it could know the offset like
it appears to do in the code.

* Re: aarch64 TLS optimizations?
From: Andrew Haley @ 2019-05-17 14:59 UTC
To: Tom Horsley, gcc

On 5/17/19 2:51 PM, Tom Horsley wrote:
> I'm trying (for reasons too complex to go into) to
> locate the TLS offset of the tcache_shutting_down
> variable from malloc in the ubuntu provided
> glibc on aarch64 ubuntu 18.04.
>
> Various "normal" TLS variables appear to operate
> much like x86_64 with a GOT table entry where the
> TLS offset of the variable gets stashed.
>
> But in the ubuntu glibc there is no GOT entry for
> that variable, and disassembly of the code shows
> that it seems to "just know" the offset to use.
>
> Is there some kind of magic TLS optimization that
> can happen for certain variables on aarch64? I'm trying
> to understand how it could know the offset like
> it appears to do in the code.

https://www.fsfla.org/~lxoliva/writeups/TLS/paper-lk2006.pdf

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

* Re: aarch64 TLS optimizations?
From: Szabolcs Nagy @ 2019-05-20 15:43 UTC
To: Tom Horsley, gcc; +Cc: nd

On 17/05/2019 14:51, Tom Horsley wrote:
> I'm trying (for reasons too complex to go into) to
> locate the TLS offset of the tcache_shutting_down
> variable from malloc in the ubuntu provided
> glibc on aarch64 ubuntu 18.04.
>
> Various "normal" TLS variables appear to operate
> much like x86_64 with a GOT table entry where the
> TLS offset of the variable gets stashed.

this is more of a glibc question than a gcc one
(i.e. libc-help list would be better).

tls in glibc uses the initial-exec tls access model
(tls object is at a fixed offset from tp across threads),
which requires a GOT entry for the offset; that entry is
set up via a R_*_TPREL dynamic reloc at startup time.

(note: if a symbol is internal to the module its TPREL
reloc is not tied to a symbol, it only has an addend
for the offset within the module)

> But in the ubuntu glibc there is no GOT entry for
> that variable, and disassembly of the code shows
> that it seems to "just know" the offset to use.

i see adrp+ldr sequences that access GOT entries.
e.g. in the objdump of libc.so.6:

    00000000000771d0 <__libc_malloc@@GLIBC_2.17>:
    ...
       77400:  f00006c0   adrp  x0, 152000 <sys_sigabbrev@@GLIBC_2.17+0x278>
       77404:  f9470c00   ldr   x0, [x0, #3608]
       77408:  d53bd041   mrs   x1, tpidr_el0

you can verify that 0x152000 + 3608 == 0x152e18 is
indeed a GOT entry (falls into .got) and there is a

    0000000000152e18 R_AARCH64_TLS_TPREL64  *ABS*+0x0000000000000010

dynamic relocation for that entry as expected.
(but i don't know which symbol this entry is for,
only that the symbol must be a local tls sym)

> Is there some kind of magic TLS optimization that
> can happen for certain variables on aarch64? I'm trying
> to understand how it could know the offset like
> it appears to do in the code.

there is no magic.

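The address arithmetic in the message above can be checked directly from the two instruction words in the objdump listing. The following is a minimal sketch, assuming the standard A64 encodings for ADRP and for the 64-bit LDR with an unsigned immediate offset; the helper names `decode_adrp` and `decode_ldr_x_uimm` are made up for this illustration and are not part of any toolchain.

```python
# Sketch: decode the two instruction words quoted from the libc.so.6 objdump
# and recompute the GOT entry address they reference.
# Helper names are hypothetical; field layouts are the A64 ADRP and
# LDR (immediate, unsigned offset, 64-bit) encodings.

def decode_adrp(word, pc):
    """Return (destination register, target page address) for an ADRP word."""
    assert (word >> 31) & 1 == 1 and (word >> 24) & 0x1F == 0b10000, "not ADRP"
    immlo = (word >> 29) & 0x3
    immhi = (word >> 5) & 0x7FFFF
    imm = (immhi << 2) | immlo           # 21-bit signed count of 4 KiB pages
    if imm & (1 << 20):                  # sign-extend negative page counts
        imm -= 1 << 21
    page = (pc & ~0xFFF) + (imm << 12)   # relative to the page of the PC
    return word & 0x1F, page


def decode_ldr_x_uimm(word):
    """Return (rt, rn, byte offset) for LDR Xt, [Xn, #imm]."""
    assert (word >> 22) == 0b1111100101, "not a 64-bit LDR with unsigned offset"
    imm12 = (word >> 10) & 0xFFF
    return word & 0x1F, (word >> 5) & 0x1F, imm12 * 8   # imm12 is scaled by 8


rd, page = decode_adrp(0xF00006C0, pc=0x77400)
rt, rn, offset = decode_ldr_x_uimm(0xF9470C00)
got_entry = page + offset

print(f"adrp x{rd}, {page:#x}")
print(f"ldr  x{rt}, [x{rn}, #{offset}]")
print(f"GOT entry at {got_entry:#x}")
```

Run as written, this reproduces the page 0x152000, the offset 3608 and the GOT entry address 0x152e18 quoted in the message. It only confirms where the load reads from; which TLS symbol the entry belongs to still has to come from the relocation and debug info.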
* Re: aarch64 TLS optimizations?
From: Tom Horsley @ 2019-05-20 15:59 UTC
To: Szabolcs Nagy; +Cc: gcc, nd

On Mon, 20 May 2019 15:43:53 +0000
Szabolcs Nagy wrote:

> you can verify that 0x152000 + 3608 == 0x152e18 is
> indeed a GOT entry (falls into .got) and there is a
>
> 0000000000152e18 R_AARCH64_TLS_TPREL64 *ABS*+0x0000000000000010

There are a couple of other TLS variables in malloc, and I
suspect this is one of them. Where it is actually looking
at tcache_shutting_down (verified with debug info and disassembly),
it is simply using the tpidr_el0 value still lying around
in the register from the 1st TLS reference and loading
tcache_shutting_down from an offset which appears for all the
world to simply be hard coded, no GOT reference involved.

I suppose at some point I'll be forced to understand how to build
glibc from the ubuntu source package so I can see exactly
what options and ifdefs are used and check the relocations in
the malloc.o file from before it is incorporated with libc.so

* Re: aarch64 TLS optimizations?
From: Szabolcs Nagy @ 2019-05-20 17:08 UTC
To: Tom Horsley; +Cc: nd, gcc

On 20/05/2019 16:59, Tom Horsley wrote:
> On Mon, 20 May 2019 15:43:53 +0000
> Szabolcs Nagy wrote:
>
>> you can verify that 0x152000 + 3608 == 0x152e18 is
>> indeed a GOT entry (falls into .got) and there is a
>>
>> 0000000000152e18 R_AARCH64_TLS_TPREL64 *ABS*+0x0000000000000010
>
> There are a couple of other TLS variables in malloc, and I
> suspect this is one of them, where it is actually looking
> at tcache_shutting_down (verified with debug info and disassembly),
> it is simply using the tpidr_el0 value still lying around
> in the register from the 1st TLS reference and loading
> tcache_shutting_down from an offset which appears for all the
> world to simply be hard coded, no GOT reference involved.
>
> I suppose at some point I'll be forced to understand how to build
> glibc from the ubuntu source package so I can see exactly
> what options and ifdefs are used and check the relocations in
> the malloc.o file from before it is incorporated with libc.so

in my build of malloc.os in glibc in the symtab i see

    84: 0000000000000000     0 TLS     LOCAL  DEFAULT   10 .LANCHOR3
    85: 0000000000000000     8 TLS     LOCAL  DEFAULT   10 thread_arena
    86: 0000000000000008     8 TLS     LOCAL  DEFAULT   10 tcache
    87: 0000000000000010     1 TLS     LOCAL  DEFAULT   10 tcache_shutting_down

and the R_*_TLSIE_* relocs are for .LANCHOR3 + 0,
so there will be one GOT entry for the 3 objects
and you should see

    tp + got_value + (0 or 8 or 16)

address computation to access the 3 objects.

e.g. in __malloc_arena_thread_freeres i see

    4e04:  d53bd056   mrs   x22, tpidr_el0
    4e08:  90000015   adrp  x21, 0 <_dl_tunable_set_mmap_threshold>
               4e08: R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21  .LANCHOR3
    4e0c:  f94002b5   ldr   x21, [x21]
               4e0c: R_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC  .LANCHOR3
    4e10:  a90153f3   stp   x19, x20, [sp, #16]
    4e14:  8b1502c0   add   x0, x22, x21    // x0 = tp + got_value
    4e18:  f9400414   ldr   x20, [x0, #8]   // read from tcache
    4e1c:  f9001bf7   str   x23, [sp, #48]
    4e20:  b4000234   cbz   x20, 4e64 <__malloc_arena_thread_freeres+0x6c>
    4e24:  52800021   mov   w1, #0x1        // #1
    4e28:  91010293   add   x19, x20, #0x40
    4e2c:  91090297   add   x23, x20, #0x240
    4e30:  f900041f   str   xzr, [x0, #8]   // write to tcache
    4e34:  39004001   strb  w1, [x0, #16]   // write to tcache_shutting_down

i doubt ubuntu changed this, but if the offset is a fixed
const in the binary that means they moved that variable
into the glibc internal pthread struct (which is at a fixed
offset from tp).

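The "tp + got_value + (0 or 8 or 16)" computation described above can be modelled in a few lines. This is only a sketch: the per-symbol offsets are taken from the symtab listing in the message, while the `tp` and `got_value` numbers and the `tls_address` helper are hypothetical placeholders chosen for illustration, not values from a real process or a glibc interface.

```python
# Sketch: model of "tp + got_value + (0 or 8 or 16)" for the three TLS objects
# that share the .LANCHOR3 section anchor in malloc.os.
# Offsets come from the symtab listing; tp and got_value are hypothetical.

ANCHOR_OFFSETS = {
    "thread_arena": 0x00,          # 8-byte object at anchor + 0
    "tcache": 0x08,                # 8-byte object at anchor + 8
    "tcache_shutting_down": 0x10,  # 1-byte object at anchor + 16
}


def tls_address(tp, got_value, symbol):
    """Address of a TLS object reached through the shared anchor."""
    anchor = tp + got_value        # what `add x0, x22, x21` leaves in x0
    return anchor + ANCHOR_OFFSETS[symbol]


tp = 0x0000_FFFF_9000_0000         # hypothetical thread pointer (tpidr_el0)
got_value = 0x40                   # hypothetical TP-relative offset from the GOT

for name in ANCHOR_OFFSETS:
    addr = tls_address(tp, got_value, name)
    print(f"{name:22s} tp+{addr - tp:#x}")
```

The point of the model is that only the anchor needs a GOT entry. The small immediates in instructions such as `strb w1, [x0, #16]` are offsets from that anchor, which may be why an access to tcache_shutting_down can look like it uses a hard-coded offset when read in isolation.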
* Re: aarch64 TLS optimizations?
From: Tom Horsley @ 2019-05-20 17:14 UTC
To: Szabolcs Nagy; +Cc: nd, gcc

On Mon, 20 May 2019 17:07:59 +0000
Szabolcs Nagy wrote:

> and the R_*_TLSIE_* relocs are for .LANCHOR3 + 0,
> so there will be one GOT entry for the 3 objects
> and you should see

That may indeed explain what is going on. I'll have to take
a closer look at the specific ubuntu libraries I have installed
and see if I detect something similar. Thanks.