public inbox for gcc@gcc.gnu.org
* aarch64 TLS optimizations?
@ 2019-05-17 13:51 Tom Horsley
  2019-05-17 14:59 ` Andrew Haley
  2019-05-20 15:43 ` Szabolcs Nagy
  0 siblings, 2 replies; 6+ messages in thread
From: Tom Horsley @ 2019-05-17 13:51 UTC
  To: gcc

I'm trying (for reasons too complex to go into) to
locate the TLS offset of the tcache_shutting_down
variable from malloc in the ubuntu provided
glibc on aarch64 ubuntu 18.04.

Various "normal" TLS variables appear to operate
much like x86_64 with a GOT table entry where the
TLS offset of the variable gets stashed.

But in the ubuntu glibc there is no GOT entry for
that variable, and disassembly of the code shows
that it seems to "just know" the offset to use.

Is there some kind of magic TLS optimization that
can happen for certain variables on aarch64? I'm trying
to understand how it could know the offset like
it appears to do in the code.

* Re: aarch64 TLS optimizations?
  2019-05-17 13:51 aarch64 TLS optimizations? Tom Horsley
@ 2019-05-17 14:59 ` Andrew Haley
  2019-05-20 15:43 ` Szabolcs Nagy
  1 sibling, 0 replies; 6+ messages in thread
From: Andrew Haley @ 2019-05-17 14:59 UTC
  To: Tom Horsley, gcc

On 5/17/19 2:51 PM, Tom Horsley wrote:
> I'm trying (for reasons too complex to go into) to
> locate the TLS offset of the tcache_shutting_down
> variable from malloc in the ubuntu provided
> glibc on aarch64 ubuntu 18.04.
> 
> Various "normal" TLS variables appear to operate
> much like x86_64 with a GOT table entry where the
> TLS offset of the variable gets stashed.
> 
> But in the ubuntu glibc there is no GOT entry for
> that variable, and disassembly of the code shows
> that it seems to "just know" the offset to use.
> 
> Is there some kind of magic TLS optimization that
> can happen for certain variables on aarch64? I'm trying
> to understand how it could know the offset like
> it appears to do in the code.

https://www.fsfla.org/~lxoliva/writeups/TLS/paper-lk2006.pdf



-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

* Re: aarch64 TLS optimizations?
  2019-05-17 13:51 aarch64 TLS optimizations? Tom Horsley
  2019-05-17 14:59 ` Andrew Haley
@ 2019-05-20 15:43 ` Szabolcs Nagy
  2019-05-20 15:59   ` Tom Horsley
  1 sibling, 1 reply; 6+ messages in thread
From: Szabolcs Nagy @ 2019-05-20 15:43 UTC
  To: Tom Horsley, gcc; +Cc: nd

On 17/05/2019 14:51, Tom Horsley wrote:
> I'm trying (for reasons too complex to go into) to
> locate the TLS offset of the tcache_shutting_down
> variable from malloc in the ubuntu provided
> glibc on aarch64 ubuntu 18.04.
> 
> Various "normal" TLS variables appear to operate
> much like x86_64 with a GOT table entry where the
> TLS offset of the variable gets stashed.

this is more of a glibc question than a gcc one
(i.e. libc-help list would be better).

tls in glibc uses the initial-exec tls access model
(the tls object is at a fixed offset from tp across threads),
which requires a GOT entry for the offset; that entry is set
up via a R_*_TPREL dynamic reloc at startup time.

(note: if a symbol is internal to the module its TPREL
reloc is not tied to a symbol, it only has an addend
for the offset within the module)

> But in the ubuntu glibc there is no GOT entry for
> that variable, and disassembly of the code shows
> that it seems to "just know" the offset to use.

i see adrp+ldr sequences that access GOT entries.

e.g. in the objdump of libc.so.6:

00000000000771d0 <__libc_malloc@@GLIBC_2.17>:
...
   77400:       f00006c0        adrp    x0, 152000 <sys_sigabbrev@@GLIBC_2.17+0x278>
   77404:       f9470c00        ldr     x0, [x0, #3608]
   77408:       d53bd041        mrs     x1, tpidr_el0

you can verify that 0x152000 + 3608 == 0x152e18 is
indeed a GOT entry (falls into .got) and there is a

0000000000152e18 R_AARCH64_TLS_TPREL64  *ABS*+0x0000000000000010

dynamic relocation for that entry as expected.
(but i don't know which symbol this entry is for,
only that the symbol must be a local tls sym)
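
The arithmetic above can also be checked straight from the two
instruction words in the objdump listing. A minimal Python sketch,
assuming the standard A64 encodings for ADRP and for the 64-bit LDR
(immediate, unsigned offset); the helper names are hypothetical and
not part of any tool:

```python
def adrp_target(insn: int, pc: int) -> int:
    """Return the 4 KiB page address that an A64 ADRP instruction computes."""
    assert insn & 0x9F000000 == 0x90000000, "not an ADRP encoding"
    immlo = (insn >> 29) & 0x3        # bits 30:29
    immhi = (insn >> 5) & 0x7FFFF     # bits 23:5
    imm = (immhi << 2) | immlo        # 21-bit signed page count
    if imm & (1 << 20):               # sign-extend the 21-bit value
        imm -= 1 << 21
    return (pc & ~0xFFF) + (imm << 12)


def ldr_x_uimm_offset(insn: int) -> int:
    """Return the byte offset of a 64-bit LDR (immediate, unsigned offset)."""
    assert insn & 0xFFC00000 == 0xF9400000, "not a 64-bit LDR with unsigned offset"
    imm12 = (insn >> 10) & 0xFFF      # bits 21:10
    return imm12 * 8                  # scaled by the 8-byte access size


page = adrp_target(0xF00006C0, pc=0x77400)   # adrp x0, 152000
offset = ldr_x_uimm_offset(0xF9470C00)       # ldr  x0, [x0, #3608]
got_entry = page + offset
print(hex(page), offset, hex(got_entry))     # 0x152000 3608 0x152e18
```

The result matches the address of the R_AARCH64_TLS_TPREL64 entry
shown below.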

> Is there some kind of magic TLS optimization that
> can happen for certain variables on aarch64? I'm trying
> to understand how it could know the offset like
> it appears to do in the code.

there is no magic.

* Re: aarch64 TLS optimizations?
  2019-05-20 15:43 ` Szabolcs Nagy
@ 2019-05-20 15:59   ` Tom Horsley
  2019-05-20 17:08     ` Szabolcs Nagy
  0 siblings, 1 reply; 6+ messages in thread
From: Tom Horsley @ 2019-05-20 15:59 UTC
  To: Szabolcs Nagy; +Cc: gcc, nd

On Mon, 20 May 2019 15:43:53 +0000
Szabolcs Nagy wrote:

> you can verify that 0x152000 + 3608 == 0x152e18 is
> indeed a GOT entry (falls into .got) and there is a
> 
> 0000000000152e18 R_AARCH64_TLS_TPREL64  *ABS*+0x0000000000000010

There are a couple of other TLS variables in malloc, and I
suspect this is one of them. Where the code actually looks
at tcache_shutting_down (verified with debug info and disassembly),
it simply uses the tpidr_el0 value still lying around
in the register from the 1st TLS reference and loads
tcache_shutting_down from an offset which appears for all the
world to be hard coded, with no GOT reference involved.

I suppose at some point I'll be forced to understand how to build
glibc from the ubuntu source package so I can see exactly
what options and ifdefs are used and check the relocations in
the malloc.o file from before it is incorporated with libc.so

* Re: aarch64 TLS optimizations?
  2019-05-20 15:59   ` Tom Horsley
@ 2019-05-20 17:08     ` Szabolcs Nagy
  2019-05-20 17:14       ` Tom Horsley
  0 siblings, 1 reply; 6+ messages in thread
From: Szabolcs Nagy @ 2019-05-20 17:08 UTC
  To: Tom Horsley; +Cc: nd, gcc

On 20/05/2019 16:59, Tom Horsley wrote:
> On Mon, 20 May 2019 15:43:53 +0000
> Szabolcs Nagy wrote:
> 
>> you can verify that 0x152000 + 3608 == 0x152e18 is
>> indeed a GOT entry (falls into .got) and there is a
>>
>> 0000000000152e18 R_AARCH64_TLS_TPREL64  *ABS*+0x0000000000000010
> 
> There are a couple of other TLS variables in malloc, and I
> suspect this is one of them. Where the code actually looks
> at tcache_shutting_down (verified with debug info and disassembly),
> it simply uses the tpidr_el0 value still lying around
> in the register from the 1st TLS reference and loads
> tcache_shutting_down from an offset which appears for all the
> world to be hard coded, with no GOT reference involved.
> 
> I suppose at some point I'll be forced to understand how to build
> glibc from the ubuntu source package so I can see exactly
> what options and ifdefs are used and check the relocations in
> the malloc.o file from before it is incorporated with libc.so

in the symtab of malloc.os from my glibc build i see

    84: 0000000000000000     0 TLS     LOCAL  DEFAULT   10 .LANCHOR3
    85: 0000000000000000     8 TLS     LOCAL  DEFAULT   10 thread_arena
    86: 0000000000000008     8 TLS     LOCAL  DEFAULT   10 tcache
    87: 0000000000000010     1 TLS     LOCAL  DEFAULT   10 tcache_shutting_down

and the R_*_TLSIE_* relocs are for .LANCHOR3 + 0,
so there will be one GOT entry for the 3 objects
and you should see

tp + got_value + (0 or 8 or 16)

address computation to access the 3 objects.
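
The address computation above can be written out as plain
arithmetic. A minimal Python illustration, using the anchor-relative
offsets from the symtab listing; the tp and got_value numbers are
made-up placeholders, not values from a real process, and the
function name is hypothetical:

```python
# Offsets from .LANCHOR3, copied from the symtab listing above.
ANCHOR_OFFSETS = {
    "thread_arena": 0x00,
    "tcache": 0x08,
    "tcache_shutting_down": 0x10,
}


def tls_address(tp: int, got_value: int, name: str) -> int:
    """Initial-exec access: thread pointer + GOT-held offset + offset from the anchor."""
    return tp + got_value + ANCHOR_OFFSETS[name]


# Hypothetical values, for illustration only.
tp = 0xFFFF_8000_1000   # stands in for what `mrs xN, tpidr_el0` returns
got_value = 0x10        # stands in for the tp-relative offset stored in the GOT slot

for name in ANCHOR_OFFSETS:
    print(f"{name:22s} {tls_address(tp, got_value, name):#x}")
```

Only got_value comes from the GOT; the per-object 0, 8 or 16 ends up
as an immediate in the load or store instruction, which is why it
can look hard coded in a disassembly.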

e.g. in __malloc_arena_thread_freeres i see

    4e04:	d53bd056 	mrs	x22, tpidr_el0
    4e08:	90000015 	adrp	x21, 0 <_dl_tunable_set_mmap_threshold>
			4e08: R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21	.LANCHOR3
    4e0c:	f94002b5 	ldr	x21, [x21]
			4e0c: R_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC	.LANCHOR3
    4e10:	a90153f3 	stp	x19, x20, [sp, #16]
    4e14:	8b1502c0 	add	x0, x22, x21   // x0 = tp + got_value
    4e18:	f9400414 	ldr	x20, [x0, #8]  // read from tcache
    4e1c:	f9001bf7 	str	x23, [sp, #48]
    4e20:	b4000234 	cbz	x20, 4e64 <__malloc_arena_thread_freeres+0x6c>
    4e24:	52800021 	mov	w1, #0x1                   	// #1
    4e28:	91010293 	add	x19, x20, #0x40
    4e2c:	91090297 	add	x23, x20, #0x240
    4e30:	f900041f 	str	xzr, [x0, #8] // write to tcache
    4e34:	39004001 	strb	w1, [x0, #16] // write to tcache_shutting_down
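
The immediates in the three memory accesses through x0 can be
decoded from the instruction words and matched against the symtab
offsets. A rough Python sketch, assuming the A64 load/store
(unsigned immediate) encoding, where bits 31:30 give log2 of the
access size and bits 21:10 hold the scaled imm12; the function and
table names are hypothetical:

```python
def ldst_uimm_offset(insn: int) -> int:
    """Byte offset of an A64 integer load/store with an unsigned scaled immediate."""
    # bits 29:27 = 111, bit 26 (V) = 0, bits 25:24 = 01 select this encoding class
    assert insn & 0x3F000000 == 0x39000000, "not an integer load/store (unsigned immediate)"
    size = (insn >> 30) & 0x3         # log2 of the access size in bytes
    imm12 = (insn >> 10) & 0xFFF      # bits 21:10
    return imm12 << size


# Offsets from .LANCHOR3 (x0 holds tp + got_value, i.e. the anchor address).
OBJECT_AT = {0x00: "thread_arena", 0x08: "tcache", 0x10: "tcache_shutting_down"}

# Instruction words copied from the listing above.
accesses = {
    0x4E18: 0xF9400414,   # ldr  x20, [x0, #8]
    0x4E30: 0xF900041F,   # str  xzr, [x0, #8]
    0x4E34: 0x39004001,   # strb w1,  [x0, #16]
}

decoded = {addr: OBJECT_AT[ldst_uimm_offset(word)] for addr, word in accesses.items()}
for addr, name in decoded.items():
    print(f"{addr:x}: {name}")
```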

i doubt ubuntu changed this, but if the offset is
a fixed const in the binary that means they moved
that variable into the glibc internal pthread struct
(which is at a fixed offset from tp).


* Re: aarch64 TLS optimizations?
  2019-05-20 17:08     ` Szabolcs Nagy
@ 2019-05-20 17:14       ` Tom Horsley
  0 siblings, 0 replies; 6+ messages in thread
From: Tom Horsley @ 2019-05-20 17:14 UTC
  To: Szabolcs Nagy; +Cc: nd, gcc

On Mon, 20 May 2019 17:07:59 +0000
Szabolcs Nagy wrote:

> and the R_*_TLSIE_* relocs are for .LANCHOR3 + 0,
> so there will be one GOT entry for the 3 objects
> and you should see

That may indeed explain what is going on. I'll
have to take a closer look at the specific
ubuntu libraries I have installed and see if I
detect something similar. Thanks.
