public inbox for systemtap@sourceware.org
* tls access fails after runtime fix symbol lookups change
@ 2021-05-10 21:57 Stan Cox
From: Stan Cox @ 2021-05-10 21:57 UTC (permalink / raw)
  To: Sultan Alsawaf; +Cc: systemtap

Sultan,
The thread local storage access stopped working after:

     runtime: fix symbol lookups when the first section isn't executable

     Some binaries are linked in such a way that there are VMA address
     range gaps, indicated by non-zero load offsets.  The runtime needs to
     not lose those offsets to enable a proper mapping back & forth from
     addresses to symbols.
  runtime/sym.c             | 12 ++++++------
  runtime/task_finder_vma.c |  9 +++++++--
  runtime/unwind.c          |  2 +-
  runtime/vma.c             | 31 ++++++++++---------------------
  4 files changed, 24 insertions(+), 30 deletions(-)


More specifically, what fails is accessing the module list, which is the 
list of the executable and its libraries.  That list is used to find the 
correct entry for a given module.  A quick way to test this is:
RUNTESTFLAGS='tls.exp' make installcheck

After the above has run, a more specific check can be done by executing 
the following (which accesses the module list directly) in the testsuite 
directory:

% ../../install/bin/stap -DTIF_IA32=99 --disable-cache --runtime=kernel \
    -g -c /work/scox/systemtap/bld/testsuite/tls1.x -e 'probe process.function("main")
    {printf("%#lx\n",user_long_error(&@var("_rtld_global","/usr/lib64/ld-linux-x86-64.so.2")->_dl_ns[0]->_ns_loaded));}' ; \
  ../../install/bin/stap -DTIF_IA32=99 --disable-cache --runtime=kernel \
    -g -c /work/scox/systemtap/bld/testsuite/tls1.x -e 'probe process.function("main")
    {printf("%#s\n",(@var("_rtld_global","/usr/lib64/ld-linux-x86-64.so.2")->_dl_ns[0]->_ns_loaded$));}'

Working result:

tls counter for 1: 2/3
tls counter for 2: 3/4
0x7efc51d761a0
tls counter for 1: 2/3
tls counter for 2: 3/4
{.l_addr=0, .l_name="", .l_ld=0x403dd8, .l_next=0x7f11692c5750, 
.l_prev=0x0, .l_real=0x7f11692c51a0, .l_ns=0, .l_libname=0x7f11692c5728, 
.l_info=[...], .l_phdr=0x400040, .l_entry=4198496, .l_phnum=12, 
.l_ldnum=0, .l_searchlist={...}, .l_symbolic_searchlist={...}, 
.l_loader=0x0, .l_versions=0x7f1169272530, .l_nversions=4, 
.l_nbuckets=1, .l_gnu_bitmask_idxbits=0, .l_gnu_shift=0, 
.l_gnu_bitmask=0x400350, <union>={...}, <union>={...}, 
.l_direct_opencount=1, .l_type=0, .l_relocated=1, .l_init_called=1, 
.l_globa...

Failing result:

tls counter for 1: 2/3
tls counter for 2: 3/4
ERROR: read fault [man error::fault] at 0x7f84a4ac9000 near identifier 
'user_long_error' at 
/work/scox/teststap/bld/../install/share/systemtap/tapset/uconversions.stp:687:10
WARNING: Number of errors: 1, skipped probes: 0
WARNING: /work/scox/teststap/bld/../install/bin/staprun exited with 
status: 1
Pass 5: run failed.  [man error::pass5]
tls counter for 1: 2/3
tls counter for 2: 3/4
ERROR

Switching to user mode (--runtime=dyninst) makes it work okay.



* Re: tls access fails after runtime fix symbol lookups change
  2021-05-10 21:57 tls access fails after runtime fix symbol lookups change Stan Cox
@ 2021-05-14 20:42 ` Stan Cox
From: Stan Cox @ 2021-05-14 20:42 UTC (permalink / raw)
  To: Sultan Alsawaf; +Cc: systemtap

I poked at this by adding debug statements around the check in 
_stp_vma_mmap_cb:

dbug_task_vma(1, ...
if (res == -ESRCH /*|| vm_start + offset == addr*/)
     res = stap_add_vma_map_info(tsk->group_leader,
                                 addr, addr + length,
                                 offset, path, module);
else if (res == 0 && vm_end + 1 == addr)
     res = stap_extend_vma_map_info(tsk->group_leader,
                                    vm_start, addr + length);

stap -DDEBUG_TASK_FINDER=2 -DDEBUG_TASK_FINDER_VMA=2 --disable-cache \
  --runtime=kernel -g -c tls1.x -e '
  probe process.function("main")
   {
    printf("%s\n", @var("_rtld_global","/usr/lib64/ld-linux-x86-64.so.2")->_dl_ns[0]->_ns_loaded$);
    printf("%#lx\n", &@var("_rtld_global","/usr/lib64/ld-linux-x86-64.so.2")->_dl_ns[0]->_ns_loaded);
   }'


and noticed the following.

First, with vm_start+offset==addr commented out, the module map that tls 
accesses is at 0x7fb941037000.

Immediately before the condition in question:

_stp_vma_mmap_cb:188: At vm check res 0/-3 vm_start+offset==addr 1 
vm_end+1==addr 0 vm_start 0x7fb94100b000 vm_end 0x7fb94100c000 offset 
0x22000 addr 0x7fb94102d000 length 0x9000 addr+length 0x7fb941036000

Both conditions are false (res is 0 rather than -ESRCH, and vm_end+1 != 
addr), so neither stap_add_vma_map_info nor stap_extend_vma_map_info is 
called.
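
The check can be replayed on the logged values outside the kernel. What 
follows is only a sketch: the enum, the function name vma_decide and the 
offset_check switch are made up for illustration and are not part of the 
runtime, and it assumes "res 0/-3" in the log means res is 0 and -ESRCH 
is -3.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical names for the three possible outcomes of the check. */
enum vma_action { VMA_NONE, VMA_ADD, VMA_EXTEND };

/*
 * Replays the if / else-if from the message on plain integers.
 * offset_check is 1 when "vm_start + offset == addr" is compiled in
 * and 0 when it is commented out (an illustrative switch only).
 */
static enum vma_action
vma_decide(int res, uint64_t vm_start, uint64_t vm_end,
           uint64_t offset, uint64_t addr, int offset_check)
{
    if (res == -ESRCH || (offset_check && vm_start + offset == addr))
        return VMA_ADD;     /* stap_add_vma_map_info would be called */
    else if (res == 0 && vm_end + 1 == addr)
        return VMA_EXTEND;  /* stap_extend_vma_map_info would be called */
    return VMA_NONE;        /* neither call is made */
}
```

With the values from the log line above (vm_start 0x7fb94100b000, vm_end 
0x7fb94100c000, offset 0x22000, addr 0x7fb94102d000, res 0) this yields 
VMA_NONE when offset_check is 0 and VMA_ADD when it is 1.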

and a bit later:

_stp_vma_mmap_cb:158: mmap_cb: tsk 88624:88624 path 
/usr/lib64/ld-2.32.so, addr 0x7fb941036000, length 0x00003000, offset 
0x2a000, flags 0x810087

_stp_vma_mmap_cb:188: At vm check res 0/-3 vm_start+offset==addr 0 
vm_end+1==addr 0 vm_start 0x7fb94100b000 vm_end 0x7fb94100c000 offset 
0x2a000 addr 0x7fb941036000 length 0x3000 addr+length 0x7fb941039000

Again both conditions are false, so neither stap_add_vma_map_info nor 
stap_extend_vma_map_info is called (so how is it that stap knows about 
0x7fb941037000?).

Yet the access works as expected:

{.l_addr=0, .l_name="", .l_ld=0x403dd8, .l_next=0x7fb941038750, 
.l_prev=0x0, .l_real=0x7fb9410381a0, .l_ns=0, .l_libname=0x7fb941038728, 
.l_info=[...], .l_phdr=0x400040, .l_entry=4198496, .l_phnum=12, 
.l_ldnum=0, .l_searchlist={...}, .l_symbolic_searchlist={...}, 
.l_loader=0x0, .l_versions=0x7fb940fe5530, .l_nversions=4, 
.l_nbuckets=1, .l_gnu_bitmask_idxbits=0, .l_gnu_shift=0, 
.l_gnu_bitmask=0x400350, <union>={...}, <union>={...}, 
.l_direct_opencount=1, .l_type=0, .l_relocated=1, .l_init_called=1, .l_globa
0x7fb941037000

Now with vm_start+offset==addr active, in a run where the module map is 
at 0x7fe9fa65e000:

_stp_vma_mmap_cb:188: At vm check res 0/-3 vm_start+offset==addr 1 
vm_end+1==addr 0 vm_start 0x7fe9fa608000 vm_end 0x7fe9fa629000 offset 
0x2a000 addr 0x7fe9fa632000 length 0x3000 addr+length 0x7fe9fa635000

In this case stap_add_vma_map_info does get called:

stap_add_vma_map_info:223: stap_add_vma_map_info adding 
'/usr/lib64/ld-2.32.so' for 83155 [7fe9fa632000-7fe9fa635000 2a000]

ERROR
0x7fe9fa65e000

That extra stap_add_vma_map_info call seems to be the primary difference.
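
As a rough cross-check of the addresses in the two runs (arithmetic on 
the logged values only, not a diagnosis), the sketch below measures how 
far each printed address is from the nearby mapping starts.  The helper 
and macro names are made up for illustration.  The numbers would be 
consistent with the address in the failing run being computed relative 
to the newly added entry rather than to vm_start, but that is an 
assumption the logs do not confirm.

```c
#include <stdint.h>

/* Values copied from the logs; macro names are hypothetical. */
#define GOOD_ADDR        UINT64_C(0x7fb941037000) /* printed in working run */
#define GOOD_VM_START    UINT64_C(0x7fb94100b000)
#define GOOD_MAP_START   UINT64_C(0x7fb941036000) /* mmap_cb addr, length 0x3000 */
#define BAD_ADDR         UINT64_C(0x7fe9fa65e000) /* printed in failing run */
#define BAD_VM_START     UINT64_C(0x7fe9fa608000)
#define BAD_ADDED_START  UINT64_C(0x7fe9fa632000) /* entry added by stap_add_vma_map_info */
#define BAD_ADDED_END    UINT64_C(0x7fe9fa635000)

/* Distance of an address above a chosen base. */
static uint64_t distance_from(uint64_t addr, uint64_t base)
{
    return addr - base;
}

/* Half-open range check: start <= addr < end. */
static int in_range(uint64_t addr, uint64_t start, uint64_t end)
{
    return addr >= start && addr < end;
}
```

Here distance_from(GOOD_ADDR, GOOD_VM_START) and 
distance_from(BAD_ADDR, BAD_ADDED_START) both come to 0x2c000.  GOOD_ADDR 
falls inside the 0x3000-byte mapping at GOOD_MAP_START, while BAD_ADDR 
falls outside [BAD_ADDED_START, BAD_ADDED_END); BAD_VM_START + 0x2c000 
would fall inside it.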


