public inbox for gdb@sourceware.org
* fail to attach to process on Solaris
@ 2011-08-22 15:05 Burkhardt, Glenn
  2011-09-02 21:24 ` Pedro Alves
  0 siblings, 1 reply; 12+ messages in thread
From: Burkhardt, Glenn @ 2011-08-22 15:05 UTC (permalink / raw)
  To: gdb

I can't seem to attach to a process that's running multiple threads.
Any help is appreciated.  I've included the print output from the whole
session, but the problem is the line:

procfs: couldn't find pid 15719 (kernel thread 67) in procinfo list.


Version info:
GNU gdb (GDB) 7.3
gcc (GCC) 4.5.2
SunOS 5.9 Generic_118558-28 sun4u sparc SUNW,Sun-Fire-V440



rms $ gdb rms.sparc 15719
GNU gdb (GDB) 7.3
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show
copying"
and "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.9".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from
/home/glenn.burkhardt/targets/sol/ata0a/rms/rms.sparc...done.
Attaching to program
`/home/glenn.burkhardt/targets/sol/ata0a/rms/rms.sparc', process 15719
[New process 15719]
Retry #1:
Retry #2:
Retry #3:
Retry #4:
Reading symbols from /usr/lib/watchmalloc.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/watchmalloc.so.1
Reading symbols from /usr/openwin/lib/libXaw.so.5...(no debugging
symbols found)...done.
Loaded symbols for /usr/openwin/lib/libXaw.so.5
Reading symbols from /usr/openwin/lib/libXmu.so.4...(no debugging
symbols found)...done.
Loaded symbols for /usr/openwin/lib/libXmu.so.4
Reading symbols from /usr/openwin/lib/libXt.so.4...(no debugging symbols
found)...done.
Loaded symbols for /usr/openwin/lib/libXt.so.4
Reading symbols from /usr/openwin/lib/libSM.so.6...(no debugging symbols
found)...done.
Loaded symbols for /usr/openwin/lib/libSM.so.6
Reading symbols from /usr/openwin/lib/libICE.so.6...(no debugging
symbols found)...done.
Loaded symbols for /usr/openwin/lib/libICE.so.6
Reading symbols from
/itek/raptor/build/rci/rciDisplay/openGL.sparc/libGL.so...done.
Loaded symbols for
/itek/raptor/build/rci/rciDisplay/openGL.sparc/libGL.so
Reading symbols from
/itek/raptor/build/rci/rciDisplay/openGL.sparc/libGLU.so...done.
Loaded symbols for
/itek/raptor/build/rci/rciDisplay/openGL.sparc/libGLU.so
Reading symbols from /usr/openwin/lib/libXext.so.0...(no debugging
symbols found)...done.
Loaded symbols for /usr/openwin/lib/libXext.so.0
Reading symbols from /usr/openwin/lib/libX11.so.4...(no debugging
symbols found)...done.
Loaded symbols for /usr/openwin/lib/libX11.so.4
Reading symbols from /usr/lib/librt.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/librt.so.1
Reading symbols from /usr/lib/libm.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libm.so.1
Reading symbols from /usr/lib/libsocket.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libsocket.so.1
Reading symbols from /usr/lib/libnsl.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libnsl.so.1
Reading symbols from /usr/lib/libpthread.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libpthread.so.1
Reading symbols from /usr/lib/libresolv.so.2...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libresolv.so.2
Reading symbols from /usr/lib/libdl.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libdl.so.1
Reading symbols from /usr/lib/libc.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libc.so.1
Reading symbols from /usr/openwin/lib/libXi.so.5...(no debugging symbols
found)...done.
Loaded symbols for /usr/openwin/lib/libXi.so.5
Reading symbols from /usr/lib/libaio.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libaio.so.1
Reading symbols from /usr/lib/libmd5.so.1...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libmd5.so.1
Reading symbols from /usr/lib/libmp.so.2...(no debugging symbols
found)...done.
Loaded symbols for /usr/lib/libmp.so.2
Reading symbols from
/usr/platform/SUNW,Sun-Fire-V440/lib/libc_psr.so.1...(no debugging
symbols found)...done.
Loaded symbols for /usr/platform/SUNW,Sun-Fire-V440/lib/libc_psr.so.1
Reading symbols from /usr/lib/libthread.so.1...(no debugging symbols
found)...done.
[Thread debugging using libthread_db enabled]
[New LWP    65        ]
[New LWP    63        ]
[New LWP    62        ]
[New LWP    60        ]
[New LWP    59        ]
[New LWP    58        ]
[New LWP    57        ]
[New LWP    56        ]
[New LWP    55        ]
[New LWP    54        ]
[New LWP    53        ]
[New LWP    52        ]
[New LWP    50        ]
[New LWP    49        ]
[New LWP    48        ]
[New LWP    47        ]
[New LWP    46        ]
[New LWP    45        ]
[New LWP    44        ]
[New LWP    43        ]
[New LWP    42        ]
[New LWP    41        ]
[New LWP    40        ]
[New LWP    39        ]
[New LWP    38        ]
[New LWP    37        ]
[New LWP    36        ]
[New LWP    35        ]
[New LWP    34        ]
[New LWP    33        ]
[New LWP    32        ]
[New LWP    31        ]
[New LWP    30        ]
[New LWP    29        ]
[New LWP    28        ]
[New LWP    27        ]
[New LWP    26        ]
[New LWP    25        ]
[New LWP    24        ]
[New LWP    23        ]
[New LWP    22        ]
[New LWP    21        ]
[New LWP    20        ]
[New LWP    19        ]
[New LWP    18        ]
[New LWP    17        ]
[New LWP    16        ]
[New LWP    15        ]
[New LWP    14        ]
[New LWP    13        ]
[New LWP    12        ]
[New LWP    11        ]
[New LWP    10        ]
[New LWP    9        ]
[New LWP    8        ]
[New LWP    7        ]
[New LWP    6        ]
[New LWP    5        ]
[New LWP    4        ]
[New LWP    3        ]
[New LWP    51        ]
[New Thread 1 (LWP 1)]
[New Thread 3        ]
[New Thread 4 (LWP 4)]
[New Thread 5 (LWP 5)]
[New Thread 6        ]
[New Thread 7 (LWP 7)]
[New Thread 8 (LWP 8)]
[New Thread 9        ]
[New Thread 10 (LWP 10)]
[New Thread 11 (LWP 11)]
[New Thread 12        ]
[New Thread 13 (LWP 13)]
[New Thread 14 (LWP 14)]
[New Thread 15        ]
[New Thread 16 (LWP 16)]
[New Thread 17 (LWP 17)]
[New Thread 18        ]
[New Thread 19 (LWP 19)]
[New Thread 20 (LWP 20)]
[New Thread 21        ]
[New Thread 22        ]
[New Thread 23        ]
[New Thread 24        ]
[New Thread 25        ]
[New Thread 26        ]
[New Thread 27        ]
[New Thread 28        ]
[New Thread 29 (LWP 29)]
[New Thread 30        ]
[New Thread 31 (LWP 31)]
[New Thread 32 (LWP 32)]
[New Thread 33 (LWP 33)]
[New Thread 34        ]
[New Thread 35        ]
[New Thread 36        ]
[New Thread 37        ]
[New Thread 38        ]
[New Thread 39        ]
[New Thread 40        ]
[New Thread 41        ]
[New Thread 42 (LWP 42)]
[New Thread 43 (LWP 43)]
[New Thread 44 (LWP 44)]
[New Thread 45        ]
[New Thread 46        ]
[New Thread 47        ]
[New Thread 48        ]
[New Thread 49        ]
[New Thread 50        ]
[New Thread 51        ]
[New Thread 52        ]
[New Thread 53        ]
[New Thread 54        ]
[New Thread 55        ]
[New Thread 56        ]
[New Thread 57        ]
[New Thread 58 (LWP 58)]
[New Thread 59 (LWP 59)]
[New Thread 60        ]
[New Thread 62        ]
[New Thread 63        ]
[New Thread 65 (LWP 65)]
[New Thread 2        ]
[New Thread 61        ]
[New Thread 64        ]
[New Thread 66        ]
[New Thread 67        ]
Loaded symbols for /usr/lib/libthread.so.1
Reading symbols from /lib/ld.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/ld.so.1
[Switching to Thread 1 (LWP 1)]
0xfeb1f24c in _read () from /usr/lib/libc.so.1
(gdb) whe
#0  0xfeb1f24c in _read () from /usr/lib/libc.so.1
#1  0xfeb0f504 in _filbuf () from /usr/lib/libc.so.1
#2  0xfeb11c4c in fgets () from /usr/lib/libc.so.1
#3  0x005fe25c in shell_run (shell=0xa17bc0) at cmd.c:144
#4  0x0009ab0c in main (argc=1, argv=0xffbff37c) at rms.c:99
(gdb) c
Continuing.
procfs: couldn't find pid 15719 (kernel thread 67) in procinfo list.
(gdb) q


* Re: fail to attach to process on Solaris
  2011-08-22 15:05 fail to attach to process on Solaris Burkhardt, Glenn
@ 2011-09-02 21:24 ` Pedro Alves
  2011-09-20 23:22   ` Burkhardt, Glenn
  0 siblings, 1 reply; 12+ messages in thread
From: Pedro Alves @ 2011-09-02 21:24 UTC (permalink / raw)
  To: gdb; +Cc: Burkhardt, Glenn


On Monday 22 August 2011 16:05:10, Burkhardt, Glenn wrote:
> I can't seem to attach to a process that's running multiple threads.
> Any help is appreciated.  I've included the print output from the whole
> session, but the problem is the line:
> 
> procfs: couldn't find pid 15719 (kernel thread 67) in procinfo list.

It's been a while since I looked at the Solaris backend, so I can't help
much offhand.  I'm afraid you'll need to debug this yourself.
Starting with "set debug infrun 1" may help.  I'd start by
understanding why gdb thinks 67 should be in the list.

-- 
Pedro Alves


* RE: fail to attach to process on Solaris
  2011-09-02 21:24 ` Pedro Alves
@ 2011-09-20 23:22   ` Burkhardt, Glenn
  2011-09-21 14:26     ` Pedro Alves
  0 siblings, 1 reply; 12+ messages in thread
From: Burkhardt, Glenn @ 2011-09-20 23:22 UTC (permalink / raw)
  To: Pedro Alves, gdb

The problem appears to be that the thread debug library has a callback
for the register get operation that's connected to
"sol-thread.c:ps_lgetregs()".  In the case that fails, the thread exists,
but the calling sequence tries to look up registers for an LWP with the
same ID as the thread.

#0  find_procinfo_or_die (pid=12276, tid=67) at procfs.c:489
#1  0x000a1cd0 in procfs_fetch_registers (ops=0x7293d8,
regcache=0x71b1d0, 
    regnum=-1) at procfs.c:3483
#2  0x0012feec in sol_thread_fetch_registers (ops=0x718a70,
regcache=0x71b1d0, 
    regnum=-1) at sol-thread.c:457
#3  0x00231af0 in target_fetch_registers (regcache=0x71b1d0, regno=-1)
    at target.c:3417
#4  0x00130e48 in ps_lgetregs (ph=0x700998, lwpid=67,
gregset=0xffbfe37c)
    at sol-thread.c:923
#5  0xff0735dc in td_thr_getgregs () from /usr/lib/libthread_db.so.1
#6  0x0012fff8 in sol_thread_fetch_registers (ops=0x718a70,
regcache=0x71b3b0, 
    regnum=68) at sol-thread.c:473 

In this stack trace of 'gdb', 'sol_thread_fetch_registers()' is passed
the following regcache:

(gdb) frame
#6  0x0012fff8 in sol_thread_fetch_registers (ops=0x718a70,
regcache=0x71b3b0, 
    regnum=68) at sol-thread.c:473
473       val = p_td_thr_getgregs (&thandle, gregset);
(gdb) p *regcache
$24 = {descr = 0x84fc40, aspace = 0x7aa258, registers = 0x846c48 "", 
  register_status = 0x14f37c0 "", readonly_p = 0, ptid = {pid = 12276, 
    lwp = 0, tid = 67}}

So it's looking for registers from a thread that's not associated with
an LWP.  But the function 'ps_lgetregs()' is always looking for the
registers on the LWP list.

I can't see how the callback 'ps_lgetregs()' is connected to the thread
debug library.  In fact, the documentation for the thread debug library
seems sparse.  I've only been able to find out about it in the man pages
and the comments in sol-thread.c.  So any pointers to documentation
would be helpful.

Any theory of operation for gdb's register caching would also be
helpful.

Thanks.



* Re: fail to attach to process on Solaris
  2011-09-20 23:22   ` Burkhardt, Glenn
@ 2011-09-21 14:26     ` Pedro Alves
  2011-09-21 16:19       ` Burkhardt, Glenn
  2011-09-21 16:28       ` Burkhardt, Glenn
  0 siblings, 2 replies; 12+ messages in thread
From: Pedro Alves @ 2011-09-21 14:26 UTC (permalink / raw)
  To: Burkhardt, Glenn; +Cc: gdb

On Wednesday 21 September 2011 00:22:21, Burkhardt, Glenn wrote:
> The problem appears that thread debug library has callback for register
> get operation that's connected to "sol-thread.c:ps_lgetregs()".  In the
> case that fails, the thread exists, but the calling sequence tries to
> lookup registers for a LWP with the same ID as the thread.

This is Solaris 9, with the default 1:1 model thread library, right?

> #0  find_procinfo_or_die (pid=12276, tid=67) at procfs.c:489
> #1  0x000a1cd0 in procfs_fetch_registers (ops=0x7293d8,
> regcache=0x71b1d0, 
>     regnum=-1) at procfs.c:3483
> #2  0x0012feec in sol_thread_fetch_registers (ops=0x718a70,
> regcache=0x71b1d0, 
>     regnum=-1) at sol-thread.c:457
> #3  0x00231af0 in target_fetch_registers (regcache=0x71b1d0, regno=-1)
>     at target.c:3417
> #4  0x00130e48 in ps_lgetregs (ph=0x700998, lwpid=67,
> gregset=0xffbfe37c)
>     at sol-thread.c:923
> #5  0xff0735dc in td_thr_getgregs () from /usr/lib/libthread_db.so.1
> #6  0x0012fff8 in sol_thread_fetch_registers (ops=0x718a70,
> regcache=0x71b3b0, 
>     regnum=68) at sol-thread.c:473 

But what is the rest of the stack trace?  IOW, where's this being
called from?

> 
> For this stack trace of 'gdb', 'sol_thread_fetch_registers()' is passed 
> 
> (gdb) frame
> #6  0x0012fff8 in sol_thread_fetch_registers (ops=0x718a70,
> regcache=0x71b3b0, 
>     regnum=68) at sol-thread.c:473
> 473       val = p_td_thr_getgregs (&thandle, gregset);
> (gdb) p *regcache
> $24 = {descr = 0x84fc40, aspace = 0x7aa258, registers = 0x846c48 "", 
>   register_status = 0x14f37c0 "", readonly_p = 0, ptid = {pid = 12276, 
>     lwp = 0, tid = 67}}
> 
> So it's looking for registers from a thread that's not associated with
> an LWP.  But the
> function 'ps_lgetregs()' is always looking for the registers on the LWP
> list.
> 
> I can't see how the callback 'ps_lgetregs()' is connected to the thread
> debug library.  In fact, the documentation for the thread debug library
> seems sparse.  I've only been able to find out about it in the man pages
> and comments section of sol-thread.c  So any pointers to documentation
> would be helpful.

That's about all there is...  Luckily or not, glibc copied the same
interface out of Solaris, so people who understand the Linux version
can understand the Solaris one with ease.  Older Solaris versions
supported an M:N thread model, where multiple user space
threads would be mapped to the same kernel thread (LWP), and sometimes
even to no kernel thread (LWP) (when they're idle).  libthread_db.so is a
library the system provides, that debuggers load into their own address
space, and that serves as a bridge between user threads and however
they're mapped underneath.  So in this case, GDB wants to fetch the
registers of some thread.  It asks libthread_db.so for its registers.
libthread_db.so internally knows that that thread is mapped into LWP 67,
and to serve GDB's initial request, it needs to fetch the registers of
LWP 67.  libthread_db.so can't read registers off of an LWP itself, but
the debugger client can.  So libthread_db.so calls back into the debugger
through the `ps_lgetregs' function of the proc_service interface (see
man ps_lgetregs).  ps_lgetregs ends up recursing into
sol_thread_fetch_registers, but this time, inferior_ptid points directly
into an LWP, so we just pass the request directly to the LWP support
layer in procfs.c.  It's at this point that things are failing for some
reason.
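
The callback flow described above can be sketched in a few lines of C.
This is only an illustration under stated assumptions, not the real
libthread_db.so or GDB code: thread_map_entry, library_thread_getregs
and debugger_lwp_getregs are hypothetical names, the LWP table is
hard-coded, and the "register" value is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's private thread->LWP mapping. */
typedef struct {
    int user_tid;   /* user-level thread id */
    int lwpid;      /* LWP the library believes the thread runs on */
} thread_map_entry;

/* Callback type the "library" uses to ask the debugger for LWP registers. */
typedef int (*lwp_getregs_fn)(int lwpid, long *regs_out);

/* Debugger side: only LWPs in this (made-up) table can be read. */
static const int known_lwps[] = { 1, 4, 5 };

int debugger_lwp_getregs(int lwpid, long *regs_out)
{
    for (size_t i = 0; i < sizeof known_lwps / sizeof known_lwps[0]; i++) {
        if (known_lwps[i] == lwpid) {
            *regs_out = 0x1000 + lwpid;   /* fake register contents */
            return 0;
        }
    }
    return -1;   /* the debugger has no record of that LWP */
}

/* Library side: map the user thread to an LWP, then call back. */
int library_thread_getregs(const thread_map_entry *map, size_t n,
                           int user_tid, lwp_getregs_fn cb, long *regs_out)
{
    assert(cb != NULL);
    for (size_t i = 0; i < n; i++) {
        if (map[i].user_tid == user_tid)
            return cb(map[i].lwpid, regs_out);
    }
    return -2;   /* the library does not know this thread at all */
}
```

If the mapping hands back an LWP id the debugger has never seen, the
callback fails even though the user thread itself exists, which is the
shape of the failure being discussed here.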

So, next step would be understanding whether LWP 67 really still exists or not
at the failure point.  Can you find that out peeking at /proc/... from the
command line?  Maybe the LWP had just exited while GDB was attaching to the
process, but GDB hadn't processed the exit event yet?  Or has GDB failed in the
thread->lwp id mappings somewhere?
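
A small helper along these lines could do the same check from a
program instead of the shell.  It is a sketch that assumes the Solaris
/proc layout of one directory per LWP under /proc/<pid>/lwp/; the
function names lwp_path and lwp_exists are made up for this example,
and the proc_root parameter exists only so the code can be pointed at a
different directory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Build "<proc_root>/<pid>/lwp/<lwpid>" into buf.
   Returns 0 on success, -1 if the path did not fit. */
int lwp_path(char *buf, size_t size, const char *proc_root,
             long pid, long lwpid)
{
    int n = snprintf(buf, size, "%s/%ld/lwp/%ld", proc_root, pid, lwpid);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Returns 1 if the per-LWP directory exists, 0 otherwise. */
int lwp_exists(const char *proc_root, long pid, long lwpid)
{
    char path[256];
    struct stat st;

    if (lwp_path(path, sizeof path, proc_root, pid, lwpid) != 0)
        return 0;
    return stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}
```

Calling lwp_exists("/proc", pid, 67) while GDB is stopped at the
failure point would show whether the kernel still knows about that LWP.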

-- 
Pedro Alves


* RE: fail to attach to process on Solaris
  2011-09-21 14:26     ` Pedro Alves
@ 2011-09-21 16:19       ` Burkhardt, Glenn
  2011-09-21 16:46         ` Pedro Alves
  2011-09-21 16:28       ` Burkhardt, Glenn
  1 sibling, 1 reply; 12+ messages in thread
From: Burkhardt, Glenn @ 2011-09-21 16:19 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb

Thank you very much for your response.

I did peek at /proc from the command line when the breakpoint in
find_procinfo_or_die() was hit - there was no corresponding LWP.  Nor
does it seem that any LWP with the same number as the thread ever existed.

Here's more extensive info, with a complete stack trace, and some
preliminary info printed after gdb attaches.  This time, the thread
number that triggered the problem was 65.

I've also filed this info under bug 13212.  I added
"puts(pi->pathname);" in procfs.c after line 725, to log creation of
procinfo_list entries.

(gdb) c
Continuing.
procfs: couldn't find pid 12276 (kernel thread 67) in procinfo list.
(gdb)

Apparently, the function 'procfs.c:find_procinfo_or_die()' fails, and
the exception it throws forces gdb back to the command prompt.

There's some confusion about which threads are in LWPs, and which ones
aren't.  When this failure occurs, this is the stack trace:
Breakpoint 1, find_procinfo_or_die (pid=16946, tid=65) at procfs.c:489
489           if (tid)
(gdb) whe
#0  find_procinfo_or_die (pid=16946, tid=65) at procfs.c:489
#1  0x000a1cd0 in procfs_fetch_registers (ops=0x7293d8,
regcache=0x71b1d0, 
    regnum=-1) at procfs.c:3483
#2  0x0012feec in sol_thread_fetch_registers (ops=0x718a70,
regcache=0x71b1d0, 
    regnum=-1) at sol-thread.c:457
#3  0x00231af0 in target_fetch_registers (regcache=0x71b1d0, regno=-1)
    at target.c:3417
#4  0x00130e48 in ps_lgetregs (ph=0x700998, lwpid=65,
gregset=0xffbfe37c)
    at sol-thread.c:923
#5  0xff0735dc in td_thr_getgregs () from /usr/lib/libthread_db.so.1
#6  0x0012fff8 in sol_thread_fetch_registers (ops=0x718a70,
regcache=0x71b3b0, 
    regnum=68) at sol-thread.c:473
#7  0x00231af0 in target_fetch_registers (regcache=0x71b3b0, regno=68)
    at target.c:3417
#8  0x0016dd6c in regcache_raw_read (regcache=0x71b3b0, regnum=68, 
    buf=0xffbfe5c8 "") at regcache.c:604
#9  0x0016e5e0 in regcache_cooked_read (regcache=0x71b3b0, regnum=68, 
    buf=0xffbfe5c8 "") at regcache.c:695
#10 0x0016ea0c in regcache_cooked_read_unsigned (regcache=0x71b3b0,
regnum=68, 
    val=0xffbfe638) at regcache.c:746
#11 0x0016fb18 in regcache_read_pc (regcache=0x71b3b0) at regcache.c:990
#12 0x001f43bc in switch_to_thread (ptid=...) at thread.c:1005
#13 0x0008c024 in switch_to_program_space_and_thread (pspace=0x8264f0)
    at progspace.c:493
#14 0x0014e89c in insert_breakpoint_locations () at breakpoint.c:1895
#15 0x0014e7a4 in insert_breakpoints () at breakpoint.c:1856
#16 0x001da9ec in proceed (addr=18446744073709551615, 
    siggnal=TARGET_SIGNAL_DEFAULT, step=0) at infrun.c:2056
#17 0x001d1084 in continue_1 (all_threads=0) at infcmd.c:701
#18 0x001d1464 in continue_command (args=0x0, from_tty=1) at
infcmd.c:793
#19 0x000e276c in do_cfunc (c=0x7435a0, args=0x0, from_tty=1)
    at ./cli/cli-decode.c:67
#20 0x000e60f8 in cmd_func (cmd=0x7435a0, args=0x0, from_tty=1)
    at ./cli/cli-decode.c:1777
#21 0x0006cfa4 in execute_command (p=0x71a879 "", from_tty=1) at
top.c:428
#22 0x00201e38 in command_handler (command=0x71a878 "c") at
event-top.c:499
#23 0x00202844 in command_line_handler (rl=0x9e7ec8 "c") at
event-top.c:704
#24 0x00372098 in rl_callback_read_char () at callback.c:205
#25 0x00200fc0 in rl_callback_read_char_wrapper (client_data=0x0)
---Type <return> to continue, or q <return> to quit---
    at event-top.c:177
#26 0x00201c8c in stdin_event_handler (error=0, client_data=0x0)
    at event-top.c:434
#27 0x001ffc44 in handle_file_event (data=Cannot access memory at
address 0x0
) at event-loop.c:831
#28 0x001fed3c in process_event () at event-loop.c:402
#29 0x001feeb4 in gdb_do_one_event (data=0x0) at event-loop.c:467
#30 0x001f6c20 in catch_errors (func=0x1fed58 <gdb_do_one_event>, 
    func_args=0x0, errstring=0x5473d0 "", mask=6) at exceptions.c:521
#31 0x00104d48 in tui_command_loop (data=0x0) at ./tui/tui-interp.c:172
#32 0x001f7b20 in current_interp_command_loop () at interps.c:291
#33 0x0005defc in captured_command_loop (data=0x0) at ./main.c:228
#34 0x001f6c20 in catch_errors (func=0x5deec <captured_command_loop>, 
    func_args=0x0, errstring=0x5296e0 "", mask=6) at exceptions.c:521
#35 0x0005f750 in captured_main (data=0xffbff340) at ./main.c:936
#36 0x001f6c20 in catch_errors (func=0x5df48 <captured_main>, 
    func_args=0xffbff340, errstring=0x5296e0 "", mask=6) at
exceptions.c:521
#37 0x0005f794 in gdb_main (args=0xffbff340) at ./main.c:945
#38 0x0005d924 in main (argc=3, argv=0xffbff3b4) at gdb.c:35

Some relevant data:

(gdb) up 4
#4  0x00130e48 in ps_lgetregs (ph=0x700998, lwpid=65,
gregset=0xffbfe37c)
    at sol-thread.c:923
923       target_fetch_registers (regcache, -1);
(gdb) p *regcache
$1 = {descr = 0x84fc40, aspace = 0x7aa258, registers = 0x187c9d0 "", 
  register_status = 0x8e13f0 "", readonly_p = 0, ptid = {pid = 16946, 
    lwp = 65, tid = 0}}
(gdb) up 2
#6  0x0012fff8 in sol_thread_fetch_registers (ops=0x718a70,
regcache=0x71b3b0, 
    regnum=68) at sol-thread.c:473
473       val = p_td_thr_getgregs (&thandle, gregset);
(gdb) p *regcache
$2 = {descr = 0x84fc40, aspace = 0x7aa258, registers = 0x846c48 "", 
  register_status = 0x14f37c0 "", readonly_p = 0, ptid = {pid = 16946, 
    lwp = 0, tid = 65}}

glenn.burkhardt $ ls /proc/16946/lwp/
1/   13/  17/  21/  25/  29/  32/  36/  4/   43/  47/  50/  54/  58/
62/  8/
10/  14/  18/  22/  26/  3/   33/  37/  40/  44/  48/  51/  55/  59/
63/  9/
11/  15/  19/  23/  27/  30/  34/  38/  41/  45/  49/  52/  56/  6/
66/
12/  16/  20/  24/  28/  31/  35/  39/  42/  46/  5/   53/  57/  60/  7/

And after issuing the 'attach' command, gdb prints:
[New process 16946]
[Thread debugging using libthread_db enabled]
/proc/16946/lwp/59
/proc/16946/lwp/3
/proc/16946/lwp/4
/proc/16946/lwp/5
/proc/16946/lwp/6
/proc/16946/lwp/7
/proc/16946/lwp/8
/proc/16946/lwp/9
/proc/16946/lwp/10
/proc/16946/lwp/11
/proc/16946/lwp/12
/proc/16946/lwp/13
/proc/16946/lwp/14
/proc/16946/lwp/15
/proc/16946/lwp/16
/proc/16946/lwp/17
/proc/16946/lwp/18
/proc/16946/lwp/19
/proc/16946/lwp/20
/proc/16946/lwp/21
/proc/16946/lwp/22
/proc/16946/lwp/23
/proc/16946/lwp/24
/proc/16946/lwp/25
/proc/16946/lwp/26
/proc/16946/lwp/27
/proc/16946/lwp/28
/proc/16946/lwp/29
/proc/16946/lwp/30
/proc/16946/lwp/31
/proc/16946/lwp/32
/proc/16946/lwp/33
/proc/16946/lwp/34
/proc/16946/lwp/35
/proc/16946/lwp/36
/proc/16946/lwp/37
/proc/16946/lwp/38
/proc/16946/lwp/39
/proc/16946/lwp/40
/proc/16946/lwp/41
/proc/16946/lwp/42
/proc/16946/lwp/43
/proc/16946/lwp/44
/proc/16946/lwp/45
/proc/16946/lwp/46
/proc/16946/lwp/47
/proc/16946/lwp/48
/proc/16946/lwp/49
/proc/16946/lwp/50
/proc/16946/lwp/51
/proc/16946/lwp/52
/proc/16946/lwp/53
/proc/16946/lwp/54
/proc/16946/lwp/55
/proc/16946/lwp/56
/proc/16946/lwp/57
/proc/16946/lwp/58
/proc/16946/lwp/60
/proc/16946/lwp/62
/proc/16946/lwp/63
/proc/16946/lwp/66
[New LWP    66        ]
[New LWP    63        ]
[New LWP    62        ]
[New LWP    60        ]
[New LWP    58        ]
[New LWP    57        ]
[New LWP    56        ]
[New LWP    55        ]
[New LWP    54        ]
[New LWP    53        ]
[New LWP    52        ]
[New LWP    51        ]
[New LWP    50        ]
[New LWP    49        ]
[New LWP    48        ]
[New LWP    47        ]
[New LWP    46        ]
[New LWP    45        ]
[New LWP    44        ]
[New LWP    43        ]
[New LWP    42        ]
[New LWP    41        ]
[New LWP    40        ]
[New LWP    39        ]
[New LWP    38        ]
[New LWP    37        ]
[New LWP    36        ]
[New LWP    35        ]
[New LWP    34        ]
[New LWP    33        ]
[New LWP    32        ]
[New LWP    31        ]
[New LWP    30        ]
[New LWP    29        ]
[New LWP    28        ]
[New LWP    27        ]
[New LWP    26        ]
[New LWP    25        ]
[New LWP    24        ]
[New LWP    23        ]
[New LWP    22        ]
[New LWP    21        ]
[New LWP    20        ]
[New LWP    19        ]
[New LWP    18        ]
[New LWP    17        ]
[New LWP    16        ]
[New LWP    15        ]
[New LWP    14        ]
[New LWP    13        ]
[New LWP    12        ]
[New LWP    11        ]
[New LWP    10        ]
[New LWP    9        ]
[New LWP    8        ]
[New LWP    7        ]
[New LWP    6        ]
[New LWP    5        ]
[New LWP    4        ]
[New LWP    3        ]
[New LWP    59        ]
[New Thread 1 (LWP 1)]
[New Thread 3        ]
[New Thread 4 (LWP 4)]
[New Thread 5 (LWP 5)]
[New Thread 6        ]
[New Thread 7 (LWP 7)]
[New Thread 8 (LWP 8)]
[New Thread 9        ]
[New Thread 10 (LWP 10)]
[New Thread 11 (LWP 11)]
[New Thread 12        ]
[New Thread 13 (LWP 13)]
[New Thread 14 (LWP 14)]
[New Thread 15        ]
[New Thread 16 (LWP 16)]
[New Thread 17 (LWP 17)]
[New Thread 18        ]
[New Thread 19 (LWP 19)]
[New Thread 20 (LWP 20)]
[New Thread 21        ]
[New Thread 22        ]
[New Thread 23        ]
[New Thread 24 (LWP 24)]
[New Thread 25        ]
[New Thread 26        ]
[New Thread 27        ]
[New Thread 28 (LWP 28)]
[New Thread 29 (LWP 29)]
[New Thread 30        ]
[New Thread 31 (LWP 31)]
[New Thread 32 (LWP 32)]
[New Thread 33 (LWP 33)]
[New Thread 34        ]
[New Thread 35        ]
[New Thread 36        ]
[New Thread 37        ]
[New Thread 38        ]
[New Thread 39        ]
[New Thread 40        ]
[New Thread 41        ]
[New Thread 42        ]
[New Thread 43 (LWP 43)]
[New Thread 44        ]
[New Thread 45 (LWP 45)]
[New Thread 46        ]
[New Thread 47        ]
[New Thread 48        ]
[New Thread 49        ]
[New Thread 50        ]
[New Thread 51        ]
[New Thread 52        ]
[New Thread 53        ]
[New Thread 54        ]
[New Thread 55        ]
[New Thread 56 (LWP 56)]
[New Thread 57        ]
[New Thread 58 (LWP 58)]
[New Thread 59 (LWP 59)]
[New Thread 60        ]
[New Thread 62        ]
[New Thread 63        ]
[New Thread 66 (LWP 66)]
[New Thread 2        ]
[New Thread 61        ]
[New Thread 64        ]
[New Thread 65        ]

So, thread 65 isn't executing in an LWP.  But the call to ps_lgetregs()
gets made assuming that the registers from LWP 65 are wanted, instead of
thread 65.  There's no LWP 65 on the procinfo_list, so the call fails.
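
To make the failing lookup concrete, here is a minimal sketch of a
(pid, tid) search over a linked list.  It is not the procfs.c code:
sketch_procinfo and sketch_find_procinfo are hypothetical names, and
the convention that tid 0 stands for the process entry itself is an
assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node: one entry per process or per LWP. */
typedef struct sketch_procinfo {
    struct sketch_procinfo *next;
    int pid;
    int tid;    /* assumed: 0 = the process itself, otherwise an LWP id */
} sketch_procinfo;

/* Return the entry matching (pid, tid), or NULL if there is none. */
sketch_procinfo *sketch_find_procinfo(sketch_procinfo *list, int pid, int tid)
{
    for (sketch_procinfo *pi = list; pi != NULL; pi = pi->next) {
        if (pi->pid == pid && pi->tid == tid)
            return pi;
    }
    return NULL;
}
```

With entries only for the LWPs that really exist (63 and 66, say), a
lookup for tid 65 returns NULL, and an "or die" wrapper around it would
then raise the error seen at the prompt.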




* RE: fail to attach to process on Solaris
  2011-09-21 14:26     ` Pedro Alves
  2011-09-21 16:19       ` Burkhardt, Glenn
@ 2011-09-21 16:28       ` Burkhardt, Glenn
  1 sibling, 0 replies; 12+ messages in thread
From: Burkhardt, Glenn @ 2011-09-21 16:28 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb

gdb $ uname -a
SunOS bos0ux02 5.9 Generic_118558-28 sun4u sparc SUNW,Sun-Fire-V440
gcc (GCC) 4.5.2
gdb $ ./gdb --version
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show
copying"
and "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.9".

I'm fairly sure the default thread library was used. The executable was
linked with -lpthread, and uses

/usr/lib/libpthread.so.1
/usr/lib/libthread.so.1


> -----Original Message-----
> From: Pedro Alves [mailto:pedro@codesourcery.com] 
> Sent: Wednesday, September 21, 2011 10:26 AM
> To: Burkhardt, Glenn
> Cc: gdb@sourceware.org
> Subject: Re: fail to attach to process on Solaris
> 
> This is Solaris 9, with the default 1:1 model thread library, right?
> 


* Re: fail to attach to process on Solaris
  2011-09-21 16:19       ` Burkhardt, Glenn
@ 2011-09-21 16:46         ` Pedro Alves
  2011-09-21 17:16           ` Burkhardt, Glenn
  0 siblings, 1 reply; 12+ messages in thread
From: Pedro Alves @ 2011-09-21 16:46 UTC (permalink / raw)
  To: Burkhardt, Glenn; +Cc: gdb

On Wednesday 21 September 2011 17:18:25, Burkhardt, Glenn wrote:
> Thank you very much for your response.
> 
> I did peek at /proc from the command line when the breakpoint in
> find_procinfo_or_die() was hit - there was no corresponding LWP.  Nor
> does it seem that any LWP with the same thread number ever existed.
> 
> Here's more extensive info, with a complete stack trace, and some
> preliminary info printed after gdb attaches.  This time, the thread
> number that triggered the problem was 65.
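
For anyone wanting to script the /proc check mentioned above, a small
C helper could look like the sketch below.  It assumes the structured
Solaris /proc layout, where each LWP of a process appears as a directory
/proc/<pid>/lwp/<lwpid>.  The function name lwp_dir_exists and the
proc_root parameter are hypothetical; the parameter exists only so the
helper can be pointed at a different root when experimenting.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Return 1 if PROC_ROOT/PID/lwp/LWP exists and is a directory,
   0 otherwise (including when the path would not fit the buffer).  */
static int
lwp_dir_exists (const char *proc_root, long pid, long lwp)
{
  char path[256];
  struct stat st;
  int n;

  assert (proc_root != NULL);
  n = snprintf (path, sizeof path, "%s/%ld/lwp/%ld", proc_root, pid, lwp);
  if (n < 0 || (size_t) n >= sizeof path)
    return 0;
  return stat (path, &st) == 0 && S_ISDIR (st.st_mode);
}
```

A call such as lwp_dir_exists ("/proc", pid, lwp), made while GDB is
stopped at the breakpoint, would report whether the kernel still has an
entry for that LWP.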

Okay.  I assume your program isn't spawning and exiting threads
quickly in succession, otherwise, we'd see LWP ids much higher.

It's libthread_db.so that maps a thread to a LWP id, so we
may be missing some state checks and getting back a stale id.
Try the "maint info sol-threads" command (I never noticed this
command before), and let's see what state libthread_db.so
thinks the thread is in.  I see that linux-thread-db.c (the glibc/linux
fork of this code) has extra checks for ignoring threads in some states
that the Solaris code doesn't have.

Please don't top post.  That has a tendency of making one
forget to answer questions.  :-)  Here it is again:

> This is Solaris 9, with the default 1:1 model thread library, right?

-- 
Pedro Alves


* RE: fail to attach to process on Solaris
  2011-09-21 16:46         ` Pedro Alves
@ 2011-09-21 17:16           ` Burkhardt, Glenn
  2011-09-21 17:39             ` Pedro Alves
  0 siblings, 1 reply; 12+ messages in thread
From: Burkhardt, Glenn @ 2011-09-21 17:16 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb

[-- Attachment #1: Type: text/plain, Size: 2573 bytes --]

> -----Original Message-----
> From: Pedro Alves [mailto:pedro@codesourcery.com] 
> Sent: Wednesday, September 21, 2011 12:46 PM
> To: Burkhardt, Glenn
> Cc: gdb@sourceware.org
> Subject: Re: fail to attach to process on Solaris
> 
> On Wednesday 21 September 2011 17:18:25, Burkhardt, Glenn wrote:
> > Thank you very much for your response.
> > 
> > I did peek at /proc from the command line when the breakpoint in
> > find_procinfo_or_die() was hit - there was no corresponding LWP.
> > Nor does it seem that any LWP with the same thread number ever
> > existed.
> > 
> > Here's more extensive info, with a complete stack trace, and some 
> > preliminary info printed after gdb attaches.  This time, the thread 
> > number that triggered the problem was 65.
> 
> Okay.  I assume your program isn't spawning and exiting 
> threads quickly in succession, otherwise, we'd see LWP ids 
> much higher.
> 
> It's libthread_db.so that maps a thread to a LWP id, so we 
> may be missing some state checks and getting back a stale id.
> Try the "maint info sol-threads" command (I never noticed 
> this command before), and let's see what state 
> libthread_db.so thinks the thread is in.  I see that 
> linux-thread-db.c (the glibc/linux fork of this code) has 
> extra checks for ignoring threads in some states that the 
> Solaris code doesn't have.
> 
> Please don't top post.  That has a tendency of making one 
> forget to answer questions.  :-)  Here it is again:
> 
> > This is Solaris 9, with the default 1:1 model thread library, right?
> 
> --
> Pedro Alves
> 

So, this time the first thread to fail is #68, and the maint command
shows the thread as having a 'zombie' LWP:

 - Sleep func: 0x6ccfa4
user   thread #67, lwp 67, (active)    startfunc: bootStrap
user   thread #69, lwp 69, (asleep)    startfunc: bootStrap
 - Sleep func: 0x6ccfa4
user   thread #2, lwp 2, (zombie)    startfunc: bootStrap
user   thread #61, lwp 61, (zombie)    startfunc: bootStrap
user   thread #64, lwp 64, (zombie)    startfunc: bootStrap
user   thread #65, lwp 65, (zombie)    startfunc: bootStrap
user   thread #70, lwp 70, (zombie)    startfunc: bootStrap
user   thread #66, lwp 66, (zombie)    startfunc: bootStrap
user   thread #68, lwp 68, (zombie)    startfunc: bootStrap
(gdb) c
Continuing.
procfs: couldn't find pid 16946 (kernel thread 68) in procinfo list.
(gdb) det
Detaching from program:
/home/glenn.burkhardt/targets/sol/ata0a/rms/rms.sparc, process 16946


A complete log of the session is attached.

[-- Attachment #2: gdb.log --]
[-- Type: application/octet-stream, Size: 8192 bytes --]


* Re: fail to attach to process on Solaris
  2011-09-21 17:16           ` Burkhardt, Glenn
@ 2011-09-21 17:39             ` Pedro Alves
  2011-09-21 18:38               ` Burkhardt, Glenn
  2011-09-21 18:45               ` Burkhardt, Glenn
  0 siblings, 2 replies; 12+ messages in thread
From: Pedro Alves @ 2011-09-21 17:39 UTC (permalink / raw)
  To: Burkhardt, Glenn; +Cc: gdb

On Wednesday 21 September 2011 18:15:48, Burkhardt, Glenn wrote:

> > It's libthread_db.so that maps a thread to a LWP id, so we 
> > may be missing some state checks and getting back a stale id.
> > Try the "maint info sol-threads" command (I never noticed 
> > this command before), and let's see what state 
> > libthread_db.so thinks the thread is in.  I see that 
> > linux-thread-db.c (the glibc/linux fork of this code) has 
> > extra checks for ignoring threads in some states that the 
> > Solaris code doesn't have.

> So, this time the first thread to fail is #68, and the maint command
> shows the thread as having a 'zombie' LWP:
> 
>  - Sleep func: 0x6ccfa4
> user   thread #67, lwp 67, (active)    startfunc: bootStrap
> user   thread #69, lwp 69, (asleep)    startfunc: bootStrap
>  - Sleep func: 0x6ccfa4
> user   thread #2, lwp 2, (zombie)    startfunc: bootStrap
> user   thread #61, lwp 61, (zombie)    startfunc: bootStrap
> user   thread #64, lwp 64, (zombie)    startfunc: bootStrap
> user   thread #65, lwp 65, (zombie)    startfunc: bootStrap
> user   thread #70, lwp 70, (zombie)    startfunc: bootStrap
> user   thread #66, lwp 66, (zombie)    startfunc: bootStrap
> user   thread #68, lwp 68, (zombie)    startfunc: bootStrap
> (gdb) c
> Continuing.
> procfs: couldn't find pid 16946 (kernel thread 68) in procinfo list.
> (gdb) det
> Detaching from program:
> /home/glenn.burkhardt/targets/sol/ata0a/rms/rms.sparc, process 16946
> 
> 
> A complete log of the session is attached.

It got trimmed where it began to be interesting.  :-(

Okay, the linux code ignores zombie threads, like in the patch
below.  Does that help?  There are a couple more places where it
ignores zombie threads that we may need to bring over as well.
Look for TD_THR_ZOMBIE in linux-thread-db.c.

-- 
Pedro Alves

---
 gdb/sol-thread.c |    3 +++
 1 file changed, 3 insertions(+)

Index: src/gdb/sol-thread.c
===================================================================
--- src.orig/gdb/sol-thread.c	2011-03-01 16:00:06.000000000 +0000
+++ src/gdb/sol-thread.c	2011-09-21 18:34:30.029928904 +0100
@@ -1177,6 +1177,9 @@ sol_find_new_threads_callback (const td_
   if (retval != TD_OK)
     return -1;
 
+  if (ti.ti_state == TD_THR_UNKNOWN || ti.ti_state == TD_THR_ZOMBIE)
+    return 0;			/* A zombie -- ignore.  */
+
   ptid = BUILD_THREAD (ti.ti_tid, PIDGET (inferior_ptid));
   if (!in_thread_list (ptid) || is_exited (ptid))
     add_thread (ptid);
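
The idea behind the patch above can be tried out in isolation with the
following sketch.  It is not GDB code: the mock_ types and functions are
hypothetical stand-ins for the libthread_db thread states and for the
per-thread callback, and it only shows the filtering step (skip threads
whose state is unknown or zombie, keep the rest).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libthread_db thread states.  */
enum mock_state { MOCK_UNKNOWN, MOCK_ACTIVE, MOCK_ASLEEP, MOCK_ZOMBIE };

struct mock_thread_info
{
  int tid;			/* User thread id.  */
  int lwp;			/* LWP id the library reports for it.  */
  enum mock_state state;
};

/* Per-thread check: return 1 if the thread should be added to the
   debugger's thread list, 0 if it should be ignored.  */
static int
mock_should_add_thread (const struct mock_thread_info *ti)
{
  assert (ti != NULL);
  if (ti->state == MOCK_UNKNOWN || ti->state == MOCK_ZOMBIE)
    return 0;			/* A zombie -- ignore.  */
  return 1;
}

/* Walk all N threads in ALL, store the ids of the ones that pass the
   check in OUT (which must have room for N entries), and return how
   many were kept.  */
static size_t
mock_find_live_threads (const struct mock_thread_info *all, size_t n,
			int *out)
{
  size_t i, kept = 0;

  for (i = 0; i < n; i++)
    if (mock_should_add_thread (&all[i]))
      out[kept++] = all[i].tid;
  return kept;
}
```

Fed the states from the "maint info sol-threads" output quoted earlier
in this message (67 active, 69 asleep, the rest zombie), the sketch
keeps only threads 67 and 69, so the debugger never asks for the
registers of an LWP that no longer exists.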


* RE: fail to attach to process on Solaris
  2011-09-21 17:39             ` Pedro Alves
@ 2011-09-21 18:38               ` Burkhardt, Glenn
  2011-09-21 18:45               ` Burkhardt, Glenn
  1 sibling, 0 replies; 12+ messages in thread
From: Burkhardt, Glenn @ 2011-09-21 18:38 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb

[-- Attachment #1: Type: text/plain, Size: 386 bytes --]

> -----Original Message-----
> From: Pedro Alves [mailto:pedro@codesourcery.com] 
> Sent: Wednesday, September 21, 2011 1:39 PM
> To: Burkhardt, Glenn
> Cc: gdb@sourceware.org
> Subject: Re: fail to attach to process on Solaris
>
> It got trimmed where it began to be interesting.  :-(

Sorry about that!  I forgot to exit from the 'script' command.  A full
log is attached.

[-- Attachment #2: gdb.log --]
[-- Type: application/octet-stream, Size: 13621 bytes --]


* RE: fail to attach to process on Solaris
  2011-09-21 17:39             ` Pedro Alves
  2011-09-21 18:38               ` Burkhardt, Glenn
@ 2011-09-21 18:45               ` Burkhardt, Glenn
  1 sibling, 0 replies; 12+ messages in thread
From: Burkhardt, Glenn @ 2011-09-21 18:45 UTC (permalink / raw)
  To: Pedro Alves; +Cc: gdb

> -----Original Message-----
> From: Pedro Alves [mailto:pedro@codesourcery.com] 
> Sent: Wednesday, September 21, 2011 1:39 PM
> To: Burkhardt, Glenn
> Cc: gdb@sourceware.org
> Subject: Re: fail to attach to process on Solaris
> 
> 
> Okay, the linux code ignores zombie threads, like in the 
> patch below.  Does that help?  There are a couple more places 
> where it ignores zombie threads that we may need to bring 
> over as well.
> Look for TD_THR_ZOMBIE in linux-thread-db.c.
> 
> --
> Pedro Alves
> 
> ---
>  gdb/sol-thread.c |    3 +++
>  1 file changed, 3 insertions(+)
> 
> Index: src/gdb/sol-thread.c
> ===================================================================
> --- src.orig/gdb/sol-thread.c	2011-03-01 16:00:06.000000000 +0000
> +++ src/gdb/sol-thread.c	2011-09-21 18:34:30.029928904 +0100
> @@ -1177,6 +1177,9 @@ sol_find_new_threads_callback (const td_
>    if (retval != TD_OK)
>      return -1;
>  
> +  if (ti.ti_state == TD_THR_UNKNOWN || ti.ti_state == TD_THR_ZOMBIE)
> +    return 0;			/* A zombie -- ignore.  */
> +
>    ptid = BUILD_THREAD (ti.ti_tid, PIDGET (inferior_ptid));
>    if (!in_thread_list (ptid) || is_exited (ptid))
>      add_thread (ptid);
> 

The patch makes a big difference.  At least, I can continue the attached
process now.  I'll play with it a bit more, and report back.  I'll also
check for other places where TD_THR_ZOMBIE is used in
linux-thread-db.c.


* RE: fail to attach to process on Solaris
@ 2011-09-22 13:13 Burkhardt, Glenn
  0 siblings, 0 replies; 12+ messages in thread
From: Burkhardt, Glenn @ 2011-09-22 13:13 UTC (permalink / raw)
  To: gdb

I took a look at the other places where changes might be needed, and I
only found one.  But I'm not completely sure about it.  

There are also a couple of places where the return code from
p_td_thr_get_info() is compared against the value TD_NOTHR, but the
documentation doesn't list that as a possible return value.  Probably
TD_BADTH is better.

I suggest these changes in addition to your patch.

*** sol-thread.c.orig   2011-09-21 18:42:12.506406776 -0400
--- sol-thread.c        2011-09-21 18:58:55.600242634 -0400
***************
*** 261,267 ****
      error (_("thread_to_lwp: td_ta_map_id2thr %s"), td_err_string (val));
  
    val = p_td_thr_get_info (&th, &ti);
!   if (val == TD_NOTHR)
      return pid_to_ptid (-1);  /* Thread must have terminated.  */
    else if (val != TD_OK)
      error (_("thread_to_lwp: td_thr_get_info: %s"), td_err_string (val));
--- 261,267 ----
      error (_("thread_to_lwp: td_ta_map_id2thr %s"), td_err_string (val));
  
    val = p_td_thr_get_info (&th, &ti);
!   if (val == TD_BADTH)
      return pid_to_ptid (-1);  /* Thread must have terminated.  */
    else if (val != TD_OK)
      error (_("thread_to_lwp: td_thr_get_info: %s"), td_err_string (val));
***************
*** 310,319 ****
      error (_("lwp_to_thread: td_thr_validate: %s."), td_err_string (val));
  
    val = p_td_thr_get_info (&th, &ti);
!   if (val == TD_NOTHR)
      return pid_to_ptid (-1);  /* Thread must have terminated.  */
    else if (val != TD_OK)
      error (_("lwp_to_thread: td_thr_get_info: %s."), td_err_string (val));
  
    return BUILD_THREAD (ti.ti_tid, PIDGET (lwp));
  }
--- 310,321 ----
      error (_("lwp_to_thread: td_thr_validate: %s."), td_err_string (val));
  
    val = p_td_thr_get_info (&th, &ti);
!   if (val == TD_BADTH)
      return pid_to_ptid (-1);  /* Thread must have terminated.  */
    else if (val != TD_OK)
      error (_("lwp_to_thread: td_thr_get_info: %s."), td_err_string (val));
+   else if (ti.ti_state == TD_THR_ZOMBIE)
+     return pid_to_ptid (-1);    /* Thread has terminated */
  
    return BUILD_THREAD (ti.ti_tid, PIDGET (lwp));
  }


end of thread, other threads:[~2011-09-22 13:13 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-22 15:05 fail to attach to process on Solaris Burkhardt, Glenn
2011-09-02 21:24 ` Pedro Alves
2011-09-20 23:22   ` Burkhardt, Glenn
2011-09-21 14:26     ` Pedro Alves
2011-09-21 16:19       ` Burkhardt, Glenn
2011-09-21 16:46         ` Pedro Alves
2011-09-21 17:16           ` Burkhardt, Glenn
2011-09-21 17:39             ` Pedro Alves
2011-09-21 18:38               ` Burkhardt, Glenn
2011-09-21 18:45               ` Burkhardt, Glenn
2011-09-21 16:28       ` Burkhardt, Glenn
2011-09-22 13:13 Burkhardt, Glenn
