public inbox for gdb-patches@sourceware.org
From: Partha Satapathy <partha.satapathy@oracle.com>
To: Pedro Alves <pedro@palves.net>,
	gdb-patches@sourceware.org,
	rajesh.sivaramasubramaniom@oracle.com, bert.barbe@oracle.com,
	blarsen@redhat.com, cupertino.miranda@oracle.com, tom@tromey.com
Subject: Re: [PATCH 1/1 V5] gdb : Signal to pstack/gdb kills the attached process.
Date: Mon, 10 Jun 2024 11:11:38 +0530
Message-ID: <9b8a2ca5-d238-4cf6-9933-92445f693e75@oracle.com>
In-Reply-To: <53f492c7-f144-48ef-8e92-f4894644b88d@oracle.com>

On 6/3/2024 10:51 AM, Partha Satapathy wrote:
> On 5/16/2024 12:45 PM, Partha Satapathy wrote:
>> On 5/13/2024 8:19 PM, Pedro Alves wrote:
>>> Adding another question to the list below.  (I haven't tried to 
>>> reproduce this yet myself, btw.)
>>>
>>> On 2024-05-10 21:19, Pedro Alves wrote:
>>>> Hi.
>>>>
>>>> Just wanted to let you know that I've read all the discussion around 
>>>> this until this
>>>> email I'm replying to, and started thinking about it a bit. 
>>>> Unfortunately this is one of
>>>> those areas in GDB where the right change is rarely immediately 
>>>> obvious (to me).
>>>>
>>>> Some questions:
>>>>
>>>>   - If you ctrl-c to abort the attach, do we really abort the
>>>>     attach properly?  Or do we stay attached in some half broken state?
>>>>
>>>>   - Below you mention pstack, where can we find it?  And you mention
>>>>     that ctrl-c is pressed while that is printing a stack.  I'm 
>>>> assuming
>>>>     that's a backtrace command.  I'm confused in that case, as if 
>>>> that is
>>>>     so, then we should already be past the initial attach.  The 
>>>> question
>>>>     would then become, shouldn't gdb have the terminal at that point?
>>>>     How come it does not?
>>>
>>> #3 - The patch description states:
>>>
>>>   > Problem: While gdb is attaching an inferior, if ctrl-c is pressed 
>>> in the
>>>   > middle of the process attach,  the sigint is passed to the debugged
>>>   > process.  This triggers the exit of the inferior.
>>>
>>> This SIGINT passing is done with "kill(-pgrp, SIGINT)".  How does 
>>> that manage
>>> to trigger the exit of the inferior at all?  ptrace should intercept the
>>> SIGINT before the inferior ever sees it.  Did it not?
>>>
>>> Or could it be that the real issue is that because that sends the SIGINT
>>> to all the processes in the inferior's pgrp, we kill more processes than
>>> the one we're attaching to, and those processes exiting cause the
>>> inferior to exit as well?  If so, then this is orthogonal to the
>>> initial attach,
>>> and can happen after the attach as well.  There is a bug open about this
>>> on bugzilla.
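
The process-group hypothesis above can be checked without GDB. What follows is a rough sketch, assuming a POSIX system; count_children_killed_by_group_sigint and DEMO_NCHILD are hypothetical names used only for illustration, not part of GDB or pstack. The function forks two children into a fresh process group, sends a single SIGINT with kill(-pgid, SIGINT), and counts how many children that one call terminated.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { DEMO_NCHILD = 2 };   /* hypothetical: number of group members */

/* Returns how many children died of the group-wide SIGINT,
   or -1 if the children could not be set up.  */
int
count_children_killed_by_group_sigint (void)
{
  pid_t pids[DEMO_NCHILD];
  pid_t pgid = 0;
  int created;
  int killed = 0;

  /* Children inherit the SIGINT disposition, so force the default
     action before forking and restore the caller's setting after.  */
  void (*old_handler) (int) = signal (SIGINT, SIG_DFL);

  for (created = 0; created < DEMO_NCHILD; created++)
    {
      pid_t pid = fork ();
      if (pid < 0)
        break;
      if (pid == 0)
        for (;;)
          pause ();             /* child: sleep until a signal ends it */
      if (created == 0)
        pgid = pid;             /* first child leads the new group */
      if (setpgid (pid, pgid) != 0)
        {
          kill (pid, SIGKILL);  /* could not place it in the group */
          waitpid (pid, NULL, 0);
          break;
        }
      pids[created] = pid;
    }

  if (old_handler != SIG_ERR)
    signal (SIGINT, old_handler);

  /* A negative pid addresses every process in the group.  On a setup
     failure, clean up with SIGKILL instead.  */
  if (created > 0)
    kill (-pgid, created == DEMO_NCHILD ? SIGINT : SIGKILL);

  for (int i = 0; i < created; i++)
    {
      int status;
      if (waitpid (pids[i], &status, 0) == pids[i]
          && WIFSIGNALED (status) && WTERMSIG (status) == SIGINT)
        killed++;
    }

  return created == DEMO_NCHILD ? killed : -1;
}
```

Neither child is ptrace-attached here, so this only shows that one kill(-pgid, SIGINT) reaches every member of the group. It does not show what happens to a traced inferior.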
>>>
>>> Pedro Alves
>>>
>>>>
>>>> I'm wondering whether Baris's patch to eliminate the inferior
>>>> continuations would help with this, as it probably makes the attaching
>>>> sequence synchronous.  I should probably look at that one.
>>>>
>>>> Pedro Alves
>>>>
>>
>>
>> Thanks Pedro and Tom for reviewing the problem.
>>
>>
>> Problem:
>> pstack dumps the stack of all threads in a process. In some cases
>> printing the stacks can take significant time, and ctrl-c is pressed to
>> abort the pstack/gdb application. This in turn kills the debugged
>> process, which can be critical for the system. In this case the
>> intention of "ctrl+c" is to kill pstack/gdb, not the target application.
>>
>>
>> # tail pstack -n 12
>>
>> # Run GDB, strip out unwanted noise.
>> # --readnever is no longer used since .gdb_index is now in use.
>> $GDB --quiet -nx $GDBARGS /proc/$1/exe $1 <<EOF 2>&1 |
>> set width 0
>> set height 0
>> set pagination no
>> $backtrace
>> EOF
>> /bin/sed -n \
>>      -e 's/^\((gdb) \)*//' \
>>      -e '/^#/p' \
>>      -e '/^Thread/p'
>>
>>
>> This is the interesting part of pstack; the rest is cosmetic.
>>
>> pstack usage:
>> # pstack 1
>>
>> #0  0x00007fa18cf44017 in epoll_wait () from /lib64/libc.so.6
>> #1  0x00007fa18e67e036 in sd_event_wait () from /usr/lib/systemd/libsystemd-shared-239.so
>> #2  0x00007fa18e67f33b in sd_event_run () from /usr/lib/systemd/libsystemd-shared-239.so
>> #3  0x000055c155da8c22 in manager_loop ()
>> #4  0x000055c155d5f133 in main ()
>>
>> Reproduction:
>>
>> gdb is generally attached to the debugged process with either:
>> gdb -p <<pid>>
>> or
>> gdb /proc/<<pid>>/exe <<pid>>
>> pstack uses the latter method. If the application is large or reading
>> symbols is slow, there is a good window in which to press ctrl+c during
>> the attach. Spawning "gdb" under "strace -k" makes gdb a lot slower and
>> gives a larger window to press ctrl+c at the right moment, i.e. during
>> the attach of the debugged process. This strace hack increases the
>> reproduction rate of the issue.
>>
>> Testcase:
>>
>> With GDB 13.1
>> ps aux | grep abrtd
>> root     2195168   /usr/sbin/abrtd -d -s
>>
>> #strace -k -o log gdb -p 2195168
>> Attaching to process 2195168
>> [New LWP 2195177]
>> [New LWP 2195179]
>> ^C[Thread debugging using libthread_db enabled]
>> <<<< Note: ctrl+c is pressed after the attach is initiated, while gdb is
>> still reading symbols from the libraries >>>>
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> 0x00007fe3ed6d70d1 in poll () from /lib64/libc.so.6
>> (gdb) q
>> A debugging session is active.
>>             Inferior 1 [process 2195168] will be detached
>> Quit anyway? (y or n) y
>> Detaching from program: /usr/sbin/abrtd, process 2195168
>>
>> # ps aux | grep 2195168
>> <<<< Process exited >>>>
>>
>> There is only a very narrow window in which to press ctrl+c.
>> Session1 :
>>
>> ]$ ps aux | grep abrtd
>> root        1329  0.0  0.0 602624 13076 ?        Ssl  May03   0:00 /usr/sbin/abrtd -d -s
>>
>> Session2:
>>
>> # ./tpstack 1329
>>
>> + strace -o omlog -k ./gdb --quiet -nx -ex 'set width 0' -ex 'set 
>> height 0' -ex 'set pagination no' -ex 'set confirm off' -ex 'thread 
>> apply all bt' -ex quit /proc/1329/exe 1329
>> Reading symbols from /proc/1329/exe...
>> Python Exception <class 'AttributeError'>: module 'gdb' has no 
>> attribute '_handle_missing_debuginfo'
>> Reading symbols from .gnu_debugdata for /usr/sbin/abrtd...
>> (No debugging symbols found in .gnu_debugdata for /usr/sbin/abrtd)
>> Attaching to program: /proc/1329/exe, process 1329
>> [New LWP 1399]
>> [New LWP 1349] ^C
>>
>> Session1:
>> [opc@pssatapa-ol8 TEST]$ ps aux | grep abrtd
>> <<< 1329 is killed >>>
>>
>> This is a very small window, so a heavy application is good for
>> reproduction. I modified the last part of pstack as follows:
>> # Run GDB, strip out unwanted noise.
>> # --readnever is no longer used since .gdb_index is now in use.
>> strace -o omlog -k  ./gdb  --quiet -nx  -ex 'set width 0' -ex 'set 
>> height 0' -ex 'set pagination no' -ex 'set confirm off' -ex 'thread 
>> apply all bt' -ex quit  /proc/$1/exe $1
>>
>> Running gdb under strace with -k makes gdb slow, which gives us a window
>> to press Ctrl+C. Otherwise the window is too small to time the signal.
>> We observe the problem when the file system, the kernel, or procfs is slow.
>>
>> The signal is not intended for the inferior.
>> The signal is passed from "gdb" to the inferior.
>>
>> The SIGINT handler in gdb sets the quit flag, and
>> in some paths we check the quit flag and pass the signal to the inferior.
>> That is what kills the inferior.
>>
>> On:
>> +  check_quit_flag();
>> This should be done only when inf->attach_flag is true.
>> I will add the check in the next iteration.
>> The idea here is to clear any pending quit flag set by SIGINT;
>> otherwise, after we set the sync_flag, a later check of the quit flag
>> can kill the inferior.
>>
>> Thanks
>> Partha
> 
> Hi Pedro,
> 
> I understand the last reply was a bit long in explaining the
> context. Please find the answers to your questions below.
> 
> Here are the questions:
>  >  - If you ctrl-c to abort the attach, do we really abort the
>  >    attach properly?  Or do we stay attached in some half broken state?
>  >
> Let's take the case of "gdb -p pid" with ctrl+c pressed immediately.
> The intention is to kill gdb, not to abort the attach or
> kill the attached pid. In the problem case the debugged process is killed.
> 
>  >  - Below you mention pstack, where can we find it?  And you mention
>  >    that ctrl-c is pressed while that is printing a stack.  I'm assuming
>  >    that's a backtrace command.  I'm confused in that case, as if that is
>  >    so, then we should already be past the initial attach.  The question
>  >    would then become, shouldn't gdb have the terminal at that point?
>  >    How come it does not?
> 
> pstack is a wrapper over gdb and is run as:
> gdb --quiet -nx -ex 'set width 0' -ex 'set height 0'
> -ex 'set pagination no' -ex 'set confirm off' -ex 'thread apply all bt'
> -ex quit /proc/<<pid>>/exe <<pid>>
> 
> Here gdb is run with "-ex", so it does not need the gdb prompt to
> issue the backtrace command.
> 
> The Ctrl+C is issued in a window between initial attach and 
> target_post_attach.
> 
>  >This SIGINT passing is done with "kill(-pgrp, SIGINT)".  How does that 
>  >manage to trigger the exit of the inferior at all?  ptrace should 
>  >intercept the SIGINT before the inferior ever sees it.  Did it not?
> 
> /* Handle a SIGINT.  */
> void
> handle_sigint (int sig)
> {
>    signal (sig, handle_sigint);
> 
>    /* We could be running in a loop reading in symfiles or something so
>       it may be quite a while before we get back to the event loop.  So
>       set quit_flag to 1 here.  Then if QUIT is called before we get to
>       the event loop, we will unwind as expected.  */
>    set_quit_flag ();
> 
> In the problem case, gdb is reading the symbol files when it receives
> the SIGINT. We set the quit flag in this context.
> 
> In the event loop we check the quit flag (check_quit_flag) and pass the 
> signal to the inferior with "child_pass_ctrlc". This kills the target.
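
The deferred handling described above can be sketched in isolation. This is a minimal illustration under stated assumptions, not GDB's code: demo_quit_flag, demo_handle_sigint, demo_install_sigint_handler and demo_check_quit_flag are hypothetical stand-ins for the set_quit_flag/check_quit_flag pair mentioned in the thread. The handler only records that SIGINT arrived. Whoever polls the flag later decides what to do with it, which is why a Ctrl+C pressed during symbol reading can be acted on well after the key press.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for a "quit requested" flag.  */
static volatile sig_atomic_t demo_quit_flag = 0;

/* Signal handler: do nothing except remember that SIGINT arrived.  */
static void
demo_handle_sigint (int sig)
{
  (void) sig;
  demo_quit_flag = 1;
}

/* Install the handler.  Returns 0 on success, -1 on failure.  */
int
demo_install_sigint_handler (void)
{
  struct sigaction sa;

  sa.sa_handler = demo_handle_sigint;
  sa.sa_flags = 0;
  sigemptyset (&sa.sa_mask);
  return sigaction (SIGINT, &sa, NULL);
}

/* Return 1 if a SIGINT was pending and clear the flag, else return 0.
   SIGINT is blocked around the read-and-clear so that a signal
   arriving in between is not lost.  */
int
demo_check_quit_flag (void)
{
  sigset_t block, saved;
  int was_set;

  sigemptyset (&block);
  sigaddset (&block, SIGINT);
  sigprocmask (SIG_BLOCK, &block, &saved);
  was_set = demo_quit_flag;
  demo_quit_flag = 0;
  sigprocmask (SIG_SETMASK, &saved, NULL);
  return was_set;
}
```

Calling demo_check_quit_flag () once before entering a sensitive phase discards a stale Ctrl+C. That is the effect the extra check_quit_flag () call in the patch aims for, assuming it clears the flag in the same way.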
> 
> It's fine when gdb shows the "GDB" prompt and owns the terminal: once
> we press Ctrl+C, it is passed to the target. But with the "-ex"
> option, the user never sees the GDB prompt, and the signal should only
> kill gdb, not the attached process.
> 
> The best way to visualize the issue is to run:
> gdb --quiet -nx -ex 'set width 0' -ex 'set height 0' -ex 'set pagination 
> no' -ex 'set confirm off' -ex 'thread apply all  bt' -ex quit /proc/1/exe 1
> 
> and send a SIGINT (Ctrl+C) while gdb is reading the symbol files.
> 
> Thanks
> Partha

Hi GDB team,

Could you please provide an update on this?

Thanks
Partha
