public inbox for gdb-patches@sourceware.org
* Re:Re:GDB often is blocked at async_file_flush
@ 2021-07-05 12:25 周春明(日月)
  2021-07-05 12:38 ` Pedro Alves
  0 siblings, 1 reply; 6+ messages in thread
From: 周春明(日月) @ 2021-07-05 12:25 UTC (permalink / raw)
  To: 周春明(日月),
	Simon Marchi, Gdb-patches, gdb-patches

Hi Simon,
I did more experiments today, and I can basically confirm that GDB gets stuck in that loop because linux_nat_event_pipe[1] is closed while a SIGCHLD arrives.
The two events are asynchronous, so the issue occurs randomly.
But I still don't know exactly why linux_nat_event_pipe[1] is closed unexpectedly. Sometimes the SIGCHLD even arrives at the same moment as line 52 below: target_async (0);

36| void
37| inferior_event_handler (enum inferior_event_type event_type)
38| {
39|  switch (event_type)
40|  {
41|  case INF_REG_EVENT:
42|  fetch_inferior_event ();
43|  break;
44|
45|  case INF_EXEC_COMPLETE:
46|  if (!non_stop) 
47|  {
48|  /* Unregister the inferior from the event loop. This is done
49|   so that when the inferior is not running we don't get
50|   distracted by spurious inferior output. */
51|  if (target_has_execution && target_can_async_p ())
52+> target_async (0);
53|  }

-David
------------------------------------------------------------------
From: 周春明(日月) <riyue.zcm@alibaba-inc.com>
Sent: Monday, July 5, 2021 13:30
To: Simon Marchi <simon.marchi@polymtl.ca>; Gdb-patches <gdb-patches-bounces+riyue.zcm=alibaba-inc.com@sourceware.org>; gdb-patches <gdb-patches@sourceware.org>
Subject: Re:Re:GDB often is blocked at async_file_flush



------------------------------------------------------------------
From: Simon Marchi <simon.marchi@polymtl.ca>
Sent: Monday, July 5, 2021 08:53
To: 周春明(日月) <riyue.zcm@alibaba-inc.com>; Gdb-patches <gdb-patches-bounces+riyue.zcm=alibaba-inc.com@sourceware.org>; gdb-patches <gdb-patches@sourceware.org>
Subject: Re: Re:GDB often is blocked at async_file_flush

On 2021-07-04 8:13 p.m., 周春明(日月) wrote:
> Hi Simon,
> Thanks for the reply.
> And yes, gdb is stuck in this loop:
>   do
>     {
>       ret = read (linux_nat_event_pipe[0], &buf, 1);
>     }
>   while (ret >= 0 || (ret == -1 && errno == EINTR));
> 
> The ret from read is always 0 when the hang happens. With further debugging in the kernel's pipe_read, I found that this situation happens when pipe->writers is NULL.
> Because this is a random issue, I compared it with a normal execution: there, pipe->writers is not NULL and pipe->wait_writers is NULL, so pipe_read returns -EAGAIN and the loop above exits normally.
> So do you know when pipe->writers would be NULL? When the sub-process is suspended?

Hmm, does that mean that the writer end of the pipe would be closed, but
not the read end?  I don't see how that can happen, as they are both
closed as a pair in linux_async_pipe, when enable is 0.

I tried the following test program, and indeed read returns 0:

    #include <unistd.h>
    #include <stdio.h>
    #include <fcntl.h>

    int main ()
    {
      int fds[2];
      pipe(fds);
      fcntl(fds[0], F_SETFL, O_NONBLOCK);
      fcntl(fds[1], F_SETFL, O_NONBLOCK);
      close(fds[1]);

      char c;
      int ret = read (fds[0], &c, 1);
      if (ret < 0)
        perror("read");

      printf("ret = %d\n", ret);
    }

When you have that infinite loop, what is the value of the two elements
of linux_nat_event_pipe?
[David] I checked this. When the infinite loop happens, the two elements of linux_nat_event_pipe are
"*****pipe[1]:12, pipe[0]:11". Do you know of any other case, besides closing pipe[1], in which a read on pipe[0] returns 0?
If you could share a reproducer for how to get to this state, it would
be useful.
[David] The project is a custom one for our ASIC, which isn't public yet. I also tried to narrow it down to a special case that reproduces with stock GDB, but failed.

-David 

Simon



* Re:Re:GDB often is blocked at async_file_flush
@ 2021-07-05  5:30 周春明(日月)
  0 siblings, 0 replies; 6+ messages in thread
From: 周春明(日月) @ 2021-07-05  5:30 UTC (permalink / raw)
  To: Simon Marchi, Gdb-patches, gdb-patches






end of thread, other threads:[~2021-07-05 14:06 UTC | newest]

Thread overview: 6+ messages
2021-07-05 12:25 Re:Re:GDB often is blocked at async_file_flush 周春明(日月)
2021-07-05 12:38 ` Pedro Alves
2021-07-05 13:11   ` Reply: Re:Re:GDB " 周春明(日月)
2021-07-05 13:48     ` Pedro Alves
2021-07-05 14:06       ` Reply: Reply: Re:Re:GDB " 周春明(日月)
  -- strict thread matches above, loose matches on Subject: below --
2021-07-05  5:30 Re:Re:GDB " 周春明(日月)
