public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] gdb: Fix instability in thread groups test
@ 2018-08-10  9:58 Andrew Burgess
  2018-08-10 21:26 ` Simon Marchi
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Burgess @ 2018-08-10  9:58 UTC (permalink / raw)
  To: gdb-patches; +Cc: Andrew Burgess

In the test script gdb.mi/list-thread-groups-available.exp we ask GDB
to list all thread groups, and match the output against a
regexp. Occasionally, I would see this test fail.

The expected output is a list of entries, each entry looking roughly
like this:

  {id="<DECIMAL>",type="process",description="<STRING>",
   user="<STRING>",cores=["<DECIMAL>","<DECIMAL>",...]}

All the fields after 'id' and 'type' are optional, and the 'cores'
list can contain 1 or more "<DECIMAL>" entries.

On my machine (Running Fedora 27, kernel 4.17.3-100.fc27.x86_64)
usually the 'description' is a non-empty string, and the 'cores' list
has at least one entry in it.  But sometimes, very rarely, I'll see an
entry in the process group list where the 'description' is an empty
string, the 'user' is the string "?", and the 'cores' list is empty.
Such an entry looks like this:

   {id="19863",type="process",description="",user="?",cores=[]}

I believe that this is caused by the process exiting while GDB is
scanning /proc for process information.  The current code in
gdb/nat/linux-osdata.c is not (I think) resilient against exiting
processes.

This commit adjusts the regex that matches the 'cores' list so that an
empty list is acceptable, with this patch in place the test script
gdb.mi/list-thread-groups-available.exp never fails for me now.

gdb/testsuite/ChangeLog:

	* gdb.mi/list-thread-groups-available.exp: Update test regexp.
---
 gdb/testsuite/ChangeLog                               | 4 ++++
 gdb/testsuite/gdb.mi/list-thread-groups-available.exp | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gdb/testsuite/gdb.mi/list-thread-groups-available.exp b/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
index c4dab2a2c34..88f9ee9b63d 100644
--- a/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
+++ b/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
@@ -45,7 +45,7 @@ set id_re "id=\"$decimal\""
 set type_re "type=\"process\""
 set description_re "description=\"$string_re\""
 set user_re "user=\"$string_re\""
-set cores_re "cores=\\\[\"$decimal\"(,\"$decimal\")*\\\]"
+set cores_re "cores=\\\[(\"$decimal\"(,\"$decimal\")*)?\\\]"
 
 # List all available processes.
 set process_entry_re "{${id_re},${type_re}(,$description_re)?(,$user_re)?(,$cores_re)?}"
-- 
2.14.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gdb: Fix instability in thread groups test
  2018-08-10  9:58 [PATCH] gdb: Fix instability in thread groups test Andrew Burgess
@ 2018-08-10 21:26 ` Simon Marchi
  2018-08-13  9:51   ` Pedro Alves
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Marchi @ 2018-08-10 21:26 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: gdb-patches

On 2018-08-10 05:57, Andrew Burgess wrote:
> In the test script gdb.mi/list-thread-groups-available.exp we ask GDB
> to list all thread groups, and match the output against a
> regexp. Occasionally, I would see this test fail.
> 
> The expected output is a list of entries, each entry looking roughly
> like this:
> 
>   {id="<DECIMAL>",type="process",description="<STRING>",
>    user="<STRING>",cores=["<DECIMAL>","<DECIMAL>",...]}
> 
> All the fields after 'id' and 'type' are optional, and the 'cores'
> list can contain 1 or more "<DECIMAL>" entries.
> 
> On my machine (Running Fedora 27, kernel 4.17.3-100.fc27.x86_64)
> usually the 'description' is a non-empty string, and the 'cores' list
> has at least one entry in it.  But sometimes, very rarely, I'll see an
> entry in the process group list where the 'description' is an empty
> string, the 'user' is the string "?", and the 'cores' list is empty.
> Such an entry looks like this:
> 
>    {id="19863",type="process",description="",user="?",cores=[]}
> 
> I believe that this is caused by the process exiting while GDB is
> scanning /proc for process information.  The current code in
> gdb/nat/linux-osdata.c is not (I think) resilient against exiting
> processes.
> 
> This commit adjusts the regex that matches the 'cores' list so that an
> empty list is acceptable, with this patch in place the test script
> gdb.mi/list-thread-groups-available.exp never fails for me now.
> 
> gdb/testsuite/ChangeLog:
> 
> 	* gdb.mi/list-thread-groups-available.exp: Update test regexp.
> ---
>  gdb/testsuite/ChangeLog                               | 4 ++++
>  gdb/testsuite/gdb.mi/list-thread-groups-available.exp | 2 +-
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
> b/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
> index c4dab2a2c34..88f9ee9b63d 100644
> --- a/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
> +++ b/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
> @@ -45,7 +45,7 @@ set id_re "id=\"$decimal\""
>  set type_re "type=\"process\""
>  set description_re "description=\"$string_re\""
>  set user_re "user=\"$string_re\""
> -set cores_re "cores=\\\[\"$decimal\"(,\"$decimal\")*\\\]"
> +set cores_re "cores=\\\[(\"$decimal\"(,\"$decimal\")*)?\\\]"
> 
>  # List all available processes.
>  set process_entry_re
> "{${id_re},${type_re}(,$description_re)?(,$user_re)?(,$cores_re)?}"

Hi Andrew,

The patch LGTM.  I manually reproduced this case by spawning a process 
(tail -f /dev/null) and noting its pid.  In linux_xfer_osdata_processes, 
I added:

   if (pid == <pid>)
     sleep (5);

and killing the process during that sleep.

Simon

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gdb: Fix instability in thread groups test
  2018-08-10 21:26 ` Simon Marchi
@ 2018-08-13  9:51   ` Pedro Alves
  2018-08-13 11:41     ` Andrew Burgess
  0 siblings, 1 reply; 9+ messages in thread
From: Pedro Alves @ 2018-08-13  9:51 UTC (permalink / raw)
  To: Simon Marchi, Andrew Burgess; +Cc: gdb-patches

On 08/10/2018 10:26 PM, Simon Marchi wrote:
> On 2018-08-10 05:57, Andrew Burgess wrote:
>> In the test script gdb.mi/list-thread-groups-available.exp we ask GDB
>> to list all thread groups, and match the output against a
>> regexp. Occasionally, I would see this test fail.
>>
>> The expected output is a list of entries, each entry looking roughly
>> like this:
>>
>>   {id="<DECIMAL>",type="process",description="<STRING>",
>>    user="<STRING>",cores=["<DECIMAL>","<DECIMAL>",...]}
>>
>> All the fields after 'id' and 'type' are optional, and the 'cores'
>> list can contain 1 or more "<DECIMAL>" entries.
>>
>> On my machine (Running Fedora 27, kernel 4.17.3-100.fc27.x86_64)
>> usually the 'description' is a non-empty string, and the 'cores' list
>> has at least one entry in it.  But sometimes, very rarely, I'll see an
>> entry in the process group list where the 'description' is an empty
>> string, the 'user' is the string "?", and the 'cores' list is empty.
>> Such an entry looks like this:
>>
>>    {id="19863",type="process",description="",user="?",cores=[]}
>>
>> I believe that this is caused by the process exiting while GDB is
>> scanning /proc for process information.  The current code in
>> gdb/nat/linux-osdata.c is not (I think) resilient against exiting
>> processes.
>>
>> This commit adjusts the regex that matches the 'cores' list so that an
>> empty list is acceptable, with this patch in place the test script
>> gdb.mi/list-thread-groups-available.exp never fails for me now.
>>
>> gdb/testsuite/ChangeLog:
>>
>>     * gdb.mi/list-thread-groups-available.exp: Update test regexp.
>> ---
>>  gdb/testsuite/ChangeLog                               | 4 ++++
>>  gdb/testsuite/gdb.mi/list-thread-groups-available.exp | 2 +-
>>  2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
>> b/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
>> index c4dab2a2c34..88f9ee9b63d 100644
>> --- a/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
>> +++ b/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
>> @@ -45,7 +45,7 @@ set id_re "id=\"$decimal\""
>>  set type_re "type=\"process\""
>>  set description_re "description=\"$string_re\""
>>  set user_re "user=\"$string_re\""
>> -set cores_re "cores=\\\[\"$decimal\"(,\"$decimal\")*\\\]"
>> +set cores_re "cores=\\\[(\"$decimal\"(,\"$decimal\")*)?\\\]"
>>
>>  # List all available processes.
>>  set process_entry_re
>> "{${id_re},${type_re}(,$description_re)?(,$user_re)?(,$cores_re)?}"
> 
> Hi Andrew,
> 
> The patch LGTM.  I manually reproduced this case by spawning a process (tail -f /dev/null) and noting its pid.  In linux_xfer_osdata_processes, I added:
> 
>   if (pid == <pid>)
>     sleep (5);
> 
> and killing the process during that sleep.

But shouldn't we make GDB handle this better?  Make the output
more "atomic" in the sense that we either show a valid complete
entry, or no entry?  There's an inherent race
here, since we use multiple /proc accesses to fill up a process
entry.  If we start fetching process info for a process, and the process
disappears midway, I'd think it better to discard that process's entry,
as-if we had not even seen it, i.e., as if we had listed the set of
processes a tiny moment later.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gdb: Fix instability in thread groups test
  2018-08-13  9:51   ` Pedro Alves
@ 2018-08-13 11:41     ` Andrew Burgess
  2018-08-13 12:03       ` Pedro Alves
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Burgess @ 2018-08-13 11:41 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Simon Marchi, gdb-patches

* Pedro Alves <palves@redhat.com> [2018-08-13 10:51:44 +0100]:

> On 08/10/2018 10:26 PM, Simon Marchi wrote:
> > On 2018-08-10 05:57, Andrew Burgess wrote:
> >> In the test script gdb.mi/list-thread-groups-available.exp we ask GDB
> >> to list all thread groups, and match the output against a
> >> regexp. Occasionally, I would see this test fail.
> >>
> >> The expected output is a list of entries, each entry looking roughly
> >> like this:
> >>
> >>   {id="<DECIMAL>",type="process",description="<STRING>",
> >>    user="<STRING>",cores=["<DECIMAL>","<DECIMAL>",...]}
> >>
> >> All the fields after 'id' and 'type' are optional, and the 'cores'
> >> list can contain 1 or more "<DECIMAL>" entries.
> >>
> >> On my machine (Running Fedora 27, kernel 4.17.3-100.fc27.x86_64)
> >> usually the 'description' is a non-empty string, and the 'cores' list
> >> has at least one entry in it.  But sometimes, very rarely, I'll see an
> >> entry in the process group list where the 'description' is an empty
> >> string, the 'user' is the string "?", and the 'cores' list is empty.
> >> Such an entry looks like this:
> >>
> >>    {id="19863",type="process",description="",user="?",cores=[]}
> >>
> >> I believe that this is caused by the process exiting while GDB is
> >> scanning /proc for process information.  The current code in
> >> gdb/nat/linux-osdata.c is not (I think) resilient against exiting
> >> processes.
> >>
> >> This commit adjusts the regex that matches the 'cores' list so that an
> >> empty list is acceptable, with this patch in place the test script
> >> gdb.mi/list-thread-groups-available.exp never fails for me now.
> >>
> >> gdb/testsuite/ChangeLog:
> >>
> >>     * gdb.mi/list-thread-groups-available.exp: Update test regexp.
> >> ---
> >>  gdb/testsuite/ChangeLog                               | 4 ++++
> >>  gdb/testsuite/gdb.mi/list-thread-groups-available.exp | 2 +-
> >>  2 files changed, 5 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
> >> b/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
> >> index c4dab2a2c34..88f9ee9b63d 100644
> >> --- a/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
> >> +++ b/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
> >> @@ -45,7 +45,7 @@ set id_re "id=\"$decimal\""
> >>  set type_re "type=\"process\""
> >>  set description_re "description=\"$string_re\""
> >>  set user_re "user=\"$string_re\""
> >> -set cores_re "cores=\\\[\"$decimal\"(,\"$decimal\")*\\\]"
> >> +set cores_re "cores=\\\[(\"$decimal\"(,\"$decimal\")*)?\\\]"
> >>
> >>  # List all available processes.
> >>  set process_entry_re
> >> "{${id_re},${type_re}(,$description_re)?(,$user_re)?(,$cores_re)?}"
> > 
> > Hi Andrew,
> > 
> > The patch LGTM.  I manually reproduced this case by spawning a process (tail -f /dev/null) and noting its pid.  In linux_xfer_osdata_processes, I added:
> > 
> >   if (pid == <pid>)
> >     sleep (5);
> > 
> > and killing the process during that sleep.
> 
> But shouldn't we make GDB handle this better?  Make the output
> more "atomic" in the sense that we either show a valid complete
> entry, or no entry?  There's an inherent race
> here, since we use multiple /proc accesses to fill up a process
> entry.  If we start fetching process info for a process, and the process
> disappears midway, I'd think it better to discard that process's entry,
> as-if we had not even seen it, i.e., as if we had listed the set of
> processes a tiny moment later.

I agree.

We also need to think about process reuse.  So with multiple accesses
to /proc we might start with one process, and end up with a completely
new process.

I might be overthinking it, but my first guess at a reliable strategy
would be:

  1. Find each /proc/PID directory.
  2. Read /proc/PID/stat and extract the start time.  Failure to read
     this causes the process to be abandoned.
  3. Read all of the other /proc/PID/XXX files as needed.  Any failure
     results in the process being abandoned.
  4. Reread /proc/PID/stat and confirm the start time hasn't changed,
     this would indicate a new process having slipped in.

Given the system is still running, we can never be sure that we have
"all" processes, so throwing out anything that looks wrong seems like
the right strategy.

Also in step #4 we know we've just missed a process - something new
has started, but we ignore it.  I think this is fine though given the
racy nature of this sort of thing...

The only question is, could these thoughts be dropped into a bug
report, and the original patch to remove the unstable result applied?
Or maybe the test updated to either PASS or KFAIL?

Thanks,
Andrew

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gdb: Fix instability in thread groups test
  2018-08-13 11:41     ` Andrew Burgess
@ 2018-08-13 12:03       ` Pedro Alves
  2018-08-13 13:01         ` Andrew Burgess
  0 siblings, 1 reply; 9+ messages in thread
From: Pedro Alves @ 2018-08-13 12:03 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: Simon Marchi, gdb-patches

On 08/13/2018 12:41 PM, Andrew Burgess wrote:
> * Pedro Alves <palves@redhat.com> [2018-08-13 10:51:44 +0100]:
> 
>> But shouldn't we make GDB handle this better?  Make the output
>> more "atomic" in the sense that we either show a valid complete
>> entry, or no entry?  There's an inherent race
>> here, since we use multiple /proc accesses to fill up a process
>> entry.  If we start fetching process info for a process, and the process
>> disappears midway, I'd think it better to discard that process's entry,
>> as-if we had not even seen it, i.e., as if we had listed the set of
>> processes a tiny moment later.
> 
> I agree.
> 
> We also need to think about process reuse.  So with multiple accesses
> to /proc we might start with one process, and end up with a completely
> new process.
> 
> I might be overthinking it, but my first guess at a reliable strategy
> would be:
> 
>   1. Find each /proc/PID directory.
>   2. Read /proc/PID/stat and extract the start time.  Failure to read
>      this causes the process to be abandoned.
>   3. Read all of the other /proc/PID/XXX files as needed.  Any failure
>      results in the process being abandoned.
>   4. Reread /proc/PID/stat and confirm the start time hasn't changed,
>      this would indicate a new process having slipped in.
> 

My initial quick thought was just to drop the process entry if
it turns out we end up with an empty core set.  

I wonder whether we can prevent PID reuse by keeping a descriptor
for /proc/PID/ open while we open the other files.  Probably not.
Otherwise, your scheme sounds like the next best.

> Given the system is still running, we can never be sure that we have
> "all" processes, so throwing out anything that looks wrong seems like
> the right strategy.
> 
> Also in step #4 we know we've just missed a process - something new
> has started, but we ignore it.  I think this is fine though given the
> racy nature of this sort of thing...
> 
> The only question is, could these thoughts be dropped into a bug
> report, 


Sure.


> and the original patch to remove the unstable result applied?
> Or maybe the test updated to either PASS or KFAIL?

I'd prefer the KFAIL option.  At the very least, a comment in
the .exp file.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gdb: Fix instability in thread groups test
  2018-08-13 12:03       ` Pedro Alves
@ 2018-08-13 13:01         ` Andrew Burgess
  2018-08-13 13:38           ` Pedro Alves
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Burgess @ 2018-08-13 13:01 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Simon Marchi, gdb-patches

* Pedro Alves <palves@redhat.com> [2018-08-13 13:03:47 +0100]:

> On 08/13/2018 12:41 PM, Andrew Burgess wrote:
> > * Pedro Alves <palves@redhat.com> [2018-08-13 10:51:44 +0100]:
> > 
> >> But shouldn't we make GDB handle this better?  Make the output
> >> more "atomic" in the sense that we either show a valid complete
> >> entry, or no entry?  There's an inherent race
> >> here, since we use multiple /proc accesses to fill up a process
> >> entry.  If we start fetching process info for a process, and the process
> >> disappears midway, I'd think it better to discard that process's entry,
> >> as-if we had not even seen it, i.e., as if we had listed the set of
> >> processes a tiny moment later.
> > 
> > I agree.
> > 
> > We also need to think about process reuse.  So with multiple accesses
> > to /proc we might start with one process, and end up with a completely
> > new process.
> > 
> > I might be overthinking it, but my first guess at a reliable strategy
> > would be:
> > 
> >   1. Find each /proc/PID directory.
> >   2. Read /proc/PID/stat and extract the start time.  Failure to read
> >      this causes the process to be abandoned.
> >   3. Read all of the other /proc/PID/XXX files as needed.  Any failure
> >      results in the process being abandoned.
> >   4. Reread /proc/PID/stat and confirm the start time hasn't changed,
> >      this would indicate a new process having slipped in.
> > 
> 
> My initial quick thought was just to drop the process entry if
> it turns out we end up with an empty core set.  
> 
> I wonder whether we can prevent PID reuse by keeping a descriptor
> for /proc/PID/ open while we open the other files.  Probably not.

That was my first though, I tried:

  - chdir /proc/PID
  - opendir for /proc/PID

  - Kill /proc/PID

  - Read from the opendir handle, find nothing there.

Which didn't really surprise me, but was worth a try...

> Otherwise, your scheme sounds like the next best.
> 
> > Given the system is still running, we can never be sure that we have
> > "all" processes, so throwing out anything that looks wrong seems like
> > the right strategy.
> > 
> > Also in step #4 we know we've just missed a process - something new
> > has started, but we ignore it.  I think this is fine though given the
> > racy nature of this sort of thing...
> > 
> > The only question is, could these thoughts be dropped into a bug
> > report, 
> 
> 
> Sure.
> 
> 
> > and the original patch to remove the unstable result applied?
> > Or maybe the test updated to either PASS or KFAIL?
> 
> I'd prefer the KFAIL option.  At the very least, a comment in
> the .exp file.

I'll put something together...

Thanks,
Andrew

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] gdb: Fix instability in thread groups test
  2018-08-13 13:01         ` Andrew Burgess
@ 2018-08-13 13:38           ` Pedro Alves
  2018-08-13 21:45             ` [PATCHv2] " Andrew Burgess
  0 siblings, 1 reply; 9+ messages in thread
From: Pedro Alves @ 2018-08-13 13:38 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: Simon Marchi, gdb-patches

On 08/13/2018 02:01 PM, Andrew Burgess wrote:
> * Pedro Alves <palves@redhat.com> [2018-08-13 13:03:47 +0100]:

>> I wonder whether we can prevent PID reuse by keeping a descriptor
>> for /proc/PID/ open while we open the other files.  Probably not.
> 
> That was my first though, I tried:
> 
>   - chdir /proc/PID
>   - opendir for /proc/PID
> 
>   - Kill /proc/PID
> 
>   - Read from the opendir handle, find nothing there.
> 
> Which didn't really surprise me, but was worth a try...

Does it return "nothing else" even if you don't kill
the process?  Or does returning nothing indicate the
process is gone already?

Regardless, I don't think that proves keeping the opendir dir handle
open (or some other file under /proc/PID) does not prevent the kernel
from reusing the PID until the handle is closed, even though I
do suspect it does not.

But thinking a bit more, maybe it's useless to try to detect PID reuse,
because the process we're collecting info for can just as well exec,
which makes the info we had collected so far become invalid in pretty
much the same way...

>>> and the original patch to remove the unstable result applied?
>>> Or maybe the test updated to either PASS or KFAIL?
>>
>> I'd prefer the KFAIL option.  At the very least, a comment in
>> the .exp file.
> 
> I'll put something together...

Maybe it's not worth the bother.  After thinking about it some more,
I'll be happy with a comment in the .exp file.

Pedro Alves

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCHv2] gdb: Fix instability in thread groups test
  2018-08-13 13:38           ` Pedro Alves
@ 2018-08-13 21:45             ` Andrew Burgess
  2018-08-14 11:37               ` Pedro Alves
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Burgess @ 2018-08-13 21:45 UTC (permalink / raw)
  To: Pedro Alves; +Cc: Simon Marchi, gdb-patches

Here's an updated version of the patch based on previous feedback.
The new version has additional comments explaining why the regexp
allows an empty core list.  I also only allow the empty core list
when we scan all processes, its only in this case we might hit an
exiting process.  When we scan two known PIDs, I don't allow an empty
core list.

---

gdb: Fix instability in thread groups test

In the test script gdb.mi/list-thread-groups-available.exp we ask GDB
to list all thread groups, and match the output against a
regexp. Occasionally, I would see this test fail.

The expected output is a list of entries, each entry looking roughly
like this:

  {id="<DECIMAL>",type="process",description="<STRING>",
   user="<STRING>",cores=["<DECIMAL>","<DECIMAL>",...]}

All the fields after 'id' and 'type' are optional, and the 'cores'
list can contain 1 or more "<DECIMAL>" entries.

On my machine (Running Fedora 27, kernel 4.17.3-100.fc27.x86_64)
usually the 'description' is a non-empty string, and the 'cores' list
has at least one entry in it.  But sometimes, very rarely, I'll see an
entry in the process group list where the 'description' is an empty
string, the 'user' is the string "?", and the 'cores' list is empty.
Such an entry looks like this:

   {id="19863",type="process",description="",user="?",cores=[]}

I believe that this is caused by the process exiting while GDB is
scanning /proc for process information.  The current code in
gdb/nat/linux-osdata.c is not (I think) resilient against exiting
processes.

This commit adjusts the regex that matches the 'cores' list so that an
empty list is acceptable, with this patch in place the test script
gdb.mi/list-thread-groups-available.exp never fails for me now.

I've only adjusted the cores regexp for the occasion when we have GDB
read information about all processes, its only in this case that we
might encounter an exiting process.  When we read information about
two known PIDs, that we know will not exit for the duration of the
test, we require that the core list be non-empty.

gdb/testsuite/ChangeLog:

	* gdb.mi/list-thread-groups-available.exp: Update test regexp.
---
 gdb/testsuite/ChangeLog                               |  4 ++++
 gdb/testsuite/gdb.mi/list-thread-groups-available.exp | 12 +++++++++++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/gdb/testsuite/gdb.mi/list-thread-groups-available.exp b/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
index c4dab2a2c34..11eddd577a3 100644
--- a/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
+++ b/gdb/testsuite/gdb.mi/list-thread-groups-available.exp
@@ -45,7 +45,11 @@ set id_re "id=\"$decimal\""
 set type_re "type=\"process\""
 set description_re "description=\"$string_re\""
 set user_re "user=\"$string_re\""
-set cores_re "cores=\\\[\"$decimal\"(,\"$decimal\")*\\\]"
+
+# The CORES_RE regexp allows a process to be running on zero or more
+# cores.  This can happen if a process exits while GDB is reading
+# information out of /proc.
+set cores_re "cores=\\\[(\"$decimal\"(,\"$decimal\")*)?\\\]"
 
 # List all available processes.
 set process_entry_re "{${id_re},${type_re}(,$description_re)?(,$user_re)?(,$cores_re)?}"
@@ -64,6 +68,12 @@ set spawn_id_2 [remote_spawn target $binfile]
 set pid_2 [spawn_id_get_pid $spawn_id_2]
 set id_re_2 "id=\"$pid_2\""
 
+# Unlike the earlier CORES_RE this list must contain at least one
+# core.  Given that we know these processes will not exit while GDB is
+# reading their information from /proc we can expect at least one core
+# for each process.
+set cores_re "cores=\\\[\"$decimal\"(,\"$decimal\")*\\\]"
+
 set process_entry_re_1 "{${id_re_1},${type_re}(,$description_re)?(,$user_re)?(,$cores_re)?}"
 set process_entry_re_2 "{${id_re_2},${type_re}(,$description_re)?(,$user_re)?(,$cores_re)?}"
 
-- 
2.14.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCHv2] gdb: Fix instability in thread groups test
  2018-08-13 21:45             ` [PATCHv2] " Andrew Burgess
@ 2018-08-14 11:37               ` Pedro Alves
  0 siblings, 0 replies; 9+ messages in thread
From: Pedro Alves @ 2018-08-14 11:37 UTC (permalink / raw)
  To: Andrew Burgess; +Cc: Simon Marchi, gdb-patches

On 08/13/2018 10:45 PM, Andrew Burgess wrote:
> Here's an updated version of the patch based on previous feedback.
> The new version has additional comments explaining why the regexp
> allows an empty core list.  I also only allow the empty core list
> when we scan all processes, its only in this case we might hit an
> exiting process.  When we scan two known PIDs, I don't allow an empty
> core list.

Thanks.

> +# The CORES_RE regexp allows a process to be running on zero or more
> +# cores.  This can happen if a process exits while GDB is reading
> +# information out of /proc.
> +set cores_re "cores=\\\[(\"$decimal\"(,\"$decimal\")*)?\\\]"

LGTM, except "this" is slightly ambiguous here.  Maybe:

 s/This can happen/The former can happen/

?

OK with that resolved.

Thanks,
Pedro Alves

>  
>  # List all available processes.
>  set process_entry_re "{${id_re},${type_re}(,$description_re)?(,$user_re)?(,$cores_re)?}"
> @@ -64,6 +68,12 @@ set spawn_id_2 [remote_spawn target $binfile]
>  set pid_2 [spawn_id_get_pid $spawn_id_2]
>  set id_re_2 "id=\"$pid_2\""
>  
> +# Unlike the earlier CORES_RE this list must contain at least one
> +# core.  Given that we know these processes will not exit while GDB is
> +# reading their information from /proc we can expect at least one core
> +# for each process.
> +set cores_re "cores=\\\[\"$decimal\"(,\"$decimal\")*\\\]"
> +
>  set process_entry_re_1 "{${id_re_1},${type_re}(,$description_re)?(,$user_re)?(,$cores_re)?}"
>  set process_entry_re_2 "{${id_re_2},${type_re}(,$description_re)?(,$user_re)?(,$cores_re)?}"
>  
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-08-14 11:37 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-10  9:58 [PATCH] gdb: Fix instability in thread groups test Andrew Burgess
2018-08-10 21:26 ` Simon Marchi
2018-08-13  9:51   ` Pedro Alves
2018-08-13 11:41     ` Andrew Burgess
2018-08-13 12:03       ` Pedro Alves
2018-08-13 13:01         ` Andrew Burgess
2018-08-13 13:38           ` Pedro Alves
2018-08-13 21:45             ` [PATCHv2] " Andrew Burgess
2018-08-14 11:37               ` Pedro Alves

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).