[RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages

public inbox for gdb-patches@sourceware.org
 help / color / mirror / Atom feed

* [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages
@ 2015-03-25  0:06 Sergio Durigan Junior
  2015-03-27 10:06 ` Pedro Alves
  0 siblings, 1 reply; 5+ messages in thread
From: Sergio Durigan Junior @ 2015-03-25  0:06 UTC (permalink / raw)
  To: GDB Patches; +Cc: Pedro Alves

Hi,

While hacking the coredump_filter patch, I noticed that, when you load a
corefile on GDB and receive a "Cannot access memory at address..."
message, gdb_core_cmd will fail and return -1, which means that some
fatal error happened.

Unfortunately, this kind of message does not mean that the user cannot
continue debugging with the corefile; it meant that some memory region
(sometimes not important) was inaccessible.  Given that
gcore_create_callback, nowadays, will dump memory regions if they don't
have the 'read' permission set (but have any other permission set), this
kind of error can be expected sometimes.

I would like to propose this patch to be applied, which will allow the
test to continue even if this message is triggered.  I was not sure
wether I should use a "pass" or "xfail" there, so I chose the second
(which makes more sense to me).

WDYT?

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

gdb/testsuite/ChangeLog:
2015-03-24  Sergio Durigan Junior  <sergiodj@redhat.com>

	* lib/gdb.exp (gdb_core_cmd): Handle "Cannot access memory at
	address..." message.

diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp
index f274b64..48b50b5 100644
--- a/gdb/testsuite/lib/gdb.exp
+++ b/gdb/testsuite/lib/gdb.exp
@@ -3573,7 +3573,7 @@ proc gdb_gcore_cmd {core test} {
 # -1 - core file failed to load

 proc gdb_core_cmd { core test } {
-    global gdb_prompt
+    global gdb_prompt hex

     gdb_test_multiple "core $core" "$test" {
 	-re "\\\[Thread debugging using \[^ \r\n\]* enabled\\\]\r\n" {
@@ -3591,6 +3591,10 @@ proc gdb_core_cmd { core test } {
 	    fail "$test (incomplete note section)"
 	    return 0
 	}
+	-re "Cannot access memory at address $hex\r\n$gdb_prompt $" {
+	    pass "$test (could not access some memory regions)"
+	    return 0
+	}
 	-re "Core was generated by .*\r\n$gdb_prompt $" {
 	    pass "$test"
 	    return 1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages
  2015-03-25  0:06 [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages Sergio Durigan Junior
@ 2015-03-27 10:06 ` Pedro Alves
  2015-03-27 22:34   ` [PATCH] Catch exception on solib_svr4_r_ldsomap (was: Re: [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages) Sergio Durigan Junior
  0 siblings, 1 reply; 5+ messages in thread
From: Pedro Alves @ 2015-03-27 10:06 UTC (permalink / raw)
  To: Sergio Durigan Junior, GDB Patches

On 03/25/2015 12:06 AM, Sergio Durigan Junior wrote:> Hi,
>
> While hacking the coredump_filter patch, I noticed that, when you load a
> corefile on GDB and receive a "Cannot access memory at address..."
> message, gdb_core_cmd will fail and return -1, which means that some
> fatal error happened.
>
> Unfortunately, this kind of message does not mean that the user cannot
> continue debugging with the corefile; it meant that some memory region
> (sometimes not important) was inaccessible.  Given that
> gcore_create_callback, nowadays, will dump memory regions if they don't
> have the 'read' permission set (but have any other permission set), this
> kind of error can be expected sometimes.

So, gdb itself errors and stops processing the core?

>
> I would like to propose this patch to be applied, which will allow the
> test to continue even if this message is triggered.  I was not sure
> wether I should use a "pass" or "xfail" there, so I chose the second
> (which makes more sense to me).
>

Seems like you chose the first, not second.

> WDYT?

I think I don't understand.  :-)  Can you please show an
example session?  Did GDB continue processing the core when
it printed that error, or was it just a warning and it continued?

From your message and patch, I assume the former, otherwise

 "Core was generated by .*\r\n$gdb_prompt $"

would still match, right?  If it didn't match, why not?

Your arguing that error "does not mean that the user cannot
continue debugging with the corefile".  So why paper over
this in the testsuite?  Saying "pass" in the testsuite when
it was "error" for gdb seems like a conflict.  It seems like what
we really should be discussing is whether that should be a
fatal error for the "core" command.

But as said, we'll need to see a gdb log and understand better
what gdb is doing to discuss this further.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] Catch exception on solib_svr4_r_ldsomap (was: Re: [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages)
  2015-03-27 10:06 ` Pedro Alves
@ 2015-03-27 22:34   ` Sergio Durigan Junior
  2015-03-31 11:30     ` [PATCH] Catch exception on solib_svr4_r_ldsomap Pedro Alves
  0 siblings, 1 reply; 5+ messages in thread
From: Sergio Durigan Junior @ 2015-03-27 22:34 UTC (permalink / raw)
  To: Pedro Alves; +Cc: GDB Patches

On Friday, March 27 2015, Pedro Alves wrote:

> On 03/25/2015 12:06 AM, Sergio Durigan Junior wrote:> Hi,
>>
>> While hacking the coredump_filter patch, I noticed that, when you load a
>> corefile on GDB and receive a "Cannot access memory at address..."
>> message, gdb_core_cmd will fail and return -1, which means that some
>> fatal error happened.
>>
>> Unfortunately, this kind of message does not mean that the user cannot
>> continue debugging with the corefile; it meant that some memory region
>> (sometimes not important) was inaccessible.  Given that
>> gcore_create_callback, nowadays, will dump memory regions if they don't
>> have the 'read' permission set (but have any other permission set), this
>> kind of error can be expected sometimes.
>
> So, gdb itself errors and stops processing the core?

No, GDB does not "error and stop", but some testcases do that.  For
example, the proposed testcase for the coredump_filter feature has:

        ...
	set core_loaded [gdb_core_cmd "$core" "load core"]
	if { $core_loaded == -1 } {
	    fail "loading $core"
	    return
	}
        ...

On some corefiles generated by the test, GDB emits the "Cannot
access..." warning, which makes gdb_core_cmd return -1, and then this
specific test aborts.

>> I would like to propose this patch to be applied, which will allow the
>> test to continue even if this message is triggered.  I was not sure
>> wether I should use a "pass" or "xfail" there, so I chose the second
>> (which makes more sense to me).
>>
>
> Seems like you chose the first, not second.

You are right, I made a mistake when choosing the right patch to include
in the message, sorry about that.  Please, consider it as having a
"xfail" instead of a "pass".

>> WDYT?
>
> I think I don't understand.  :-)  Can you please show an
> example session?  Did GDB continue processing the core when
> it printed that error, or was it just a warning and it continued?

Sure, sorry for not sending the example session before!  Here is the
pertinent part:

  (gdb) core /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/gdb.base/non-private-anon.gcore
  [New LWP 28468]
  Cannot access memory at address 0x355fc21148
  Cannot access memory at address 0x355fc21140
  (gdb) FAIL: gdb.base/coredump-filter.exp: loading and testing corefile for non-Private-Anonymous: load core
  FAIL: gdb.base/coredump-filter.exp: loading and testing corefile for non-Private-Anonymous: loading /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/gdb.base/non-private-anon.gcore
  spawn /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/../data-directory
  ...

GDB correctly loaded the corefile (despite the warnings), and the
debugging sessions could continue, but, because gdb_core_cmd returned
-1, the test was interrupted (and another one began).

> From your message and patch, I assume the former, otherwise
>
>  "Core was generated by .*\r\n$gdb_prompt $"
>
> would still match, right?  If it didn't match, why not?

It did not match because GDB did not generate this message.  I also
found this to be very strange, but I did not investigate further.  But
see below for more details.

> Your arguing that error "does not mean that the user cannot
> continue debugging with the corefile".  So why paper over
> this in the testsuite?  Saying "pass" in the testsuite when
> it was "error" for gdb seems like a conflict.

I totally agree.  As I explained above, it was a thinko: the patch
should really contain a "xfail" instead of a "pass".

> It seems like what
> we really should be discussing is whether that should be a
> fatal error for the "core" command.

And that's why I marked the patch as RFC :-).  I knew that maybe someone
would like to raise this question.

> But as said, we'll need to see a gdb log and understand better
> what gdb is doing to discuss this further.

Right, so I took some time and found the right fix, I think.  As we
agreed above, the fact that GDB is not printing the "Core was generated
by..." message is really strange, so I decided to investigate why it is
doing that.

The answer is that we are forgetting to check for an exception on
solib_svr4_r_ldsomap.  When loading the corefile, GDB calls this
function, which then calls read_memory_unsigned_integer, which throws an
error.  This error is not being caught by the function, so it propagates
until the main loop catches it.  The fix is obvious: we should catch
this regression and continue in the function.  With it, GDB now
correctly prints the "Core was generated by..." message, and the patch
to adjust gdb_core_cmd is no longer needed.

Regression-tested on Fedora 20 for x86_64, i686 and native-gdbserver.

Does that make more sense now?

Thanks,

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

gdb/ChangeLog:
2015-03-27  Sergio Durigan Junior  <sergiodj@redhat.com>

	* solib-svr4.c (solib_svr4_r_ldsomap): Catch possible exception by
	read_memory_unsigned_integer.

diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index 0cecc2a..dd93847 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -910,13 +910,22 @@ solib_svr4_r_ldsomap (struct svr4_info *info)
   struct link_map_offsets *lmo = svr4_fetch_link_map_offsets ();
   struct type *ptr_type = builtin_type (target_gdbarch ())->builtin_data_ptr;
   enum bfd_endian byte_order = gdbarch_byte_order (target_gdbarch ());
-  ULONGEST version;
+  ULONGEST version = 0;
+
+  TRY
+    {
+      /* Check version, and return zero if `struct r_debug' doesn't have
+	 the r_ldsomap member.  */
+      version
+	= read_memory_unsigned_integer (info->debug_base + lmo->r_version_offset,
+					lmo->r_version_size, byte_order);
+    }
+  CATCH (ex, RETURN_MASK_ERROR)
+    {
+      exception_print (gdb_stderr, ex);
+    }
+  END_CATCH

-  /* Check version, and return zero if `struct r_debug' doesn't have
-     the r_ldsomap member.  */
-  version
-    = read_memory_unsigned_integer (info->debug_base + lmo->r_version_offset,
-				    lmo->r_version_size, byte_order);
   if (version < 2 || lmo->r_ldsomap_offset == -1)
     return 0;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Catch exception on solib_svr4_r_ldsomap
  2015-03-27 22:34   ` [PATCH] Catch exception on solib_svr4_r_ldsomap (was: Re: [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages) Sergio Durigan Junior
@ 2015-03-31 11:30     ` Pedro Alves
  2015-03-31 23:39       ` Sergio Durigan Junior
  0 siblings, 1 reply; 5+ messages in thread
From: Pedro Alves @ 2015-03-31 11:30 UTC (permalink / raw)
  To: Sergio Durigan Junior; +Cc: GDB Patches

On 03/27/2015 10:34 PM, Sergio Durigan Junior wrote:
> On Friday, March 27 2015, Pedro Alves wrote:
> 
>> On 03/25/2015 12:06 AM, Sergio Durigan Junior wrote:> Hi,
>>>
>>> While hacking the coredump_filter patch, I noticed that, when you load a
>>> corefile on GDB and receive a "Cannot access memory at address..."
>>> message, gdb_core_cmd will fail and return -1, which means that some
>>> fatal error happened.
>>>
>>> Unfortunately, this kind of message does not mean that the user cannot
>>> continue debugging with the corefile; it meant that some memory region
>>> (sometimes not important) was inaccessible.  Given that
>>> gcore_create_callback, nowadays, will dump memory regions if they don't
>>> have the 'read' permission set (but have any other permission set), this
>>> kind of error can be expected sometimes.
>>
>> So, gdb itself errors and stops processing the core?
> 
> No, GDB does not "error and stop", but some testcases do that.  

Well, it clearly does do that.   Hence your new patch.  :-)

>> I think I don't understand.  :-)  Can you please show an
>> example session?  Did GDB continue processing the core when
>> it printed that error, or was it just a warning and it continued?
> 

I meant "Did GDB stop processing the core", of course.

> Sure, sorry for not sending the example session before!  Here is the
> pertinent part:
> 
>   (gdb) core /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/gdb.base/non-private-anon.gcore
>   [New LWP 28468]
>   Cannot access memory at address 0x355fc21148
>   Cannot access memory at address 0x355fc21140
>   (gdb) FAIL: gdb.base/coredump-filter.exp: loading and testing corefile for non-Private-Anonymous: load core
>   FAIL: gdb.base/coredump-filter.exp: loading and testing corefile for non-Private-Anonymous: loading /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/gdb.base/non-private-anon.gcore
>   spawn /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/../data-directory
>   ...
> 
> GDB correctly loaded the corefile (despite the warnings), and the

The error made the rest of core_open be skipped: if the core is still
loaded, gdb is potentially in an inconsistent state at this point.
I'd think we should completely discard the core/target if something
errors out.  And then if we can be tolerant to specific parts of
loading a core failing, we should handle those before the error escapes
out of core_open.  We do something like that
already (note core_close_cleanup and the TRY/CATCH'S), but it's clearly
not complete.  After:

  push_target (&core_ops);
  discard_cleanups (old_chain);

... several things can throw and let an exception escape.

> Right, so I took some time and found the right fix, I think.  As we
> agreed above, the fact that GDB is not printing the "Core was generated
> by..." message is really strange, so I decided to investigate why it is
> doing that.
> 
> The answer is that we are forgetting to check for an exception on
> solib_svr4_r_ldsomap.  When loading the corefile, GDB calls this
> function, which then calls read_memory_unsigned_integer, which throws an
> error.  This error is not being caught by the function, so it propagates
> until the main loop catches it.  The fix is obvious: we should catch
> this regression and continue in the function.  With it, GDB now
> correctly prints the "Core was generated by..." message, and the patch
> to adjust gdb_core_cmd is no longer needed.
> 
> Regression-tested on Fedora 20 for x86_64, i686 and native-gdbserver.
> 
> Does that make more sense now?
> 

Yes, this is OK.

Thanks,
Pedro Alves

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Catch exception on solib_svr4_r_ldsomap
  2015-03-31 11:30     ` [PATCH] Catch exception on solib_svr4_r_ldsomap Pedro Alves
@ 2015-03-31 23:39       ` Sergio Durigan Junior
  0 siblings, 0 replies; 5+ messages in thread
From: Sergio Durigan Junior @ 2015-03-31 23:39 UTC (permalink / raw)
  To: Pedro Alves; +Cc: GDB Patches

On Tuesday, March 31 2015, Pedro Alves wrote:

>>> So, gdb itself errors and stops processing the core?
>> 
>> No, GDB does not "error and stop", but some testcases do that.  
>
> Well, it clearly does do that.   Hence your new patch.  :-)
>
>>> I think I don't understand.  :-)  Can you please show an
>>> example session?  Did GDB continue processing the core when
>>> it printed that error, or was it just a warning and it continued?
>> 
>
> I meant "Did GDB stop processing the core", of course.

Sorry I misunderstood your question.

>> Sure, sorry for not sending the example session before!  Here is the
>> pertinent part:
>> 
>>   (gdb) core /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/gdb.base/non-private-anon.gcore
>>   [New LWP 28468]
>>   Cannot access memory at address 0x355fc21148
>>   Cannot access memory at address 0x355fc21140
>>   (gdb) FAIL: gdb.base/coredump-filter.exp: loading and testing corefile for non-Private-Anonymous: load core
>>   FAIL: gdb.base/coredump-filter.exp: loading and testing corefile
>> for non-Private-Anonymous: loading
>> /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/gdb.base/non-private-anon.gcore
>>   spawn
>> /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/../../gdb/gdb
>> -nw -nx -data-directory
>> /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/../data-directory
>>   ...
>> 
>> GDB correctly loaded the corefile (despite the warnings), and the
>
> The error made the rest of core_open be skipped: if the core is still
> loaded, gdb is potentially in an inconsistent state at this point.
> I'd think we should completely discard the core/target if something
> errors out.  And then if we can be tolerant to specific parts of
> loading a core failing, we should handle those before the error escapes
> out of core_open.  We do something like that
> already (note core_close_cleanup and the TRY/CATCH'S), but it's clearly
> not complete.  After:
>
>   push_target (&core_ops);
>   discard_cleanups (old_chain);
>
> ... several things can throw and let an exception escape.

Yeah, certainly we should be catching those before they escape.  I will
see if I can take a look at more places to improve/fix later.

>> Right, so I took some time and found the right fix, I think.  As we
>> agreed above, the fact that GDB is not printing the "Core was generated
>> by..." message is really strange, so I decided to investigate why it is
>> doing that.
>> 
>> The answer is that we are forgetting to check for an exception on
>> solib_svr4_r_ldsomap.  When loading the corefile, GDB calls this
>> function, which then calls read_memory_unsigned_integer, which throws an
>> error.  This error is not being caught by the function, so it propagates
>> until the main loop catches it.  The fix is obvious: we should catch
>> this regression and continue in the function.  With it, GDB now
>> correctly prints the "Core was generated by..." message, and the patch
>> to adjust gdb_core_cmd is no longer needed.
>> 
>> Regression-tested on Fedora 20 for x86_64, i686 and native-gdbserver.
>> 
>> Does that make more sense now?
>> 
>
> Yes, this is OK.

Thanks, pushed:

  <https://sourceware.org/ml/gdb-cvs/2015-03/msg00274.html>
  416f679e68468ea6dd7384213994ce74201f82e7

-- 
Sergio
GPG key ID: 0x65FC5E36
Please send encrypted e-mail if possible
http://sergiodj.net/

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-03-31 23:39 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-25  0:06 [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages Sergio Durigan Junior
2015-03-27 10:06 ` Pedro Alves
2015-03-27 22:34   ` [PATCH] Catch exception on solib_svr4_r_ldsomap (was: Re: [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages) Sergio Durigan Junior
2015-03-31 11:30     ` [PATCH] Catch exception on solib_svr4_r_ldsomap Pedro Alves
2015-03-31 23:39       ` Sergio Durigan Junior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).