* [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages @ 2015-03-25 0:06 Sergio Durigan Junior 2015-03-27 10:06 ` Pedro Alves 0 siblings, 1 reply; 5+ messages in thread From: Sergio Durigan Junior @ 2015-03-25 0:06 UTC (permalink / raw) To: GDB Patches; +Cc: Pedro Alves Hi, While hacking the coredump_filter patch, I noticed that, when you load a corefile on GDB and receive a "Cannot access memory at address..." message, gdb_core_cmd will fail and return -1, which means that some fatal error happened. Unfortunately, this kind of message does not mean that the user cannot continue debugging with the corefile; it meant that some memory region (sometimes not important) was inaccessible. Given that gcore_create_callback, nowadays, will dump memory regions if they don't have the 'read' permission set (but have any other permission set), this kind of error can be expected sometimes. I would like to propose this patch to be applied, which will allow the test to continue even if this message is triggered. I was not sure wether I should use a "pass" or "xfail" there, so I chose the second (which makes more sense to me). WDYT? -- Sergio GPG key ID: 0x65FC5E36 Please send encrypted e-mail if possible http://sergiodj.net/ gdb/testsuite/ChangeLog: 2015-03-24 Sergio Durigan Junior <sergiodj@redhat.com> * lib/gdb.exp (gdb_core_cmd): Handle "Cannot access memory at address..." message. diff --git a/gdb/testsuite/lib/gdb.exp b/gdb/testsuite/lib/gdb.exp index f274b64..48b50b5 100644 --- a/gdb/testsuite/lib/gdb.exp +++ b/gdb/testsuite/lib/gdb.exp @@ -3573,7 +3573,7 @@ proc gdb_gcore_cmd {core test} { # -1 - core file failed to load proc gdb_core_cmd { core test } { - global gdb_prompt + global gdb_prompt hex gdb_test_multiple "core $core" "$test" { -re "\\\[Thread debugging using \[^ \r\n\]* enabled\\\]\r\n" { @@ -3591,6 +3591,10 @@ proc gdb_core_cmd { core test } { fail "$test (incomplete note section)" return 0 } + -re "Cannot access memory at address $hex\r\n$gdb_prompt $" { + pass "$test (could not access some memory regions)" + return 0 + } -re "Core was generated by .*\r\n$gdb_prompt $" { pass "$test" return 1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages 2015-03-25 0:06 [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages Sergio Durigan Junior @ 2015-03-27 10:06 ` Pedro Alves 2015-03-27 22:34 ` [PATCH] Catch exception on solib_svr4_r_ldsomap (was: Re: [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages) Sergio Durigan Junior 0 siblings, 1 reply; 5+ messages in thread From: Pedro Alves @ 2015-03-27 10:06 UTC (permalink / raw) To: Sergio Durigan Junior, GDB Patches On 03/25/2015 12:06 AM, Sergio Durigan Junior wrote:> Hi, > > While hacking the coredump_filter patch, I noticed that, when you load a > corefile on GDB and receive a "Cannot access memory at address..." > message, gdb_core_cmd will fail and return -1, which means that some > fatal error happened. > > Unfortunately, this kind of message does not mean that the user cannot > continue debugging with the corefile; it meant that some memory region > (sometimes not important) was inaccessible. Given that > gcore_create_callback, nowadays, will dump memory regions if they don't > have the 'read' permission set (but have any other permission set), this > kind of error can be expected sometimes. So, gdb itself errors and stops processing the core? > > I would like to propose this patch to be applied, which will allow the > test to continue even if this message is triggered. I was not sure > wether I should use a "pass" or "xfail" there, so I chose the second > (which makes more sense to me). > Seems like you chose the first, not second. > WDYT? I think I don't understand. :-) Can you please show an example session? Did GDB continue processing the core when it printed that error, or was it just a warning and it continued? From your message and patch, I assume the former, otherwise "Core was generated by .*\r\n$gdb_prompt $" would still match, right? If it didn't match, why not? Your arguing that error "does not mean that the user cannot continue debugging with the corefile". So why paper over this in the testsuite? Saying "pass" in the testsuite when it was "error" for gdb seems like a conflict. It seems like what we really should be discussing is whether that should be a fatal error for the "core" command. But as said, we'll need to see a gdb log and understand better what gdb is doing to discuss this further. Thanks, Pedro Alves ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH] Catch exception on solib_svr4_r_ldsomap (was: Re: [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages) 2015-03-27 10:06 ` Pedro Alves @ 2015-03-27 22:34 ` Sergio Durigan Junior 2015-03-31 11:30 ` [PATCH] Catch exception on solib_svr4_r_ldsomap Pedro Alves 0 siblings, 1 reply; 5+ messages in thread From: Sergio Durigan Junior @ 2015-03-27 22:34 UTC (permalink / raw) To: Pedro Alves; +Cc: GDB Patches On Friday, March 27 2015, Pedro Alves wrote: > On 03/25/2015 12:06 AM, Sergio Durigan Junior wrote:> Hi, >> >> While hacking the coredump_filter patch, I noticed that, when you load a >> corefile on GDB and receive a "Cannot access memory at address..." >> message, gdb_core_cmd will fail and return -1, which means that some >> fatal error happened. >> >> Unfortunately, this kind of message does not mean that the user cannot >> continue debugging with the corefile; it meant that some memory region >> (sometimes not important) was inaccessible. Given that >> gcore_create_callback, nowadays, will dump memory regions if they don't >> have the 'read' permission set (but have any other permission set), this >> kind of error can be expected sometimes. > > So, gdb itself errors and stops processing the core? No, GDB does not "error and stop", but some testcases do that. For example, the proposed testcase for the coredump_filter feature has: ... set core_loaded [gdb_core_cmd "$core" "load core"] if { $core_loaded == -1 } { fail "loading $core" return } ... On some corefiles generated by the test, GDB emits the "Cannot access..." warning, which makes gdb_core_cmd return -1, and then this specific test aborts. >> I would like to propose this patch to be applied, which will allow the >> test to continue even if this message is triggered. I was not sure >> wether I should use a "pass" or "xfail" there, so I chose the second >> (which makes more sense to me). >> > > Seems like you chose the first, not second. You are right, I made a mistake when choosing the right patch to include in the message, sorry about that. Please, consider it as having a "xfail" instead of a "pass". >> WDYT? > > I think I don't understand. :-) Can you please show an > example session? Did GDB continue processing the core when > it printed that error, or was it just a warning and it continued? Sure, sorry for not sending the example session before! Here is the pertinent part: (gdb) core /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/gdb.base/non-private-anon.gcore [New LWP 28468] Cannot access memory at address 0x355fc21148 Cannot access memory at address 0x355fc21140 (gdb) FAIL: gdb.base/coredump-filter.exp: loading and testing corefile for non-Private-Anonymous: load core FAIL: gdb.base/coredump-filter.exp: loading and testing corefile for non-Private-Anonymous: loading /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/gdb.base/non-private-anon.gcore spawn /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/../data-directory ... GDB correctly loaded the corefile (despite the warnings), and the debugging sessions could continue, but, because gdb_core_cmd returned -1, the test was interrupted (and another one began). > From your message and patch, I assume the former, otherwise > > "Core was generated by .*\r\n$gdb_prompt $" > > would still match, right? If it didn't match, why not? It did not match because GDB did not generate this message. I also found this to be very strange, but I did not investigate further. But see below for more details. > Your arguing that error "does not mean that the user cannot > continue debugging with the corefile". So why paper over > this in the testsuite? Saying "pass" in the testsuite when > it was "error" for gdb seems like a conflict. I totally agree. As I explained above, it was a thinko: the patch should really contain a "xfail" instead of a "pass". > It seems like what > we really should be discussing is whether that should be a > fatal error for the "core" command. And that's why I marked the patch as RFC :-). I knew that maybe someone would like to raise this question. > But as said, we'll need to see a gdb log and understand better > what gdb is doing to discuss this further. Right, so I took some time and found the right fix, I think. As we agreed above, the fact that GDB is not printing the "Core was generated by..." message is really strange, so I decided to investigate why it is doing that. The answer is that we are forgetting to check for an exception on solib_svr4_r_ldsomap. When loading the corefile, GDB calls this function, which then calls read_memory_unsigned_integer, which throws an error. This error is not being caught by the function, so it propagates until the main loop catches it. The fix is obvious: we should catch this regression and continue in the function. With it, GDB now correctly prints the "Core was generated by..." message, and the patch to adjust gdb_core_cmd is no longer needed. Regression-tested on Fedora 20 for x86_64, i686 and native-gdbserver. Does that make more sense now? Thanks, -- Sergio GPG key ID: 0x65FC5E36 Please send encrypted e-mail if possible http://sergiodj.net/ gdb/ChangeLog: 2015-03-27 Sergio Durigan Junior <sergiodj@redhat.com> * solib-svr4.c (solib_svr4_r_ldsomap): Catch possible exception by read_memory_unsigned_integer. diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c index 0cecc2a..dd93847 100644 --- a/gdb/solib-svr4.c +++ b/gdb/solib-svr4.c @@ -910,13 +910,22 @@ solib_svr4_r_ldsomap (struct svr4_info *info) struct link_map_offsets *lmo = svr4_fetch_link_map_offsets (); struct type *ptr_type = builtin_type (target_gdbarch ())->builtin_data_ptr; enum bfd_endian byte_order = gdbarch_byte_order (target_gdbarch ()); - ULONGEST version; + ULONGEST version = 0; + + TRY + { + /* Check version, and return zero if `struct r_debug' doesn't have + the r_ldsomap member. */ + version + = read_memory_unsigned_integer (info->debug_base + lmo->r_version_offset, + lmo->r_version_size, byte_order); + } + CATCH (ex, RETURN_MASK_ERROR) + { + exception_print (gdb_stderr, ex); + } + END_CATCH - /* Check version, and return zero if `struct r_debug' doesn't have - the r_ldsomap member. */ - version - = read_memory_unsigned_integer (info->debug_base + lmo->r_version_offset, - lmo->r_version_size, byte_order); if (version < 2 || lmo->r_ldsomap_offset == -1) return 0; ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] Catch exception on solib_svr4_r_ldsomap 2015-03-27 22:34 ` [PATCH] Catch exception on solib_svr4_r_ldsomap (was: Re: [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages) Sergio Durigan Junior @ 2015-03-31 11:30 ` Pedro Alves 2015-03-31 23:39 ` Sergio Durigan Junior 0 siblings, 1 reply; 5+ messages in thread From: Pedro Alves @ 2015-03-31 11:30 UTC (permalink / raw) To: Sergio Durigan Junior; +Cc: GDB Patches On 03/27/2015 10:34 PM, Sergio Durigan Junior wrote: > On Friday, March 27 2015, Pedro Alves wrote: > >> On 03/25/2015 12:06 AM, Sergio Durigan Junior wrote:> Hi, >>> >>> While hacking the coredump_filter patch, I noticed that, when you load a >>> corefile on GDB and receive a "Cannot access memory at address..." >>> message, gdb_core_cmd will fail and return -1, which means that some >>> fatal error happened. >>> >>> Unfortunately, this kind of message does not mean that the user cannot >>> continue debugging with the corefile; it meant that some memory region >>> (sometimes not important) was inaccessible. Given that >>> gcore_create_callback, nowadays, will dump memory regions if they don't >>> have the 'read' permission set (but have any other permission set), this >>> kind of error can be expected sometimes. >> >> So, gdb itself errors and stops processing the core? > > No, GDB does not "error and stop", but some testcases do that. Well, it clearly does do that. Hence your new patch. :-) >> I think I don't understand. :-) Can you please show an >> example session? Did GDB continue processing the core when >> it printed that error, or was it just a warning and it continued? > I meant "Did GDB stop processing the core", of course. > Sure, sorry for not sending the example session before! Here is the > pertinent part: > > (gdb) core /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/gdb.base/non-private-anon.gcore > [New LWP 28468] > Cannot access memory at address 0x355fc21148 > Cannot access memory at address 0x355fc21140 > (gdb) FAIL: gdb.base/coredump-filter.exp: loading and testing corefile for non-Private-Anonymous: load core > FAIL: gdb.base/coredump-filter.exp: loading and testing corefile for non-Private-Anonymous: loading /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/gdb.base/non-private-anon.gcore > spawn /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/../data-directory > ... > > GDB correctly loaded the corefile (despite the warnings), and the The error made the rest of core_open be skipped: if the core is still loaded, gdb is potentially in an inconsistent state at this point. I'd think we should completely discard the core/target if something errors out. And then if we can be tolerant to specific parts of loading a core failing, we should handle those before the error escapes out of core_open. We do something like that already (note core_close_cleanup and the TRY/CATCH'S), but it's clearly not complete. After: push_target (&core_ops); discard_cleanups (old_chain); ... several things can throw and let an exception escape. > Right, so I took some time and found the right fix, I think. As we > agreed above, the fact that GDB is not printing the "Core was generated > by..." message is really strange, so I decided to investigate why it is > doing that. > > The answer is that we are forgetting to check for an exception on > solib_svr4_r_ldsomap. When loading the corefile, GDB calls this > function, which then calls read_memory_unsigned_integer, which throws an > error. This error is not being caught by the function, so it propagates > until the main loop catches it. The fix is obvious: we should catch > this regression and continue in the function. With it, GDB now > correctly prints the "Core was generated by..." message, and the patch > to adjust gdb_core_cmd is no longer needed. > > Regression-tested on Fedora 20 for x86_64, i686 and native-gdbserver. > > Does that make more sense now? > Yes, this is OK. Thanks, Pedro Alves ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] Catch exception on solib_svr4_r_ldsomap 2015-03-31 11:30 ` [PATCH] Catch exception on solib_svr4_r_ldsomap Pedro Alves @ 2015-03-31 23:39 ` Sergio Durigan Junior 0 siblings, 0 replies; 5+ messages in thread From: Sergio Durigan Junior @ 2015-03-31 23:39 UTC (permalink / raw) To: Pedro Alves; +Cc: GDB Patches On Tuesday, March 31 2015, Pedro Alves wrote: >>> So, gdb itself errors and stops processing the core? >> >> No, GDB does not "error and stop", but some testcases do that. > > Well, it clearly does do that. Hence your new patch. :-) > >>> I think I don't understand. :-) Can you please show an >>> example session? Did GDB continue processing the core when >>> it printed that error, or was it just a warning and it continued? >> > > I meant "Did GDB stop processing the core", of course. Sorry I misunderstood your question. >> Sure, sorry for not sending the example session before! Here is the >> pertinent part: >> >> (gdb) core /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/gdb.base/non-private-anon.gcore >> [New LWP 28468] >> Cannot access memory at address 0x355fc21148 >> Cannot access memory at address 0x355fc21140 >> (gdb) FAIL: gdb.base/coredump-filter.exp: loading and testing corefile for non-Private-Anonymous: load core >> FAIL: gdb.base/coredump-filter.exp: loading and testing corefile >> for non-Private-Anonymous: loading >> /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/gdb.base/non-private-anon.gcore >> spawn >> /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/../../gdb/gdb >> -nw -nx -data-directory >> /home/sergio/work/src/git/binutils-gdb/rhbz1085906-coredump-filter/build-64-3/gdb/testsuite/../data-directory >> ... >> >> GDB correctly loaded the corefile (despite the warnings), and the > > The error made the rest of core_open be skipped: if the core is still > loaded, gdb is potentially in an inconsistent state at this point. > I'd think we should completely discard the core/target if something > errors out. And then if we can be tolerant to specific parts of > loading a core failing, we should handle those before the error escapes > out of core_open. We do something like that > already (note core_close_cleanup and the TRY/CATCH'S), but it's clearly > not complete. After: > > push_target (&core_ops); > discard_cleanups (old_chain); > > ... several things can throw and let an exception escape. Yeah, certainly we should be catching those before they escape. I will see if I can take a look at more places to improve/fix later. >> Right, so I took some time and found the right fix, I think. As we >> agreed above, the fact that GDB is not printing the "Core was generated >> by..." message is really strange, so I decided to investigate why it is >> doing that. >> >> The answer is that we are forgetting to check for an exception on >> solib_svr4_r_ldsomap. When loading the corefile, GDB calls this >> function, which then calls read_memory_unsigned_integer, which throws an >> error. This error is not being caught by the function, so it propagates >> until the main loop catches it. The fix is obvious: we should catch >> this regression and continue in the function. With it, GDB now >> correctly prints the "Core was generated by..." message, and the patch >> to adjust gdb_core_cmd is no longer needed. >> >> Regression-tested on Fedora 20 for x86_64, i686 and native-gdbserver. >> >> Does that make more sense now? >> > > Yes, this is OK. Thanks, pushed: <https://sourceware.org/ml/gdb-cvs/2015-03/msg00274.html> 416f679e68468ea6dd7384213994ce74201f82e7 -- Sergio GPG key ID: 0x65FC5E36 Please send encrypted e-mail if possible http://sergiodj.net/ ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2015-03-31 23:39 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-03-25 0:06 [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages Sergio Durigan Junior 2015-03-27 10:06 ` Pedro Alves 2015-03-27 22:34 ` [PATCH] Catch exception on solib_svr4_r_ldsomap (was: Re: [RFC/PATCH] Extend gdb_core_cmd to allow "Cannot access memory..." messages) Sergio Durigan Junior 2015-03-31 11:30 ` [PATCH] Catch exception on solib_svr4_r_ldsomap Pedro Alves 2015-03-31 23:39 ` Sergio Durigan Junior
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).