From: Pedro Alves <palves@redhat.com>
To: Sergio Durigan Junior <sergiodj@redhat.com>
Cc: gdb-patches@sourceware.org
Subject: Re: Possible regression on gdb.multi/multi-arch-exec.exp
Date: Thu, 28 Jun 2018 12:09:00 -0000
Message-ID: <91c04ab2-ccbe-37ce-4a63-3350442dd406@redhat.com>
In-Reply-To: <87in649jtd.fsf@redhat.com>

On 06/27/2018 07:16 PM, Sergio Durigan Junior wrote:
> On Thursday, June 07 2018, Pedro Alves wrote:
> 
>> This is more preparation bits for multi-target support.
> 
> Hi Pedro,
> 
> While preparing a new Fedora GDB rawhide release, I noticed a regression
> related to this commit.  The curious thing is that I am only able to
> reproduce the regression on a Fedora Rawhide system; it doesn't happen
> on my Fedora 27 machine (initially I thought it might be related to GCC,
> but testing against GCC HEAD on my Fedora 27 machine also did not
> trigger the regression).
> 
> The test failing is gdb.multi/multi-arch-exec.exp, and here's what I'm seeing:
> 
>   (gdb) break all_started
>   Breakpoint 1 at 0x400848: file /home/sergio/build/gdb/testsuite/../../../binutils-gdb/gdb/testsuite/gdb.multi/multi-arch-exec.c, line 42.
>   (gdb) run 
>   Starting program: /home/sergio/build/gdb/testsuite/outputs/gdb.multi/multi-arch-exec/1-multi-arch-exec 
>   [Thread debugging using libthread_db enabled]
>   Using host libthread_db library "/lib64/libthread_db.so.1".
>   [New Thread 0x7ffff7476700 (LWP 1354)]
> 
>   Thread 1 "1-multi-arch-ex" hit Breakpoint 1, all_started () at /home/sergio/build/gdb/testsuite/../../../binutils-gdb/gdb/testsuite/gdb.multi/multi-arch-exec.c:42
>   42      }
>   (gdb) delete breakpoints
>   Delete all breakpoints? (y or n) y
>   (gdb) info breakpoints
>   No breakpoints or watchpoints.
>   (gdb) break main
>   Breakpoint 2 at 0x400862: file /home/sergio/build/gdb/testsuite/../../../binutils-gdb/gdb/testsuite/gdb.multi/multi-arch-exec.c, line 51.
>   (gdb) thread 1
>   [Switching to thread 1 (Thread 0x7ffff7fdf740 (LWP 1350))]
>   #0  all_started () at /home/sergio/build/gdb/testsuite/../../../binutils-gdb/gdb/testsuite/gdb.multi/multi-arch-exec.c:42
>   42      }
>   (gdb) PASS: gdb.multi/multi-arch-exec.exp: first_arch=1: selected_thread=1: follow_exec_mode=new: thread 1
>   set follow-exec-mode new
>   (gdb) PASS: gdb.multi/multi-arch-exec.exp: first_arch=1: selected_thread=1: follow_exec_mode=new: set follow-exec-mode new
>   continue
>   Continuing.
>   [Thread 0x7ffff7476700 (LWP 1354) exited]
>   process 1350 is executing new program: /home/sergio/build/gdb/testsuite/outputs/gdb.multi/multi-arch-exec/1-multi-arch-exec-hello
>   [New inferior 2 (process 0)]
>   [New process 1350]
>   ../../binutils-gdb/gdb/target.c:3200: internal-error: gdbarch* default_thread_architecture(target_ops*, ptid_t): Assertion `inf != NULL' failed.
>   A problem internal to GDB has been detected,
>   further debugging may prove unreliable.
>   Quit this debugging session? (y or n) FAIL: gdb.multi/multi-arch-exec.exp: first_arch=1: selected_thread=1: follow_exec_mode=new: continue across exec that changes architecture (GDB internal error)
> 
> 
> I spent some time investigating this, and here's what I've learned so
> far:
> 
> 1) When infrun.c:handle_inferior_event_1 is called and deals with
> TARGET_WAITKIND_EXECD (around line 5275), it does:
> 
>     ...
>     case TARGET_WAITKIND_EXECD:
>       if (debug_infrun)
>         fprintf_unfiltered (gdb_stdlog, "infrun: TARGET_WAITKIND_EXECD\n");
> 
>       /* Note we can't read registers yet (the stop_pc), because we
> 	 don't yet know the inferior's post-exec architecture.
> 	 'stop_pc' is explicitly read below instead.  */
>       switch_to_thread_no_regs (ecs->event_thread);
> 
>       /* Do whatever is necessary to the parent branch of the vfork.  */
>       handle_vfork_child_exec_or_exit (1);
> 
>       /* This causes the eventpoints and symbol table to be reset.
>          Must do this now, before trying to determine whether to
>          stop.  */
>       follow_exec (inferior_ptid, ecs->ws.value.execd_pathname);   // <---- #1
> 
>       stop_pc = regcache_read_pc (get_thread_regcache (ecs->event_thread)); // <---- #2
>       ...
> 
> 2) When follow_exec is called (#1 above), it does:
> 
>   ...
>   /* The target reports the exec event to the main thread, even if
>      some other thread does the exec, and even if the main thread was
>      stopped or already gone.  We may still have non-leader threads of
>      the process on our list.  E.g., on targets that don't have thread
>      exit events (like remote); or on native Linux in non-stop mode if
>      there were only two threads in the inferior and the non-leader
>      one is the one that execs (and nothing forces an update of the
>      thread list up to here).  When debugging remotely, it's best to
>      avoid extra traffic, when possible, so avoid syncing the thread
>      list with the target, and instead go ahead and delete all threads
>      of the process but one that reported the event.  Note this must
>      be done before calling update_breakpoints_after_exec, as
>      otherwise clearing the threads' resources would reference stale
>      thread breakpoints -- it may have been one of these threads that
>      stepped across the exec.  We could just clear their stepping
>      states, but as long as we're iterating, might as well delete
>      them.  Deleting them now rather than at the next user-visible
>      stop provides a nicer sequence of events for user and MI
>      notifications.  */
>   ALL_THREADS_SAFE (th, tmp)
>     if (ptid_get_pid (th->ptid) == pid && !ptid_equal (th->ptid, ptid))
>       delete_thread (th);
>   ...
> 
> On my Fedora Rawhide box, delete_thread is being called to delete the
> same thread as ecs->event_thread.  On my Fedora 27 machine, it deletes a
> different thread.
> 
> 3) Back to handle_inferior_event_1, when #2 is called, ecs->event_thread
> points to an invalid object, which triggers the assertion.
> 
> 
> I haven't progressed much further (other things to wrap up), but I
> decided to get the ball rolling already.  If you need access to a Fedora
> Rawhide VM, please let me know and I can provide this to you.

I think the "gdb: Eliminate the 'stop_pc' global" patch
(<https://sourceware.org/ml/gdb-patches/2018-06/msg00524.html>)
will fix this, because it moves the stop_pc assignment so that it
happens only after ecs->event_thread has been refreshed:

> @@ -5289,16 +5294,18 @@ Cannot fill $_exitsignal with the correct signal number.\n"));
>           stop.  */
>        follow_exec (inferior_ptid, ecs->ws.value.execd_pathname);
>  
> -      stop_pc = regcache_read_pc (get_thread_regcache (ecs->event_thread));
> -
>        /* In follow_exec we may have deleted the original thread and
>  	 created a new one.  Make sure that the event thread is the
>  	 execd thread for that case (this is a nop otherwise).  */
>        ecs->event_thread = inferior_thread ();
>  
> +      ecs->event_thread->suspend.stop_pc
> +	= regcache_read_pc (get_thread_regcache (ecs->event_thread));
> +
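
To illustrate the ordering issue outside of GDB, here is a minimal,
self-contained sketch.  It is not GDB code: ToyThread, ToyInferior,
toy_follow_exec and handle_exec_event are hypothetical stand-ins, and
the toy exec handler always replaces the thread, which is an assumption
made for the example.  The point it shows is only that the event-thread
pointer has to be re-fetched before anything is read through it:

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <memory>

// Hypothetical stand-in for a thread object; not GDB's thread_info.
struct ToyThread {
  int id;
  std::uint64_t pc;
};

// Hypothetical stand-in for an inferior and its thread list.
struct ToyInferior {
  std::list<std::unique_ptr<ToyThread>> threads;
  ToyThread *current = nullptr;  // plays the role of inferior_thread ()
  int next_id = 1;

  ToyThread *add_thread(std::uint64_t pc) {
    threads.push_back(std::make_unique<ToyThread>(ToyThread{next_id++, pc}));
    return threads.back().get();
  }

  // Toy exec handling (assumption): destroy every existing thread and
  // create a fresh one for the new program image.
  void toy_follow_exec(std::uint64_t entry_pc) {
    threads.clear();
    current = add_thread(entry_pc);
  }
};

// Handles a toy exec event and returns the stop PC.  The caller's
// event_thread may dangle once toy_follow_exec returns, so it is
// refreshed first and only then dereferenced.
std::uint64_t handle_exec_event(ToyInferior &inf, ToyThread *&event_thread,
                                std::uint64_t entry_pc) {
  inf.toy_follow_exec(entry_pc);

  // Reading event_thread->pc here would be a use-after-free.

  event_thread = inf.current;  // refresh before any access
  return event_thread->pc;
}
```

A caller would create one thread, pass its pointer as event_thread, and
after the call find that event_thread points at the replacement thread
rather than at freed memory.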

Thanks,
Pedro Alves

Thread overview: 19+ messages
2018-06-07 18:07 [PATCH] Use thread_info and inferior pointers more throughout Pedro Alves
2018-06-07 18:28 ` Tom Tromey
2018-06-21 15:57   ` Pedro Alves
2018-06-21 16:21     ` Pedro Alves
2018-06-25 10:18       ` Ulrich Weigand
2018-06-25 10:23         ` Pedro Alves
2018-06-27 11:34       ` Ulrich Weigand
2018-06-27 12:43         ` [PATCH] Fix Cell debugging regression (Re: [PATCH] Use thread_info and inferior pointers more throughout) Pedro Alves
2018-06-27 13:12           ` Ulrich Weigand
2018-06-27 13:17             ` Pedro Alves
2018-06-27 15:30               ` [PATCH v2] " Pedro Alves
2018-06-27 16:05                 ` Ulrich Weigand
2018-06-27 16:25                   ` Pedro Alves
2019-02-14 15:45       ` [PATCH] Use thread_info and inferior pointers more throughout Thomas Schwinge
2018-06-27 18:16 ` Possible regression on gdb.multi/multi-arch-exec.exp (was: Re: [PATCH] Use thread_info and inferior pointers more throughout) Sergio Durigan Junior
2018-06-27 18:39   ` Keith Seitz
2018-06-28 12:09   ` Pedro Alves [this message]
2018-06-28 16:02     ` [pushed] Fix follow-exec regression / crash (Re: Possible regression on gdb.multi/multi-arch-exec.exp) Pedro Alves
2018-06-28 16:37       ` Sergio Durigan Junior
