From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Burgess <aburgess@redhat.com>
To: Pedro Alves <pedro@palves.net>, gdb-patches@sourceware.org
Subject: Re: [PATCHv6 3/6] gdb: add timeouts for inferior function calls
In-Reply-To: <4267025a-c07d-0d82-4ea6-1638e2aeff9e@palves.net>
References: <cover.1680530116.git.aburgess@redhat.com>
 <2550eb8f3e77778e95bf8ded2775a31d9502f89a.1680530116.git.aburgess@redhat.com>
 <4267025a-c07d-0d82-4ea6-1638e2aeff9e@palves.net>
Date: Fri, 14 Jul 2023 16:20:51 +0100
Message-ID: <87jzv2ikoc.fsf@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain
List-Id: <gdb-patches.sourceware.org>

Pedro Alves <pedro@palves.net> writes:

> On 2023-04-03 15:01, Andrew Burgess via Gdb-patches wrote:
>
>> diff --git a/gdb/NEWS b/gdb/NEWS
>> index 10a1a70fa52..70987994e7b 100644
>> --- a/gdb/NEWS
>> +++ b/gdb/NEWS
>> @@ -96,6 +96,24 @@ info main
>>     $2 = 1
>>     (gdb) break func if $_shell("some command") == 0
>>  
>> +set direct-call-timeout SECONDS
>> +show direct-call-timeout
>> +set indirect-call-timeout SECONDS
>> +show indirect-call-timeout
>> +  These new settings can be used to limit how long GDB will wait for
>> +  an inferior function call to complete.  The direct timeout is used
>> +  for inferior function calls from e.g. 'call' and 'print' commands,
>> +  while the indirect timeout is used for inferior function calls from
>> +  within a conditional breakpoint expression.
>
> What happens with expressions in other commands, basically any command that
> accepts an expression?  For example, "x foo()".  Are those direct, or
> indirect?  I assume direct?

Correct, these would be direct.  I struggled to come up with a good
name, but the basic idea was:

  direct -- user enters a command and as a result GDB performs an
            inferior function call.  The user can only enter the next
            command once the first command (and hence inferior call) has
            completed.

  indirect -- user enters a command that accepts an expression,
              e.g. breakpoint condition, but the expression is only
              evaluated at some future time which is largely outside of
              the user's control, e.g. when the inferior hits the
              breakpoint.  The user might not even be aware that the
              inferior call is taking place (as b/p conditions are not
              announced until they complete or time out).
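
To make the split concrete, here is a rough standalone sketch of how
the two settings feed into the decision to arm a timer.  It mirrors the
defaults and the check in the patch, but it is not GDB code: the
variables are local stand-ins, and select_timeout, needs_timer and
timeout_in_ms are names made up for this illustration.

```cpp
#include <cassert>
#include <climits>

/* Standalone model only.  These mirror the defaults in the patch but
   are not the GDB variables themselves.  UINT_MAX means "unlimited".  */
static unsigned int direct_timeout = UINT_MAX;
static unsigned int indirect_timeout = 30;	/* seconds */

/* Hypothetical helper: pick the timeout for an inferior call.
   DIRECT_CALL_P is true for e.g. 'print foo ()' or 'x foo ()', and
   false for e.g. a breakpoint condition.  */
static unsigned int
select_timeout (bool direct_call_p)
{
  return direct_call_p ? direct_timeout : indirect_timeout;
}

/* Hypothetical helper: a timer is only armed for a finite timeout on
   a target that can run in async mode.  */
static bool
needs_timer (bool direct_call_p, bool target_can_async)
{
  return select_timeout (direct_call_p) < UINT_MAX && target_can_async;
}

/* Hypothetical helper: convert the selected timeout to milliseconds.
   Only meaningful when the timeout is finite.  */
static unsigned int
timeout_in_ms (bool direct_call_p)
{
  unsigned int timeout = select_timeout (direct_call_p);
  assert (timeout < UINT_MAX);
  return timeout * 1000;
}
```

With the defaults, only the indirect case on an async target ends up
with a timer; the direct case behaves as GDB does today.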

>
> I wonder whether you have plans/ideas for other kinds of indirect calls.
> Just thinking about whether naming the option as something about
> "breakpoint-condition" wouldn't be better by being more direct (ah!) and
> to the point, while leaving the possibility of other kinds of situations
> having different timeouts.  to avoid long command names, we could have
> a prefix setting, like:

I guess we could, but I'm not sure why a user might want such
fine-grained control -- they want to limit how long a breakpoint
condition can take to evaluate, but want a different limit on some other
indirect case.  This just seems overly complex; surely you'd just pick a
timeout that satisfies your expected worst case and go with that.

To be honest, the reason I initially split direct and indirect is so
that the direct case could be unlimited to match GDB's current
behaviour.  But, now I've written it, I do think there's an argument
that a user might want to allow direct calls to take longer.  In the
direct case the user is (hopefully) aware that an inferior call has
taken place, and can manually interrupt if the call is taking too long,
so I think this split does make sense.

>
>  set call-timeout direct  # maybe there's a better name for this.
>  set call-timeout breakpoint-conditions
>  set call-timeout some-other-case
>
> Just some thoughts, by no way am I objecting to what you have.
>
>> +
>> +  The default for the direct timeout is unlimited, while the default
>> +  for the indirect timeout is 30 seconds.
>
> While working on Windows non-stop support recently, I noticed that
> gdb.threads/multiple-successive-infcall.exp has infcalls that would
> just hang "forever", the infcall never completed.  The test
> enables schedlock, and then calls a function in each thread in the
> program [like, (gdb) p get_value()].  The issue turns out to be about
> calling a function in a thread that is currently running Windows kernel
> code.  On Linux, most system calls are interruptible (EINTR), and
> restartable.  When the debugger pauses a thread and the thread is in a
> syscall, the syscall is interrupted and restarted later when the thread
> is resumed.  On Windows, system calls are NOT interruptible.  The threads
> in question in the testcase were stopped inside the syscall done by
> ntdll!ZwWaitForMultipleObjects.  In that scenario, you can still pause the
> hung thread with Ctrl-C, and you'll see that the (userspace) PC of the thread
> in question hasn't changed, it is still pointing to the entry to the
> function GDB wants to call -- not surprising since the thread is really
> still blocked inside the syscall and never ran any userspace instruction.
>
> This looks like something that Windows GDB users are likely to trip on more
> frequently than GNU/Linux users.
>
> So I looked at how Visual Studio (not vscode) handles it, to check how it 
> handles this, maybe it just doesn't let you call functions on threads that are
> stopped inside a syscall?  Nope.  You guessed it, it handles it with a timeout.
> If you add a watch expression (like a gdb "display") involving infcall, and the thread
> is in kernel code, VS will still try the call, and then after a few short
> seconds (maybe some 5s), it aborts the expression, popping a dialog box informing
> you about it.
>
> All that to say that I would think it reasonable to default to a
> shorter timeout in GDB too.
>
> Actually, I remembered now that LLDB also has a timeout for infcalls.
> On the version I have handy installed, "help expression" talks about
> a timeout of "currently .25 seconds", and then retrying with all threads
> running, (that's 0.25s, not 25s IIUC, curiously, higher resolution than
> second), but I don't know how long that second retry has for timeout,
> if it has one.
>
> For breakpoint conditions, I think it may be nice (but not a
> requirement of this patch, just an idea) if after some time less than
> the whole timeout time, for GDB to print a warning, something like:
>
>  warning: a function call in the condition of breakpoint 2.3 is taking long.
>
> Like, we could print that warning after 1 second, even if the timeout
> is set to higher than that.
>
> Anyhow, all that is a lot easier to code than debate and it can always
> be done later.
>
>> +
>> +  These timeouts will only have an effect for targets that are
>> +  operating in async mode.  For non-async targets the timeouts are
>> +  ignored, GDB will wait indefinitely for an inferior function to
>> +  complete, unless interrupted by the user using Ctrl-C.
>> +
>>  * MI changes
>>  
>>  ** mi now reports 'no-history' as a stop reason when hitting the end of the
>> diff --git a/gdb/doc/gdb.texinfo b/gdb/doc/gdb.texinfo
>> index fe76e5e0a0e..46f17798510 100644
>> --- a/gdb/doc/gdb.texinfo
>> +++ b/gdb/doc/gdb.texinfo
>> @@ -20885,6 +20885,72 @@
>>  @code{step}, etc).  In this case, when the inferior finally returns to
>>  the dummy-frame, @value{GDBN} will once again halt the inferior.
>>  
>> +On targets that support asynchronous execution (@pxref{Background
>> +Execution}) @value{GDBN} can place a timeout on any functions called
>> +from @value{GDBN}.  If the timeout expires and the function call is
>> +still ongoing, then @value{GDBN} will interrupt the program.
>
> In the patch introducing "set unwind-on-timeout", I think it would be
> good to mention the setting here.  I didn't notice it being added there.
> Because, as I read this, I wondered "OK, but what happens after GDB
> interrupts the program?  Do we unwind according to set unwind-on-signal?" .
>
>> +
>> +For targets that don't support asynchronous execution
>> +(@pxref{Background Execution}) then timeouts for functions called from
>> +@value{GDBN} are not supported, the timeout settings described below
>> +will be treated as @code{unlimited}, meaning @value{GDBN} will wait
>> +indefinitely for function call to complete, unless interrupted by the
>> +user using @kbd{Ctrl-C}.
>> +
>
> ...
>
>> diff --git a/gdb/infcall.c b/gdb/infcall.c
>> index 4fb8ab07db0..bb57faf700f 100644
>> --- a/gdb/infcall.c
>> +++ b/gdb/infcall.c
>> @@ -95,6 +95,53 @@ show_may_call_functions_p (struct ui_file *file, int from_tty,
>>  	      value);
>>  }
>>  
>> +/* A timeout (in seconds) for direct inferior calls.  A direct inferior
>> +   call is one the user triggers from the prompt, e.g. with a 'call' or
>> +   'print' command.  Compare with the definition of indirect calls below.  */
>> +
>> +static unsigned int direct_call_timeout = UINT_MAX;
>> +
>> +/* Implement 'show direct-call-timeout'.  */
>> +
>> +static void
>> +show_direct_call_timeout (struct ui_file *file, int from_tty,
>> +			  struct cmd_list_element *c, const char *value)
>> +{
>> +  if (target_has_execution () && !target_can_async_p ())
>> +    gdb_printf (file, _("Current target does not support async mode, timeout "
>> +			"for direct inferior calls is \"unlimited\".\n"));
>> +  else if (direct_call_timeout == UINT_MAX)
>> +    gdb_printf (file, _("Timeout for direct inferior function calls "
>> +			"is \"unlimited\".\n"));
>> +  else
>> +    gdb_printf (file, _("Timeout for direct inferior function calls "
>> +			"is \"%s seconds\".\n"), value);
>> +}
>> +
>> +/* A timeout (in seconds) for indirect inferior calls.  An indirect inferior
>> +   call is one that originates from within GDB, for example, when
>> +   evaluating an expression for a conditional breakpoint.  Compare with
>> +   the definition of direct calls above.  */
>> +
>> +static unsigned int indirect_call_timeout = 30;
>> +
>> +/* Implement 'show indirect-call-timeout'.  */
>> +
>> +static void
>> +show_indirect_call_timeout (struct ui_file *file, int from_tty,
>> +			  struct cmd_list_element *c, const char *value)
>> +{
>> +  if (target_has_execution () && !target_can_async_p ())
>> +    gdb_printf (file, _("Current target does not support async mode, timeout "
>> +			"for indirect inferior calls is \"unlimited\".\n"));
>> +  else if (indirect_call_timeout == UINT_MAX)
>> +    gdb_printf (file, _("Timeout for indirect inferior function calls "
>> +			"is \"unlimited\".\n"));
>> +  else
>> +    gdb_printf (file, _("Timeout for indirect inferior function calls "
>> +			"is \"%s seconds\".\n"), value);
>> +}
>> +
>>  /* How you should pass arguments to a function depends on whether it
>>     was defined in K&R style or prototype style.  If you define a
>>     function using the K&R syntax that takes a `float' argument, then
>> @@ -589,6 +636,86 @@ call_thread_fsm::should_notify_stop ()
>>    return true;
>>  }
>>  
>> +/* A class to control creation of a timer that will interrupt a thread
>> +   during an inferior call.  */
>> +struct infcall_timer_controller
>> +{
>> +  /* Setup an event-loop timer that will interrupt PTID if the inferior
>> +     call takes too long.  DIRECT_CALL_P is true when this inferior call is
>> +     a result of the user using a 'print' or 'call' command, and false when
>> +     this inferior call is a result of e.g. a conditional breakpoint
>> +     expression, this is used to select which timeout to use.  */
>> +  infcall_timer_controller (thread_info *thr, bool direct_call_p)
>> +    : m_thread (thr)
>> +  {
>> +    unsigned int timeout
>> +      = direct_call_p ? direct_call_timeout : indirect_call_timeout;
>> +    if (timeout < UINT_MAX && target_can_async_p ())
>> +      {
>> +	int ms = timeout * 1000;
>> +	int id = create_timer (ms, infcall_timer_controller::timed_out, this);
>> +	m_timer_id.emplace (id);
>> +	infcall_debug_printf ("Setting up infcall timeout timer for "
>> +			      "ptid %s: %d milliseconds",
>> +			      m_thread->ptid.to_string ().c_str (), ms);
>> +      }
>> +  }
>> +
>> +  /* Destructor.  Ensure that the timer is removed from the event loop.  */
>> +  ~infcall_timer_controller ()
>> +  {
>> +    /* If the timer has already triggered, then it will have already been
>> +       deleted from the event loop.  If the timer has not triggered, then
>> +       delete it now.  */
>> +    if (m_timer_id.has_value () && !m_triggered)
>> +      delete_timer (*m_timer_id);
>> +
>> +    /* Just for clarity, discard the timer id now.  */
>> +    m_timer_id.reset ();
>> +  }
>> +
>> +  /* Return true if there was a timer in place, and the timer triggered,
>> +     otherwise, return false.  */
>> +  bool triggered_p ()
>> +  {
>> +    gdb_assert (!m_triggered || m_timer_id.has_value ());
>> +    return m_triggered;
>> +  }
>> +
>> +private:
>> +  /* The thread we should interrupt.  */
>> +  thread_info *m_thread;
>> +
>> +  /* Set true when the timer is triggered.  */
>> +  bool m_triggered = false;
>> +
>> +  /* Given a value when a timer is in place.  */
>> +  gdb::optional<int> m_timer_id;
>> +
>> +  /* Callback for the timer, forwards to ::trigger below.  */
>> +  static void
>> +  timed_out (gdb_client_data context)
>> +  {
>> +    infcall_timer_controller *ctrl
>> +      = static_cast<infcall_timer_controller *> (context);
>> +    ctrl->trigger ();
>> +  }
>> +
>> +  /* Called when the timer goes off.  Stop thread m_thread.  */
>
> Uppercase M_THREAD.

Fixed.

>
>> +  void
>> +  trigger ()
>> +  {
>> +    m_triggered = true;
>> +
>> +    scoped_disable_commit_resumed disable_commit_resumed ("infcall timeout");
>> +
>> +    infcall_debug_printf ("Stopping thread %s",
>> +			  m_thread->ptid.to_string ().c_str ());
>> +    target_stop (m_thread->ptid);
>> +    m_thread->stop_requested = true;
>
> As per the discussion in the remote patch, I think this will need
> to be adjusted.  Maybe something like:
>
>     if (target_is_non_stop_p ())
>       {
>         target_stop (m_thread->ptid);
>         m_thread->stop_requested = true;
>       }
>     else
>       target_interrupt ();

I understand your critique of the 'avoid SIGINT after calling
remote_target::stop' patch, but I don't understand this comment.  What
we want to do is stop the target, not interrupt it, thus, surely, we
should call target_stop.

The fact that we can't target_stop for a !non-stop target is surely
something the target should deal with.  And indeed, if we check out
remote_target::stop we see that for a !non-stop target we call
remote_interrupt_as.  Likewise, calling remote_target::interrupt for
a !non-stop target also calls remote_interrupt_as, which I think is very
much the point of your critique, right?

My thinking here is that, if we _did_ come up with some clever way to
support ::stop for a !non-stop target, this code would be added to
remote_target::stop, but _not_ to remote_target::interrupt, so we should
call the function that matches our intention, even if, right now, GDB
can't actually satisfy our needs.
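
As a toy model of what I mean (this is not remote.c; the struct, the
returned strings other than remote_interrupt_as, and on_infcall_timeout
are all invented for illustration):

```cpp
#include <cassert>
#include <string>

/* Toy model of a target, not the real remote_target.  Each method
   returns a label for the mechanism it would end up using.  */
struct toy_remote_target
{
  bool non_stop;

  /* Intent: "stop this thread".  In all-stop there is currently no
     way to do that, so fall back to the interrupt mechanism.  */
  std::string stop () const
  {
    return non_stop ? "per-thread stop" : "remote_interrupt_as";
  }

  /* Intent: "behave as if the user hit Ctrl-C".  */
  std::string interrupt () const
  {
    return non_stop ? "target-wide interrupt" : "remote_interrupt_as";
  }
};

/* Hypothetical caller: the timer callback states its intent (stop)
   and leaves the all-stop fallback to the target.  */
static std::string
on_infcall_timeout (const toy_remote_target &target)
{
  return target.stop ();
}
```

In all-stop both entry points land in the same place today, so the
caller's choice makes no practical difference; the choice only matters
if ::stop ever gains a better all-stop implementation.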

>
>> +  }
>> +};
>> +
>>  /* Subroutine of call_function_by_hand to simplify it.
>>     Start up the inferior and wait for it to stop.
>>     Return the exception if there's an error, or an exception with
>> @@ -599,13 +726,15 @@ call_thread_fsm::should_notify_stop ()
>>  
>>  static struct gdb_exception
>>  run_inferior_call (std::unique_ptr<call_thread_fsm> sm,
>> -		   struct thread_info *call_thread, CORE_ADDR real_pc)
>> +		   struct thread_info *call_thread, CORE_ADDR real_pc,
>> +		   bool *timed_out_p)
>>  {
>>    INFCALL_SCOPED_DEBUG_ENTER_EXIT;
>>  
>>    struct gdb_exception caught_error;
>>    ptid_t call_thread_ptid = call_thread->ptid;
>>    int was_running = call_thread->state == THREAD_RUNNING;
>> +  *timed_out_p = false;
>>  
>>    infcall_debug_printf ("call function at %s in thread %s, was_running = %d",
>>  			core_addr_to_string (real_pc),
>> @@ -617,6 +746,16 @@ run_inferior_call (std::unique_ptr<call_thread_fsm> sm,
>>    scoped_restore restore_in_infcall
>>      = make_scoped_restore (&call_thread->control.in_infcall, 1);
>>  
>> +  /* If the thread making the inferior call stops with a time out then the
>> +     stop_requested flag will be set.  However, we don't want changes to
>> +     this flag to leak back to our caller, we might be here to handle an
>> +     inferior call from a breakpoint condition, so leaving this flag set
>> +     would appear that the breakpoint stop was actually a requested stop,
>> +     which is not true, and will cause GDB to print extra messages to the
>> +     output.  */
>> +  scoped_restore restore_stop_requested
>> +    = make_scoped_restore (&call_thread->stop_requested, false);
>
> I'm confused by this.  If stop_requested was set when the breakpoint was hit,
> are we still evaluating the breakpoint condition (and re-resuming the thread
> if the condition is false) ?

I don't really understand your question here, but I don't think that it
matters now.  This stop_requested stuff was only in place to support the
'avoid SIGINT after calling remote_target::stop' patch (the next one),
which I'm going to drop after your feedback.

>
>> +
>>    clear_proceed_status (0);
>>  
>
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.base/infcall-timeout.c
>> @@ -0,0 +1,36 @@
>> +/* Copyright 2022-2023 Free Software Foundation, Inc.
>> +
>> +   This file is part of GDB.
>> +
>> +   This program is free software; you can redistribute it and/or modify
>> +   it under the terms of the GNU General Public License as published by
>> +   the Free Software Foundation; either version 3 of the License, or
>> +   (at your option) any later version.
>> +
>> +   This program is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +   GNU General Public License for more details.
>> +
>> +   You should have received a copy of the GNU General Public License
>> +   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
>> +
>> +#include <unistd.h>
>> +
>> +/* This function is called from GDB.  */
>> +int
>> +function_that_never_returns ()
>> +{
>> +  while (1)
>> +    sleep (1);
>> +
>> +  return 0;
>> +}
>> +
>> +int
>> +main ()
>> +{
>> +  alarm (300);
>> +
>> +  return 0;
>> +}
>> diff --git a/gdb/testsuite/gdb.base/infcall-timeout.exp b/gdb/testsuite/gdb.base/infcall-timeout.exp
>
>
> ...
>
>> +standard_testfile
>> +
>> +if { [build_executable "failed to prepare" ${binfile} "${srcfile}" \
>> +	  {debug}] == -1 } {
>> +    return
>> +}
>> +
>> +# Start GDB according to TARGET_ASYNC and TARGET_NON_STOP, then adjust
>> +# the direct-call-timeout, and make an inferior function call that
>> +# will never return.  GDB should eventually timeout and stop the
>> +# inferior.
>> +proc_with_prefix run_test { target_async target_non_stop } {
>> +    save_vars { ::GDBFLAGS } {
>> +	append ::GDBFLAGS \
>> +	    " -ex \"maint set target-non-stop $target_non_stop\""
>
> It's curious that target-non-stop on|off is tested, but not "set non-stop on".

I've extended the tests to cover this case.

>
>> +	append ::GDBFLAGS \
>> +	    " -ex \"maintenance set target-async ${target_async}\""
>> +
>> +	clean_restart ${::binfile}
>> +    }
>> +
>
> ...
>
>> diff --git a/gdb/testsuite/gdb.threads/infcall-from-bp-cond-timeout.exp b/gdb/testsuite/gdb.threads/infcall-from-bp-cond-timeout.exp
>> new file mode 100644
>> index 00000000000..4159288a39c
>> --- /dev/null
>> +++ b/gdb/testsuite/gdb.threads/infcall-from-bp-cond-timeout.exp
>
> ...
>
>> +
>> +    gdb_breakpoint \
>> +	"${::srcfile}:${::cond_bp_line} if (condition_func ())"
>> +    set bp_num [get_integer_valueof "\$bpnum" "*UNKNOWN*" \
>> +		    "get number for conditional breakpoint"]
>> +
>> +    gdb_breakpoint "${::srcfile}:${::final_bp_line}"
>> +    set final_bp_num [get_integer_valueof "\$bpnum" "*UNKNOWN*" \
>> +			  "get number for final breakpoint"]
>> +
>> +    # The thread performing an inferior call relies on a second
>> +    # thread.  The second thread will segfault unless it hits a
>> +    # breakpoint first.  In either case the initial thread will not
>> +    # complete its inferior call.
>> +    if { $other_thread_bp } {
>> +	gdb_breakpoint "${::srcfile}:${::segfault_line}"
>> +	set segfault_bp_num [get_integer_valueof "\$bpnum" "*UNKNOWN*" \
>> +				 "get number for segfault breakpoint"]
>> +    }
>> +
>> +    # When non-stop mode is off we get slightly different output from GDB.
>> +    if { [gdb_is_remote_or_extended_remote_target] && !$target_non_stop} {
>> +	set stopped_line_pattern "Thread ${::decimal} \"\[^\r\n\"\]+\" received signal SIGINT, Interrupt\\."
>> +    } else {
>> +	set stopped_line_pattern "Thread ${::decimal} \"\[^\r\n\"\]+\" stopped\\."
>> +    }
>
> Something is going on in this test that when testing against gdbserver with
> all-stop, it is always Thread 2 that reports the SIGINT, which is coincidentally
> the thread that was hitting the breakpoint and running the infcall, AFAICS.
>
>  continue
>  Continuing.
>  [New Thread 3506594.3506599]
>  [New Thread 3506594.3506600]
>  [New Thread 3506594.3506601]
>  [New Thread 3506594.3506602]
>  [New Thread 3506594.3506603]
>
>  Thread 2 "infcall-from-bp" received signal SIGINT, Interrupt.
> __futex_abstimed_wait_common64 (private=<optimized out>, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x555555558080 <thread_1_semaphore>) at ./nptl/futex-internal.c:57
>
> Why is that?

The answer lies in the 'gdb: fix b/p conditions with infcalls in
multi-threaded inferiors' patch, specifically the change to
user_visible_resume_ptid, which ensures that, when evaluating a B/P
condition, only the thread evaluating the condition is resumed.

The code didn't originate with me[1], but I didn't question it too much
when I incorporated it into this series, and maybe I should have.

I wonder if in all-stop mode we should be resuming all threads when
evaluating the condition?
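
For reference, the behaviour I'm describing amounts to something like
the toy model below.  This is not user_visible_resume_ptid itself:
thread ids are plain ints, threads_to_resume is a made-up name, and the
"resume everything" branch is a simplification of all-stop without
scheduler locking.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

/* Toy model only.  Return the set of threads that would be resumed
   when CURRENT_THREAD is resumed.  When EVALUATING_BP_CONDITION is
   true, only the thread evaluating the condition is resumed.  */
static std::vector<int>
threads_to_resume (const std::vector<int> &all_threads, int current_thread,
		   bool evaluating_bp_condition)
{
  /* The current thread must be one of the known threads.  */
  assert (std::find (all_threads.begin (), all_threads.end (),
		     current_thread) != all_threads.end ());

  if (evaluating_bp_condition)
    return { current_thread };

  return all_threads;
}
```

If that reading is right, then presumably the thread running the
infcall is the only one available to pick up the SIGINT, which would
explain why it is always Thread 2 in your log.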

I'll think about this some more and follow up.

[1] https://inbox.sourceware.org/gdb-patches/20201009112719.629-3-natalia.saiapova@intel.com/

Thanks,
Andrew

>
> Normally, it's usually the main thread that manages to dequeue the signal
> on the kernel side.  But it can really be any other thread.  Note
> linux_process_target::request_interrupt() does:
>
>   /* Send a SIGINT to the process group.  This acts just like the user
>      typed a ^C on the controlling terminal.  */
>   int res = ::kill (-signal_pid, SIGINT);
>
> If the main thread is ptrace-stopped, then it will be another thread, so
> I guess the main thread is stopped?  But where?
>
> My point is that this is showing a weakness in the testcase.  I would
> expect that (in all-stop with gdbserver) the SIGINT would be reported
> for the main thread, and that would exercise the scenario that the thread
> that is running the infcall is not the same as the thread that reports
> the interruption signal.  I think that scenario should be exercised if
> possible.  But the testcase as written somehow makes them be the same
> threads, which might well hide problems with all-stop targets.
>
> I realize that I'm reading the series out of order, and the answer may
> be in the previous patch.  I guess I should read that in detail
> first.  :-P
>
> Pedro Alves